Monday, February 20, 2012

removing accents and doing accent-insensitive processing

Hi,
I have an app that does case-insensitive, accent-insensitive searches using
sql server fulltext engine. It deals only with western-european lanaguages
(mostly just English and French). The tables are stored in Latin1 General
CP1.
However, I also need to process the search results in an accent-insensitive
way outside of SQL server (using C#).
The approach I took was to create a separate table using CP1253 (greek)
which has no accented characters, and fulltext index that table instead.
When I copy the data from CP1 to the CP1253, SQL Server removes the accents
so I get back the search results without accents on (which is what I want)
However, if the search criteria includes accented characters, I get no
results. The text engine does not strip the accents off the search string
even though the table I indexed is accent insensitive.
It seems that what I need to do is strip accents off the search strings
before submitting the text search. does anyone know of a way to do this?
Failing that, is there another solution that would get me round this
problem?
TIA
Andy
you have to run a series of replace statement in your search string to
remove them.
"Andy Fish" <ajfish@.blueyonder.co.uk> wrote in message
news:ueUnF8WIFHA.3332@.TK2MSFTNGP14.phx.gbl...
> Hi,
> I have an app that does case-insensitive, accent-insensitive searches
> using sql server fulltext engine. It deals only with western-european
> lanaguages (mostly just English and French). The tables are stored in
> Latin1 General CP1.
> However, I also need to process the search results in an
> accent-insensitive way outside of SQL server (using C#).
> The approach I took was to create a separate table using CP1253 (greek)
> which has no accented characters, and fulltext index that table instead.
> When I copy the data from CP1 to the CP1253, SQL Server removes the
> accents so I get back the search results without accents on (which is what
> I want)
> However, if the search criteria includes accented characters, I get no
> results. The text engine does not strip the accents off the search string
> even though the table I indexed is accent insensitive.
> It seems that what I need to do is strip accents off the search strings
> before submitting the text search. does anyone know of a way to do this?
> Failing that, is there another solution that would get me round this
> problem?
> TIA
> Andy
>
|||Hi,
in ASP, you can use this function :
Public Function removeAccent(source)
avantConversion = "ǒ"
apresConversion = "aAaAaAaAeEeEeEeEiIiIiIoOoOoOuUuUuUcC'n"
temp = source
For boucle = 1 To Len(avantConversion)
temp = Replace(temp, Mid(avantConversion, boucle, 1),
Mid(apresConversion, boucle, 1))
Next
temp = Replace(temp, "", "oe")
removeAccent = temp
End Function
You can create this function directly in sql server like this :
CREATE FUNCTION replaceAccentChar (@.source as varchar(255))
RETURNS varchar(255) AS
BEGIN
declare @.charList as varchar(20)
declare @.temp as varchar(255)
declare @.t as int
set @.temp = @.source
set @.charList = 'aeioucn'
set @.t = 0
while @.t <= len(@.charList)
begin
set @.temp = replace(@.temp, substring(@.charList, @.t, 1),
substring(@.charList, @.t, 1))
set @.t = @.t + 1
end
set @.temp = Replace(@.temp, '', 'oe')
set @.temp = Replace(@.temp, '', '''')
return @.temp
END
Daniel
Andy Fish wrote:
> Hi,
> I have an app that does case-insensitive, accent-insensitive searches using
> sql server fulltext engine. It deals only with western-european lanaguages
> (mostly just English and French). The tables are stored in Latin1 General
> CP1.
> However, I also need to process the search results in an accent-insensitive
> way outside of SQL server (using C#).
> The approach I took was to create a separate table using CP1253 (greek)
> which has no accented characters, and fulltext index that table instead.
> When I copy the data from CP1 to the CP1253, SQL Server removes the accents
> so I get back the search results without accents on (which is what I want)
> However, if the search criteria includes accented characters, I get no
> results. The text engine does not strip the accents off the search string
> even though the table I indexed is accent insensitive.
> It seems that what I need to do is strip accents off the search strings
> before submitting the text search. does anyone know of a way to do this?
> Failing that, is there another solution that would get me round this
> problem?
> TIA
> Andy
>
|||A very nice bit of code! You forgot though
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Daniel Blais" <daniel@.danielblais.net> wrote in message
news:4229E151.3020703@.danielblais.net...[vbcol=seagreen]
> Hi,
> in ASP, you can use this function :
> Public Function removeAccent(source)
> avantConversion = "ǒ"
> apresConversion = "aAaAaAaAeEeEeEeEiIiIiIoOoOoOuUuUuUcC'n"
> temp = source
> For boucle = 1 To Len(avantConversion)
> temp = Replace(temp, Mid(avantConversion, boucle, 1),
> Mid(apresConversion, boucle, 1))
> Next
> temp = Replace(temp, "", "oe")
> removeAccent = temp
> End Function
>
> You can create this function directly in sql server like this :
> CREATE FUNCTION replaceAccentChar (@.source as varchar(255))
> RETURNS varchar(255) AS
> BEGIN
> declare @.charList as varchar(20)
> declare @.temp as varchar(255)
> declare @.t as int
> set @.temp = @.source
> set @.charList = 'aeioucn'
> set @.t = 0
> while @.t <= len(@.charList)
> begin
> set @.temp = replace(@.temp, substring(@.charList, @.t, 1),
> substring(@.charList, @.t, 1))
> set @.t = @.t + 1
> end
> set @.temp = Replace(@.temp, '', 'oe')
> set @.temp = Replace(@.temp, '', '''')
> return @.temp
> END
> Daniel
>
> Andy Fish wrote:
using[vbcol=seagreen]
lanaguages[vbcol=seagreen]
General[vbcol=seagreen]
accent-insensitive[vbcol=seagreen]
accents[vbcol=seagreen]
want)[vbcol=seagreen]
string
>
>
>

No comments:

Post a Comment