HOWTO: remove diacritics from a string

When comparing strings you sometimes need to ignore diacritics (accents) such as é and ö.
The RemoveDiacritics method below strips them from a string. The Equals2 extension method uses it to compare two strings for equality while ignoring diacritics, case, and leading or trailing whitespace:

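using System;
using System.Globalization;
using System.Text;

// Extension methods must be declared in a static class; the class name is arbitrary.
public static class StringExtensions
{
    // Returns stIn with all combining marks (accents, umlauts, ...) removed.
    static string RemoveDiacritics(string stIn)
    {
        // FormD splits each accented character into a base character
        // followed by its combining marks, e.g. "ë" becomes "e" + U+0308.
        string stFormD = stIn.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(stFormD.Length);
        foreach (char ch in stFormD)
        {
            // Keep everything except the combining marks.
            if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(ch);
            }
        }
        // Recompose whatever is left into the usual precomposed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // True when both strings are equal ignoring diacritics, case and
    // leading or trailing whitespace.
    public static bool Equals2(this string source, string toCheck)
    {
        // OrdinalIgnoreCase avoids the culture-dependent results of ToLower()
        // (for example "I" lowercases to "ı" under a Turkish culture).
        return string.Equals(
            RemoveDiacritics(source).Trim(),
            RemoveDiacritics(toCheck).Trim(),
            StringComparison.OrdinalIgnoreCase);
    }
}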
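
To see why skipping NonSpacingMark characters is enough, it can help to look at what Normalize(NormalizationForm.FormD) actually produces. The following is a minimal sketch, not part of the method above; the class name DecompositionDemo and the sample word are chosen only for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical demo class, used only to inspect the decomposed string.
static class DecompositionDemo
{
    static void Main()
    {
        // "Azi\u00EB" is "Azië" written with an escape so the source file
        // encoding does not matter.
        string decomposed = "Azi\u00EB".Normalize(NormalizationForm.FormD);

        // Print the code point and Unicode category of every char.
        foreach (char ch in decomposed)
        {
            Console.WriteLine("U+{0:X4} {1}", (int)ch, CharUnicodeInfo.GetUnicodeCategory(ch));
        }
    }
}
```

The decomposed string has five chars instead of four: the ë is split into a plain e (U+0065) followed by the combining diaeresis U+0308, which is reported as NonSpacingMark. Dropping that one char leaves "Azie".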

The following code returns true:

string x = "Azië";
bool equal = x.Equals2("azie"); // true
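
If you only need the comparison and not the stripped string, the framework's culture-aware comparison may be enough on its own. The sketch below is an alternative technique, not the method shown above: it uses string.Compare with CompareOptions.IgnoreNonSpace and CompareOptions.IgnoreCase. The class name CompareDemo and the choice of CultureInfo.InvariantCulture are assumptions made for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class for a comparison that ignores diacritics and case.
static class CompareDemo
{
    static void Main()
    {
        // IgnoreNonSpace ignores combining marks, IgnoreCase ignores casing.
        CompareOptions options = CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

        // string.Compare returns 0 when the strings are considered equal.
        int result = string.Compare("Azi\u00EB", "azie", CultureInfo.InvariantCulture, options);

        Console.WriteLine(result == 0);
    }
}
```

With a default runtime configuration this should report the two strings as equal. Unlike Equals2 it does not trim whitespace, and the result depends on the runtime's globalization data, so it is worth testing in your own environment before relying on it.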
