extend-arabic-query

Check the package on npm.

This npm package provides a function that converts any text written in Arabic letters into a regular expression (RegEx) string. That string can be passed to the test or match methods to compare the text against any other string for equality.

Normally, comparing words that contain letters with multiple forms, like the letter "أ", can give false negatives. For example, comparing the two spellings of the name "Ahmad" ("أحمد" and "احمد") should evaluate to true, but it does not, whether you use the strict or the abstract equality operator [ "أحمد" == "احمد" ]. The two forms of the letter "alef" (with and without "hamzah") are different characters with different Unicode code points (namely, U+0623 for "أ" and U+0627 for "ا").
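The mismatch is easy to reproduce in plain JavaScript. The snippet below is only an illustration and does not use the package; the variable names are made up for the example, and the strings are written with Unicode escapes so the code points (hexadecimal 623 and 627) are visible.

```javascript
// The two spellings of "Ahmad", written with escapes so the code points are explicit.
const withHamza = "\u0623\u062D\u0645\u062F";    // أحمد (alef with hamza above)
const withoutHamza = "\u0627\u062D\u0645\u062F"; // احمد (bare alef)

// Plain equality fails because the first characters differ.
const equal = withHamza === withoutHamza;

// The first code point of each spelling, in hexadecimal.
const codeA = withHamza.codePointAt(0).toString(16);
const codeB = withoutHamza.codePointAt(0).toString(16);

// A character class that accepts either form of alef makes the comparison succeed.
const matches = /^[\u0623\u0627]\u062D\u0645\u062F$/.test(withoutHamza);

console.log(equal, codeA, codeB, matches); // false 623 627 true
```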

Likewise, every form of the letter alef should match any of its other forms, since they are often mistakenly swapped for each other. The same applies to many other letters.

The current version of the package maps the following letters to their corresponding forms or alternatives:

| Letter Group | Group Name | Substitution |
| --- | --- | --- |
| ة - ه | haa group | same* |
| ا - أ - إ - آ - ء | alef group | same, plus all other forms of hamza (ئ - ؤ) |
| ئ | hamza on yaa group | same as for alef, plus all other forms of yaa |
| ؤ | hamza on waw group | same as for alef, plus all other forms of yaa |
| و | waw group | و - ؤ |
| ي - ى | yaa group | same, plus "ئ" |

* "same" means a RegEx character class listing all the characters in the group, i.e. if either "ه" or "ة" is found in the string, it is replaced with [ةه] in the RegEx string.
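The substitution idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's actual source: the names GROUPS, CLASS_BY_CHAR, escapeRegExp and toPattern are hypothetical, and only the haa, alef, waw and yaa groups are included.

```javascript
// Hypothetical sketch of the letter-group substitution (not the package's real code).
// Each entry: [letters that trigger the substitution, character class to emit].
const GROUPS = [
  ["ةه", "[ةه]"],          // haa group
  ["اأإآء", "[اأإآءئؤ]"],  // alef group, plus the other forms of hamza
  ["و", "[وؤ]"],           // waw group
  ["يى", "[يئى]"],         // yaa group, plus hamza on yaa
];

// Flatten the groups into a per-character lookup table.
const CLASS_BY_CHAR = new Map();
for (const [members, cls] of GROUPS) {
  for (const ch of members) CLASS_BY_CHAR.set(ch, cls);
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every grouped letter with its class; leave other characters as literals.
function toPattern(text) {
  return Array.from(text, (ch) => CLASS_BY_CHAR.get(ch) ?? escapeRegExp(ch)).join("");
}

const pattern = toPattern("حمادة");
console.log(pattern);
console.log(new RegExp("^" + pattern + "$").test("حماده"));
```

Anchoring the pattern with ^ and $ is a choice made here so that the whole string must match; without the anchors the pattern would also match inside longer strings.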

The library also accounts for possible misspellings caused by local pronunciation. Currently, the following groups are considered:

| Letter Group | Group Name | Substitution |
| --- | --- | --- |
| ز - ذ | zai group | same, since many Arabic dialects use both interchangeably |
| ث - س | seen group | same, since many Arabic dialects use both interchangeably |

Also, many people write the words "أبو" and "عبد" either with or without a trailing space. This has been taken into consideration as well!
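One way to picture the optional-space handling is sketched below. It is an assumption-laden illustration rather than the package's implementation: the function name optionalSpaceAfterPrefixes and its default prefix list are hypothetical, and the sketch emits an optional space (" ?") where the package's own output uses an alternation.

```javascript
// Hypothetical sketch: make the space after certain prefixes optional.
// Note: it also affects any word that merely starts with these letters.
function optionalSpaceAfterPrefixes(text, prefixes = ["عبد", "أبو"]) {
  let out = text;
  for (const p of prefixes) {
    // Match the prefix at the start of a word, with or without one space after it,
    // as long as another non-space character follows.
    const finder = new RegExp(`(^|\\s)${p} ?(?=\\S)`, "g");
    // Emit the prefix followed by " ?", which means "an optional space" in a RegEx.
    out = out.replace(finder, `$1${p} ?`);
  }
  return out;
}

const spaced = optionalSpaceAfterPrefixes("عبد الجيد");
const joined = optionalSpaceAfterPrefixes("عبدالجيد");
console.log(spaced === joined);
console.log(new RegExp("^" + spaced + "$").test("عبدالجيد"));
```

In a fuller version this step would be combined with the letter-group substitution, so that the alef in "أبو" also accepts its other forms.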

Finally, irregularly spelled names such as "داوود" and "يحيى" are handled too, since some people write them with a different number of vowels.

Try IT!

Write whatever you want in this input, and it will immediately show you the RegEx string output beneath it.

compare("عبد الجيد أحمد حماده ابو ذكري", "عبدالجيد احمد حمادة أبوذكرى")

This expression evaluates to true.

The Underlying RegEx String:

(?:عبدال|عبد ال)ج[يئى]د [اأإآءئؤ]حمد حم[اأإآءئؤ]د[ةه] (?:[اأإآ]ب[وؤ][ء-ي]|[اأإآ]ب[وؤ] [ء-ي])كر[يئى]
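Because the output is an ordinary RegEx source string, it can be handed straight to the RegExp constructor. The sketch below copies the string shown above and tests both spellings from the example against it; the variable names and the ^ and $ anchors are additions made for this illustration.

```javascript
// The RegEx string shown above, copied verbatim.
const source =
  "(?:عبدال|عبد ال)ج[يئى]د [اأإآءئؤ]حمد حم[اأإآءئؤ]د[ةه] (?:[اأإآ]ب[وؤ][ء-ي]|[اأإآ]ب[وؤ] [ء-ي])كر[يئى]";

// Anchor the pattern so the whole string has to match (a choice made for this sketch).
const regex = new RegExp("^" + source + "$");

const first = "عبد الجيد أحمد حماده ابو ذكري";
const second = "عبدالجيد احمد حمادة أبوذكرى";

console.log(regex.test(first), regex.test(second));
```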

A Clearer Syntax

```javascript
// compare is the function provided by this package; it must be in scope here.
const text = "عبد الجيد أحمد حماده ابو ذكري";
const text_to_compare = "عبدالجيد احمد حمادة أبوذكرى";

const result = compare(text, text_to_compare);
```

Now you should be able to use this function in any check involving Arabic strings. You can even use it inside the pattern attribute of inputs to generate highly specific, smart patterns.

The sky is the limit!