startcodein/indicsoundex

Indian language soundex package based on Santhosh Thottingal's algorithm.

pkg:composer/startcodein/indicsoundex

0.1.0 2015-10-17 07:19 UTC

This package is not auto-updated.

Last update: 2026-01-18 01:13:03 UTC


README

Indian language soundex package based on Santhosh Thottingal's algorithm. For more information on the algorithm, check here

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. This module implements the Soundex algorithm for English as well as a modified version of the algorithm for Indian languages.
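
To get a feel for what a soundex code is, the English case can be tried with PHP's built-in soundex() function, which implements the classic algorithm. This is only an illustration and does not use this package; the built-in returns a four-character code with an uppercase first letter, so its output is formatted differently from this package's codes.

```php
<?php

// Classic English Soundex via PHP's built-in soundex() function.
// Shown only to illustrate the idea; this is not the package's API.
$names = ['Robert', 'Rupert', 'vasudev'];

foreach ($names as $name) {
    // Each name is reduced to its first letter plus three digits.
    echo $name . ' => ' . soundex($name) . PHP_EOL;
}

// Two names count as a phonetic match when their codes are equal.
var_dump(soundex('Robert') === soundex('Rupert')); // bool(true)
```

Both 'Robert' and 'Rupert' reduce to R163, which is why they are treated as sounding alike.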

This includes the major Indian languages, along with English:

  • Hindi (hi_IN)
  • Bengali (bn_IN)
  • Punjabi (pa_IN)
  • Gujarati (gu_IN)
  • Oriya (or_IN)
  • Tamil (ta_IN)
  • Telugu (te_IN)
  • Kannada (kn_IN)
  • Malayalam (ml_IN)
  • English (en_US)

This can be extended to any language by including a soundex character map for it.
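
As a rough sketch of what a character map does, the snippet below drives a soundex-style encoder from a plain array. Everything in it is an assumption made for illustration: the function name sketchSoundex, the constant EN_MAP (the standard English Soundex letter groups) and the eight-character zero-padded result (inferred from the sample output in this README) are not taken from the package's source. It also simplifies the classic rules by letting any unmapped character separate repeated codes.

```php
<?php

// Hypothetical character map: the standard English Soundex groups.
// This is NOT the package's actual data file.
const EN_MAP = [
    'b' => '1', 'f' => '1', 'p' => '1', 'v' => '1',
    'c' => '2', 'g' => '2', 'j' => '2', 'k' => '2',
    'q' => '2', 's' => '2', 'x' => '2', 'z' => '2',
    'd' => '3', 't' => '3',
    'l' => '4',
    'm' => '5', 'n' => '5',
    'r' => '6',
];

// Hypothetical helper: keeps the first character, replaces the rest
// with their mapped codes, and pads the result with zeros.
function sketchSoundex(string $word, array $map, int $length = 8): string
{
    if ($word === '') {
        return '';
    }

    // Split on characters, not bytes, so non-Latin scripts work too.
    $chars = mb_str_split(mb_strtolower($word));

    $code = $chars[0];
    $previous = $map[$chars[0]] ?? '';

    foreach (array_slice($chars, 1) as $char) {
        // Characters missing from the map (vowels here) carry no code.
        $digit = $map[$char] ?? '';
        // Skip unmapped characters and immediate repeats of a code.
        if ($digit !== '' && $digit !== $previous) {
            $code .= $digit;
        }
        $previous = $digit;
    }

    // Truncate to the target length, then pad with zeros.
    $code = mb_substr($code, 0, $length);
    return $code . str_repeat('0', $length - mb_strlen($code));
}

echo sketchSoundex('vasudev', EN_MAP) . PHP_EOL; // v2310000
```

Supporting another language in a design like this would mean supplying a different map array and leaving the encoder unchanged.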

Quick start

Installing using git

git clone https://github.com/startcodein/IndicSoundex.git

Installing using composer

composer require startcodein/indicsoundex:@dev

Generating soundex

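<?php

   // Load Composer's autoloader (assumes the package was installed with
   // Composer and this script sits next to the vendor/ directory).
   require __DIR__ . '/vendor/autoload.php';

   use Startcode\IndicSoundex\IndicSoundex as IndicSoundex;

   $sound = new IndicSoundex();

   // Print the soundex code of each word, one per line.
   echo $sound->soundex('ಬೆಂಗಳೂರು').PHP_EOL;
   echo $sound->soundex('आम्र् फल्').PHP_EOL;
   echo $sound->soundex('vasudev').PHP_EOL;
   echo $sound->soundex('Rupert्').PHP_EOL;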

This will give the output:

ಬDNFQCPC
आNPMQ000
v2310000
r1630000

Comparing string soundex

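<?php

   // Load Composer's autoloader (assumes the package was installed with
   // Composer and this script sits next to the vendor/ directory).
   require __DIR__ . '/vendor/autoload.php';

   use Startcode\IndicSoundex\IndicSoundex as IndicSoundex;

   $sound = new IndicSoundex();

   // Compare pairs of words by sound and print the result code for each.
   echo $sound->compare('बॆंगळूरु','आम्र् फल्').PHP_EOL;
   echo $sound->compare('Bangalore','ಬೆಂಗಳೂರು').PHP_EOL;
   echo $sound->compare('बॆंगळूरु','बॆंगळूरु').PHP_EOL;
   echo $sound->compare('അമ്മ','അമ').PHP_EOL;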

This will give output like this:

-1  // Not equal
-1  // Not equal
0   // Same word
1   // Similar

A fifth possible return value, 2, means the two strings are in different languages but sound similar.
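
The return values listed above can be turned into readable labels with a small helper. The function below is a hypothetical convenience, not part of the package, and it assumes compare() returns the integers shown in this README.

```php
<?php

// Hypothetical helper, not part of the package: maps the integer
// returned by compare() to a label, following the return values
// documented in this README.
function describeComparison(int $result): string
{
    switch ($result) {
        case 0:
            return 'same word';
        case 1:
            return 'similar';
        case 2:
            return 'similar, different languages';
        default:
            // -1 in the README's examples
            return 'not equal';
    }
}

echo describeComparison(1) . PHP_EOL; // similar
```

With the package loaded, this could be called as describeComparison($sound->compare($a, $b)).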

License

Copyright(c) 2015 Sanoob Pattanath

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Contributions

Contributions of any kind are appreciated. If you find any bugs or security issues, please email hello[at]pattanath.com or raise an issue on GitHub.