shdev / phpflashtext
A port of the flashtext python implementation
pkg:composer/shdev/phpflashtext
Requires
- php: >=5.6.0
Requires (Dev)
- php-coveralls/php-coveralls: ^2.0
- phpunit/phpunit: ^5.7
- symfony/stopwatch: ^3.4
- symfony/var-dumper: ^3.4
This package is not auto-updated.
Last update: 2025-10-26 11:08:07 UTC
README
This is a port of the wonderful Python project https://github.com/vi3k6i5/flashtext; see that project for the internals of the algorithm.
This algorithm lets you extract or replace many keywords at once. With 300 keywords of 5 variants each, a regex approach is already slower than the FlashText approach. With 1000 keywords of 5 variants each, the regex can no longer be built.
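To give a rough feel for why a single pass can handle many keywords, here is a minimal sketch of a character trie with longest-match lookup at word boundaries. It is not the library's implementation: the function names `buildTrie`, `extractWithTrie` and `isWordChar`, the `_end_` marker key, and the ASCII-only word-boundary rule are all assumptions made for illustration.

```php
<?php
// Illustrative sketch only; names and the '_end_' marker are hypothetical,
// not taken from shdev/phpflashtext.

/** Build a character trie. '_end_' marks a complete variant and stores its clean name. */
function buildTrie(array $variantsToClean)
{
    $trie = [];
    foreach ($variantsToClean as $variant => $clean) {
        $node = &$trie;
        foreach (str_split(strtolower((string) $variant)) as $char) {
            if (!isset($node[$char])) {
                $node[$char] = [];
            }
            $node = &$node[$char];
        }
        $node['_end_'] = $clean;
        unset($node); // break the reference before the next variant
    }
    return $trie;
}

/** Assumed word-character rule: ASCII letters, digits and underscore. */
function isWordChar($char)
{
    return ctype_alnum($char) || $char === '_';
}

/** Scan the text once, taking the longest variant that starts and ends on a word boundary. */
function extractWithTrie(array $trie, $text)
{
    $found = [];
    $lower = strtolower($text);
    $length = strlen($lower);
    $i = 0;
    while ($i < $length) {
        // Only start matching at the beginning of a word.
        if ($i > 0 && isWordChar($lower[$i - 1])) {
            $i++;
            continue;
        }
        $node = $trie;
        $best = null;
        $bestEnd = $i;
        for ($j = $i; $j < $length && isset($node[$lower[$j]]); $j++) {
            $node = $node[$lower[$j]];
            $atBoundary = ($j + 1 === $length) || !isWordChar($lower[$j + 1]);
            if (isset($node['_end_']) && $atBoundary) {
                $best = $node['_end_'];
                $bestEnd = $j + 1;
            }
        }
        if ($best !== null) {
            $found[] = $best;
            $i = $bestEnd; // continue after the matched variant
        } else {
            $i++;
        }
    }
    return $found;
}

$trie = buildTrie([
    'java_2e' => 'java',
    'java programing' => 'java',
    'product management techniques' => 'product management',
    'product management' => 'product management',
]);
$result = extractWithTrie($trie, 'I know java_2e and product management techniques');
// $result = ['java', 'product management']
```

The cost of the scan depends on the length of the text and the depth of the trie, not on how many variants were registered, which is the property the comparison with regex relies on.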
In PHP 5.6, using regex is really slow. In newer versions it performs better.
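For comparison, the regex approach referred to here can be sketched as one large alternation built from every keyword variant. This is only an assumed baseline, not code from this package: `extractWithRegex` is a hypothetical helper, and it expects the variant keys to be lower-case.

```php
<?php
// Hypothetical regex baseline for comparison; not part of shdev/phpflashtext.
function extractWithRegex(array $variantsToClean, $text)
{
    $variants = array_keys($variantsToClean);
    // Longest variants first, so the longer of two overlapping variants wins.
    usort($variants, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $quoted = array_map(function ($variant) {
        return preg_quote($variant, '/');
    }, $variants);
    // One alternation over all variants, bounded by non-word characters.
    $pattern = '/(?<![A-Za-z0-9_])(?:' . implode('|', $quoted) . ')(?![A-Za-z0-9_])/i';

    $found = [];
    if (preg_match_all($pattern, $text, $matches)) {
        foreach ($matches[0] as $match) {
            $found[] = $variantsToClean[strtolower($match)];
        }
    }
    return $found;
}

$regexResult = extractWithRegex([
    'java_2e' => 'java',
    'java programing' => 'java',
    'product management techniques' => 'product management',
    'product management' => 'product management',
], 'I know java_2e and product management techniques');
// $regexResult = ['java', 'product management']
```

The pattern grows with every variant added, which is why this approach degrades as the keyword list gets large.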
Install
composer require shdev/phpflashtext
Usage
    <?php

    use Shdev\FlashText\KeywordProcessor;

    $keywordProcessor = new KeywordProcessor();

    $keywords = [
        'java' => ['java_2e', 'java programing'],
        'product management' => ['product management techniques', 'product management'],
    ];
    $keywordProcessor->addKeywordsFromAssocArray($keywords);

    $sentence = 'I know java_2e and product management techniques';

    $keywordsExtracted = $keywordProcessor->extractKeywords($sentence);
    // $keywordsExtracted = ['java', 'product management']

    $keywordsExtractedWithSpanInfo = $keywordProcessor->extractKeywords($sentence, true);
    // $keywordsExtractedWithSpanInfo = [
    //     ['java', 7, 14],
    //     ['product management', 19, 48],
    // ]

    $sentenceNew = $keywordProcessor->replaceKeywords($sentence);
    // $sentenceNew = 'I know java and product management';
Citation
The original paper on the FlashText algorithm:
    @ARTICLE{2017arXiv171100046S,
       author = {{Singh}, V.},
        title = "{Replace or Retrieve Keywords In Documents at Scale}",
      journal = {ArXiv e-prints},
    archivePrefix = "arXiv",
       eprint = {1711.00046},
     primaryClass = "cs.DS",
     keywords = {Computer Science - Data Structures and Algorithms},
         year = 2017,
        month = oct,
       adsurl = {http://adsabs.harvard.edu/abs/2017arXiv171100046S},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
    }
An article on the algorithm was also published on Medium (freeCodeCamp).