arthurkushman / detox
Detox is a library for detecting toxic text (comments, posts, etc.) of variable length using different patterns
Requires
- php: ^7.1
Requires (Dev)
- fzaninotto/faker: ^1.7
- mockery/mockery: ~1.0
- phpunit/phpunit: >=6.5
This package is auto-updated.
Last update: 2024-12-20 17:19:41 UTC
README
Detox was created to provide a tool for simple scoring and filtering in pure PHP, without the need to set up multiple libraries (probably written in C or Python) or import database dumps.
Installation
composer require arthurkushman/detox
Using words and word patterns
To get a toxicity score for any text:
$text = new Text('Some text');
$words = new Words(new EnglishSet(), $text);
$words->processWords();
if ($words->getScore() >= 0.5) {
    echo 'Toxic text detected';
}
To test an input string for asterisk pattern occurrences:
$words->processPatterns();
if ($words->getScore() >= 0.5) {
    echo 'Toxic text detected';
}
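The README does not spell out what an asterisk pattern looks like. As a rough illustration only, assuming a pattern such as `d*n` in which `*` stands for one or more non-space characters, wildcard matching could be sketched in plain PHP as follows. The function `matchesAsteriskPattern` is hypothetical and is not part of Detox; the library's own matching rules may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Detox): checks whether $text contains a
 * word matching $pattern, where each '*' in the pattern stands for one or
 * more non-space characters (an assumption about the pattern syntax).
 */
function matchesAsteriskPattern(string $pattern, string $text): bool
{
    // Escape regex metacharacters, then turn the escaped '\*' into a wildcard.
    $regex = str_replace('\*', '\S+', preg_quote($pattern, '/'));

    // Match on word boundaries, case-insensitively.
    return preg_match('/\b' . $regex . '\b/iu', $text) === 1;
}

var_dump(matchesAsteriskPattern('d*n', 'Well, damn it')); // bool(true)
var_dump(matchesAsteriskPattern('d*n', 'Well done'));     // bool(false)
```

The second call returns false because the pattern has to end at a word boundary, and "done" continues past the "n".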
Using phrases
Another option is to check for phrases:
$phrases = new Phrases(new EnglishSet(), $text);
$phrases->processPhrases();
if ($phrases->getScore() >= 0.5) {
    echo 'Toxic text detected';
}
Nothing prevents you from combining all of these options, so you can do the following:
// The Phrases object extends Words, so all inherited methods are available
$detox = new Phrases(new EnglishSet(), $text);
$detox->processWords();

// change the string in the Text object
$text->setString('Another text');
// inject the Text object into Phrases
$detox->setText($text);
$detox->processPhrases();

$text->setString('Yet another text');
$detox->setText($text);
$detox->processPatterns();

if ($detox->getScore() >= 0.5) {
    echo 'Toxic text detected';
}
Replace with custom templates and prefix/postfix pre-sets
An additional option that you may need in particular situations is to replace words/phrases with a pre-set template:
$this->text->setPrefix('[');
$this->text->setPostfix(']');
$this->text->setReplaceChars('____');
$this->text->setString('Just piss off dude');
$this->text->setReplaceable(true);
$this->phrases->setText($this->text);
$this->phrases->processPhrases();
echo $this->phrases->getText()->getString(); // output: Just [____] dude
By default the replacement pattern is 5 dashes, so you only need to call $this->text->setReplaceable(true);
before any processor to get replacement with the default settings.
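To make the prefix / replace-chars / postfix template concrete, here is a minimal plain-PHP sketch of the same idea. It is not Detox's implementation: `maskPhrase` is a hypothetical helper, and the case-insensitive matching is an assumption made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Detox): replaces every case-insensitive
 * occurrence of $phrase in $text with $prefix . $replaceChars . $postfix.
 */
function maskPhrase(
    string $text,
    string $phrase,
    string $replaceChars = '-----', // 5 dashes, mirroring the default described above
    string $prefix = '',
    string $postfix = ''
): string {
    return str_ireplace($phrase, $prefix . $replaceChars . $postfix, $text);
}

echo maskPhrase('Just piss off dude', 'piss off', '____', '[', ']'), PHP_EOL;
// prints: Just [____] dude
```

Calling `maskPhrase('Just piss off dude', 'piss off')` with no further arguments would fall back to the 5-dash default and no prefix or postfix.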
Creating custom data-set
$customSet = new CustomSet();
$customSet->setWords([
    '0.9' => ['weird'],
]);
$this->text->setString('This weird text should be detected');
$this->words->setText($this->text);
$this->words->setDataSet($customSet);
$this->words->processWords();
echo $this->words->getScore(); // output: 0.9
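The data-set above maps a weight (as a string key) to a list of words. As an illustration of how such a map can be turned into a score, here is a small self-contained sketch. `scoreText` is a hypothetical function, and taking the highest matched weight is an assumption; the README does not say how Detox combines several matches.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Detox): returns the highest weight whose
 * word list contains a word found in $text, or 0.0 when nothing matches.
 *
 * @param array $weightedWords weight (string key) => list of words
 */
function scoreText(array $weightedWords, string $text): float
{
    // Lower-case the text and split it into words on any non-letter/digit run.
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $score = 0.0;

    foreach ($weightedWords as $weight => $words) {
        foreach ($words as $word) {
            if (in_array(strtolower($word), $tokens, true)) {
                // Assumption: the text's score is the largest matched weight.
                $score = max($score, (float) $weight);
            }
        }
    }

    return $score;
}

echo scoreText(['0.9' => ['weird']], 'This weird text should be detected'), PHP_EOL;
// prints: 0.9
```

The string keys matter here: PHP casts float array keys to integers, so `0.9` written as a float key would collapse to `0`.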
Run tests
In the root directory, run the following in a console:
phpunit
Be sure to install phpunit globally, or run it from vendor:
vendor/bin/phpunit
All contributions are welcome