hugsbrugs / php-keywords
PHP Text Utilities
Requires
- html2text/html2text: dev-master
- hugsbrugs/php-xpath: dev-master
- patrickschur/language-detection: ^5.3.0
- voku/stop-words: dev-master
This package is auto-updated.
Last update: 2024-12-18 10:07:31 UTC
README
Test my free and online Keyword Density Checker
This library provides PHP functions to extract keywords from HTML and text. Read PHP DOC
Install
Install package with composer
composer require hugsbrugs/php-keywords
In your PHP code, load the library
require_once __DIR__ . '/vendor/autoload.php';
use Hug\Keywords\Keywords as Keywords;
Usage
If your input is HTML, first extract the text from it (this call also returns the title and meta description)
$text = Keywords::get_text_from_html($html);
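To make the idea of this step concrete, here is a minimal, dependency-free sketch of pulling a title, a meta description and the visible text out of an HTML string. It is not the library's implementation (the package relies on html2text and php-xpath for this); the function name sketch_text_from_html, the returned array keys and the regex-based approach are all assumptions made for illustration, and the meta pattern only handles name="description" written before content="...".

```php
<?php
// Illustrative stand-in, NOT the library's code: a rough text extractor
// built only on core PHP string and PCRE functions.
function sketch_text_from_html(string $html): array
{
    // Title: content of the first <title> element, if any.
    $title = preg_match('~<title[^>]*>(.*?)</title>~is', $html, $m) ? trim($m[1]) : '';

    // Meta description: assumes the name attribute comes before content.
    $description = preg_match(
        '~<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']~is',
        $html,
        $m
    ) ? trim($m[1]) : '';

    // Drop <head>, <script> and <style> blocks so their content is not counted as text.
    $body = preg_replace('~<(script|style|head)\b[^>]*>.*?</\1>~is', ' ', $html);

    // Put a space before every tag so adjacent elements do not get glued together,
    // then strip the tags and decode entities such as &amp;.
    $body = str_replace('<', ' <', $body);
    $text = html_entity_decode(strip_tags($body), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    $text = trim(preg_replace('/\s+/u', ' ', $text));

    return ['title' => $title, 'description' => $description, 'text' => $text];
}

// Example call on a tiny page.
print_r(sketch_text_from_html(
    '<html><head><title>Hello</title></head><body><p>Some &amp; text</p></body></html>'
));
```

For real pages, a proper HTML parser is more robust than regular expressions; the sketch only shows which three pieces of information this step is about.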
Then instantiate the Keywords class with the text as the only required parameter. In this case the patrickschur/language-detection library is used to detect the language automatically.
$Keywords = new Keywords($text);
$kws = $Keywords->keywords;
If you know the language of your text, pass its code as the second argument, lang. The matching stop-word list is then loaded from the voku/stop-words library. The language codes supported by that library are: ar, bg, ca, cz, da, de, el, en, eo, es, et, fi, fr, hi, hr, hu, id, it, ka, lt, lv, nl, no, pl, pt, ro, ru, sk, sv, tr, uk, vi.
$Keywords = new Keywords($text, $lang);
$kws = $Keywords->keywords;
If your language is not supported by the voku/stop-words library, or if you want to use your own stop-word list, pass your own stop-word array as the 3rd argument. The example below keeps 'auto' as the 2nd argument.
$Keywords = new Keywords($text, 'auto', ['my custom stop word array']);
$kws = $Keywords->keywords;
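As a rough picture of what a custom stop-word list does, the sketch below removes stop words from a text before any counting happens. It is a simplified stand-in and not the library's code: the function name sketch_remove_stop_words is hypothetical, and the lowercase-then-split-on-whitespace tokenisation is an assumption.

```php
<?php
// Hypothetical helper: lowercase the text, split it on whitespace,
// and drop every word found in the stop-word list.
function sketch_remove_stop_words(string $text, array $stopWords): array
{
    // Prefer the multibyte-aware function when the mbstring extension is loaded.
    $lower = function_exists('mb_strtolower') ? 'mb_strtolower' : 'strtolower';

    $words = preg_split('/\s+/u', $lower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);

    // Flip the list so membership checks are constant-time key lookups.
    $stop = array_flip(array_map($lower, $stopWords));

    return array_values(array_filter($words, fn($word) => !isset($stop[$word])));
}

$words = sketch_remove_stop_words('La naturopathie est une médecine', ['la', 'est', 'une']);
echo implode(', ', $words), PHP_EOL; // prints "naturopathie, médecine"
```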
You can optionally pass a 4th argument: the maximum number of keywords to return. It defaults to 20; pass 0 to get all keywords. By default, only keywords occurring at least twice are returned (see the 6th argument).
$Keywords = new Keywords($text, 'fr', [], 10);
$kws = $Keywords->keywords;
You can also pass an optional 5th argument: an array of characters to remove from the analysed text. The default list is : | / & : , ; ! ? _ * - - ... → – « » + ✔ # ¿ < > [ ] { }
$Keywords = new Keywords($text, 'fr', [], 10, ['my custom chars list']);
$kws = $Keywords->keywords;
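The effect of such a character list can be pictured with the short sketch below. It is an assumption-laden illustration rather than the library's code: the function name sketch_strip_chars is hypothetical, and replacing each listed character with a space (instead of deleting it outright) is a choice made here so that neighbouring words stay separated.

```php
<?php
// Hypothetical helper: blank out every listed character, then tidy whitespace.
function sketch_strip_chars(string $text, array $chars): string
{
    // str_replace accepts an array of needles and swaps each one for a space.
    $text = str_replace($chars, ' ', $text);

    // Collapse the resulting runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/u', ' ', $text));
}

echo sketch_strip_chars('Naturopathie | définition : bien-être !', ['|', ':', '!']), PHP_EOL;
```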
The optional 6th argument is the minimum number of occurrences a keyword needs in order to be returned. It defaults to 2, so only keywords occurring at least twice are returned. To get every keyword, including those occurring only once, set this parameter to 1.
$Keywords = new Keywords($text, 'fr', [], 10, ['my custom chars list'], 1);
$kws = $Keywords->keywords;
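How the maximum-keywords and minimum-occurrence arguments interact can be sketched in plain PHP as follows. This is not the library's implementation: the function name sketch_ngram_counts is hypothetical, and the density figure (occurrences divided by total word count, times 100, formatted to two decimals) is an assumption about what the second value in the output represents.

```php
<?php
// Hypothetical helper: count n-word phrases, keep those that occur often enough,
// and return at most $maxKeywords of them (0 means no limit).
function sketch_ngram_counts(array $words, int $n, int $maxKeywords = 20, int $minOccurrence = 2): array
{
    $counts = [];
    // Slide a window of $n words over the list and count each phrase.
    for ($i = 0; $i + $n <= count($words); $i++) {
        $gram = implode(' ', array_slice($words, $i, $n));
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }

    // Keep only phrases that reach the minimum occurrence.
    $counts = array_filter($counts, fn($count) => $count >= $minOccurrence);

    // Most frequent first, then apply the limit unless it is 0.
    arsort($counts);
    if ($maxKeywords > 0) {
        $counts = array_slice($counts, 0, $maxKeywords, true);
    }

    // Assumed density: share of the total word count, as a percentage string.
    $total = max(count($words), 1);
    $result = [];
    foreach ($counts as $gram => $count) {
        $result[$gram] = [$count, number_format($count / $total * 100, 2)];
    }

    return $result;
}

print_r(sketch_ngram_counts(['a', 'b', 'a', 'b', 'a', 'c'], 1));
```

With the defaults, the word that appears only once is dropped; passing 1 as the last argument would keep it, and passing 0 as the limit would return every phrase that passes the occurrence filter.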
For the url https://naturo-paca.fr/definition-naturopathie, the library outputs :
[
{
"1": {
"naturopathe": [
12,
"0.61"
],
"m\u00e9decines": [
11,
"0.56"
],
"naturopathie": [
9,
"0.46"
],
"techniques": [
9,
"0.46"
],
"m\u00e9decine": [
9,
"0.46"
],
...
},
"2": {
"marie maugey": [
5,
"0.26"
],
"maugey naturopathe": [
4,
"0.20"
],
"\u2013 hippocrate": [
3,
"0.15"
],
"m\u00e9decines alternatives": [
2,
"0.10"
],
"m\u00e9decine conventionnelle": [
2,
"0.10"
],
...
},
"3": {
"marie maugey naturopathe": [
4,
"0.20"
],
"utilisation de techniques": [
3,
"0.15"
],
"associe cette technique": [
3,
"0.15"
],
"technique \u00e0 l\u2019\u00e9l\u00e9ment": [
3,
"0.15"
],
"s'adresse la naturopathie": [
2,
"0.10"
],
...
},
"4": {
"on associe cette technique": [
3,
"0.15"
],
"associe cette technique \u00e0": [
3,
"0.15"
],
"cette technique \u00e0 l\u2019\u00e9l\u00e9ment": [
3,
"0.15"
],
"qui s'adresse la naturopathie": [
2,
"0.10"
],
"la prise en charge": [
2,
"0.10"
],
...
}
}
]
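The output above groups keywords by phrase length (1 to 4 words), and each keyword maps to a pair whose first value is the occurrence count. The second value looks like a density figure, though that reading is an assumption. The sketch below flattens such a structure into simple rows; the function name sketch_flatten_keywords and the row keys are hypothetical, and the sample JSON copies only a few entries from the output shown.

```php
<?php
// Hypothetical helper: turn the grouped structure into one row per keyword.
function sketch_flatten_keywords(array $groups): array
{
    $rows = [];
    foreach ($groups as $size => $keywords) {
        foreach ($keywords as $keyword => [$count, $density]) {
            $rows[] = [
                'size'    => (int) $size,       // number of words in the phrase
                'keyword' => (string) $keyword,
                'count'   => $count,            // occurrences in the text
                'density' => (float) $density,  // assumed to be a percentage
            ];
        }
    }

    return $rows;
}

// A small excerpt of the output shown above, as JSON.
$json = '[{"1":{"naturopathe":[12,"0.61"],"m\u00e9decines":[11,"0.56"]},"2":{"marie maugey":[5,"0.26"]}}]';
$decoded = json_decode($json, true);

// The sample is a list holding one grouped object, hence the [0].
foreach (sketch_flatten_keywords($decoded[0]) as $row) {
    echo $row['size'], ' | ', $row['keyword'], ' | ', $row['count'], ' | ', $row['density'], PHP_EOL;
}
```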
Unit Tests
https://github.com/php-coveralls/php-coveralls
vendor/phpunit/phpunit/phpunit --configuration phpunit.xml
Author
Hugo Maugey visit my website ;)
Online Tools
https://copywritely.com/keyword-density-checker/
Dependencies
https://github.com/voku/stop-words
https://github.com/mtibben/html2text