seoservice2020 / phpmorphy
phpMorphy - morphological analyzer library for Russian, Ukrainian, English and German languages.
Requires
- php: >=7.2 <8.0
- ext-mbstring: *
Requires (Dev)
- phpunit/phpunit: ^8.0
README
phpMorphy is a morphological analyzer library for the Russian, Ukrainian, English and German languages.
This version supports only PHP 7.2, 7.3 and 7.4.
This library retrieves the following morphological information for any word:
- base (normal) form;
- all forms;
- grammatical information (part of speech, grammemes).
Installation
Run the following command from your terminal:
composer require seoservice2020/phpmorphy
Or add this to the require section of your composer.json file:
{ "require": { "seoservice2020/phpmorphy": "~2.2" } }
Then run composer update.
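The package requires PHP >=7.2 <8.0, so it can help to confirm the local PHP version before installing. The following is a minimal sketch of such a check; the function name php_version_supported is a hypothetical helper and is not part of phpMorphy or Composer.

```shell
#!/bin/sh
# Hypothetical helper (not part of phpMorphy): succeeds when the given
# PHP version string satisfies the package constraint ">=7.2 <8.0".
php_version_supported() {
    version=$1
    # Require at least "major.minor".
    case "$version" in
        *.*) ;;
        *) return 1 ;;
    esac
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # text up to the next dot, if any
    # Reject empty or non-numeric components.
    case "$major" in ''|*[!0-9]*) return 1 ;; esac
    case "$minor" in ''|*[!0-9]*) return 1 ;; esac
    [ "$major" -eq 7 ] && [ "$minor" -ge 2 ]
}

# Possible usage with the locally installed interpreter:
#   php_version_supported "$(php -r 'echo PHP_VERSION;')" || echo "unsupported PHP version"
```

The check only looks at the major and minor components, which is enough for this constraint because every 7.x release from 7.2 onward falls inside the range.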
Usage
See the examples in the examples directory.
Building dictionaries
To build your dictionary from one of the sources:
- Create an XML file from the dictionary source's native format. For example, for AOT, use the bin/dict-processing/convert-mrd2xml.php script:
  php bin/dict-processing/convert-mrd2xml.php path/to/aot/dict/file.mwz path/to/output/
  For the Russian language, you may also use bin/dict-processing/convert-russian-jo.php to convert the XML file with the Russian dictionary into a format without the ё letter.
- Build the phpMorphy dictionary files using bin/dict-build/build-dict.php.
  The package currently includes a morphy builder tool for Windows (see the bin/morph-builder/ folder), but you can specify your own version of the morphy builder tool. Important! The morphy builder executable should be located at bin/morphy_builder.exe.
  You may need to provide source-specific data to the script. For example, for AOT you need to provide the path to the AOT sources root.
Both the morphy builder path and the AOT path arguments are optional. As before, you may also provide them through environment variables:
- MORPHY_DIR - morphy builder tool root path
- RML - AOT sources root path
Environment variables are checked first for backward compatibility.
Example:
php bin/dict-build/build-dict.php path/to/xml/ru_RU.xml path/to/output/ utf-8 1 1 path/to/morphy/builder/root/folder/ path/to/aot/root/folder
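The precedence between environment variables and command-line arguments can be illustrated with a short shell sketch. It assumes that "checked first" means a non-empty environment variable wins over the corresponding argument. The function resolve_path is a hypothetical helper written for this illustration and does not reproduce the actual logic of build-dict.php.

```shell
#!/bin/sh
# Hypothetical helper (not part of phpMorphy): prints the environment
# value when it is non-empty, otherwise the command-line fallback.
resolve_path() {
    env_value=$1
    arg_value=$2
    if [ -n "$env_value" ]; then
        printf '%s\n' "$env_value"
    else
        printf '%s\n' "$arg_value"
    fi
}

# Assumed behaviour: MORPHY_DIR and RML take precedence over the
# positional arguments when they are set.
builder_root=$(resolve_path "${MORPHY_DIR:-}" "path/to/morphy/builder/root/folder/")
aot_root=$(resolve_path "${RML:-}" "path/to/aot/root/folder")
echo "morphy builder root: $builder_root"
echo "AOT sources root: $aot_root"
```

Under this assumption, running the build script with MORPHY_DIR or RML exported makes the matching path argument unnecessary.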
Exporting dictionaries
If you need specific dictionaries for phpMorphy, categorized dictionaries are available in the dicts/categorized/ folder. All dictionaries are uppercase.
Default dictionaries are:
- Russian language: AOT UTF-8 uppercase dictionary with ё letter support
- English language: AOT UTF-8 uppercase dictionary
- German language: AOT UTF-8 uppercase dictionary
- Ukrainian language: MySpell UTF-8 uppercase dictionary
Speed (DEPRECATED)
Single word mode
Bulk mode
Note: All values are speeds in words per second. Test platform: PHP 5.2.3, AMD Duron 800 with 512 MB of memory, Windows XP.