seoservice2020/phpmorphy

phpMorphy - morphological analyzer library for Russian, Ukrainian, English and German languages.

2.2.2 2021-01-31 05:13 UTC

This package is auto-updated.

Last update: 2024-05-29 04:29:13 UTC


README

Latest Stable Version Total Downloads tests codecov License

phpMorphy is morphological analyzer library for Russian, Ukrainian, English and German languages.

This version supports only PHP 7.2, 7.3 and 7.4.

This library allows to retrieve following morph information for any word:

  • base (normal) form;
  • all forms;
  • grammatical (part of speech, grammems) information.

Installation

Run the following command from your terminal:

composer require seoservice2020/phpmorphy

Or add this to require section in your composer.json file:

{
    "require": {
        "seoservice2020/phpmorphy": "~2.2"
    }
}

then run composer update

Usage

See examples in examples directory.

Building dictionaries

To build your dictionary from one of the sources:

  1. Create an XML file from dictionary source native format, e.g. for AOT, use bin/dict-processing/convert-mrd2xml.php script:

    php bin/dict-processing/convert-mrd2xml.php path/to/aot/dict/file.mwz path/to/otput/

    Also for Russian language, you may use bin/dict-processing/convert-russian-jo.php to convert XML with Russian dictionary into format without ё letter.

  2. Build phpMorphy dictionaries files using bin/dict-build/build-dict.php:

    At now package has some morphy builder tool for Windows (see bin/morph-builder/ folder), but you can specify your own morphy builder tool version. Important! Morphy builder executable should be in bin/morphy_builder.exe file.

    You may need to provide source-specific data for script, e.g. for AOT you will need to provide path to AOT sources root.

    Both morphy builder path and AOT path arguments are optional. As it was before, you also may provide environment variables:

    • MORPHY_DIR - morphy builder tool root path
    • RML - AOT sources root path

    Environment variables are checked first for backward compatibility.

    Example:

    php bin/dict-build/build-dict.php path/to/xml/ru_RU.xml path/to/otput/ utf-8 1 1 path/to/morphy/builder/root/folder/ path/to/aot/root/folder

Exporting dictionaries

If you need to use some specific dictionaries with phpMorphy, there are categorized dictionaries in dicts/categorized/ folder. All dictionaries are uppercase.

Default dictionaries are:

  • Russian language: AOT UTF-8 uppercase dictionary with ё letter support
  • English language: AOT UTF-8 uppercase dictionary
  • German language: AOT UTF-8 uppercase dictionary
  • Ukrainian language: MySpell UTF-8 uppercase dictionary

Speed (DEPRECATED)

Single word mode

mode base form all forms all forms with gram. info
FILE 1000 800 600
SHM 2200 1100 800
MEM 2500 1200 900

Bulk mode

mode base form all forms all forms with gram. info
FILE 1700 800 700
SHM 3200 800 700
MEM 3500 800 700

Note:

All values are words per second speed. Test platform: PHP 5.2.3, AMD Duron 800 with 512Mb memory, WinXP.