denis-korolev / opencorpora
Library to serialize opencorpora export file data from xml to objects
v1.0.0
2022-01-18 20:29 UTC
Requires
- php: ^7.4
- ext-dom: *
- ext-simplexml: *
- jms/serializer: ^3.17
- symfony/cache: ^5.4
Requires (Dev)
- ext-xmlreader: *
- overtrue/phplint: ^2.0
- phpunit/phpunit: ^9.2
- roave/security-advisories: dev-master
- squizlabs/php_codesniffer: ^3.5
- vimeo/psalm: ^3.8
This package is auto-updated.
Last update: 2024-12-19 02:56:24 UTC
README
This library helps you read an OpenCorpora export file. It provides five processors:
- GrammemeProcessor (reads only Grammeme node)
- LemmaProcessor (reads only Lemma node)
- LinksProcessor (reads only Links node)
- LinkTypeProcessor (reads only LinkType node)
- RestrictionProcessor (reads only Restr node)
Use these processors to extract XML data into simple DTO objects. The XML file is opened and read node by node with PHP's XMLReader and SimpleXMLElement, so memory usage stays low even for a large export file.
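The node-by-node reading described above can be sketched roughly as follows. This is a minimal illustration of the general XMLReader plus SimpleXMLElement streaming pattern, not the library's actual implementation: the helper name `streamElements` is hypothetical, and the tiny sample document only stands in for a real export file.

```php
<?php
declare(strict_types=1);

/**
 * Yield every element named $nodeName as a SimpleXMLElement, one at a time.
 * Hypothetical helper for illustration; not part of the library's API.
 */
function streamElements(string $fileName, string $nodeName): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($fileName)) {
        throw new RuntimeException("Cannot open {$fileName}");
    }

    try {
        $more = $reader->read();
        while ($more) {
            if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $nodeName) {
                // Only this element's subtree is materialized in memory.
                yield new SimpleXMLElement($reader->readOuterXml());
                // Skip past the subtree that was just handed out.
                $more = $reader->next();
            } else {
                $more = $reader->read();
            }
        }
    } finally {
        $reader->close();
    }
}

// Tiny stand-in for an export file (structure is illustrative only).
$xml = '<dictionary><grammemes>'
    . '<grammeme parent=""><name>POST</name></grammeme>'
    . '<grammeme parent="POST"><name>NOUN</name></grammeme>'
    . '</grammemes></dictionary>';
$path = tempnam(sys_get_temp_dir(), 'oc');
file_put_contents($path, $xml);

foreach (streamElements($path, 'grammeme') as $node) {
    echo $node->name, ' (parent: ', $node['parent'], ')', PHP_EOL;
}
unlink($path);
```

Because the function is a generator, the caller holds only one small element at a time, which is the property that keeps memory usage low on a large file.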
Installation / Usage
Install the latest version via composer:
composer require denis-korolev/opencorpora
Here is an example of using GrammemeProcessor. The other processors are used in exactly the same way.
use JMS\Serializer\Naming\IdenticalPropertyNamingStrategy;
use JMS\Serializer\Naming\SerializedNameAnnotationStrategy;
use JMS\Serializer\SerializerBuilder;
use Opencorpora\Dictionary\Grammeme;
use Opencorpora\GrammemeProcessor;

$serializer = SerializerBuilder::create()
    ->setPropertyNamingStrategy(
        new SerializedNameAnnotationStrategy(
            new IdenticalPropertyNamingStrategy()
        )
    )
    ->build();

// path to file
$fileName = $this->projectDir . DIRECTORY_SEPARATOR . 'var' . DIRECTORY_SEPARATOR . 'dict.opcorpora.xml';

$processor = new GrammemeProcessor($serializer);
foreach ($processor->getData($fileName) as $grammeme) {
    /**
     * @var $grammeme Grammeme
     */
    echo $grammeme->name;
    echo $grammeme->parent;
    echo $grammeme->description;
    echo $grammeme->alias;
}