bitandblack / german-words
This library provides a huge dataset of German words and their grammar rules
Requires
- php: >=7.2
- ext-mbstring: *
- league/csv: ^9.2
Requires (Dev)
- phpstan/phpstan: ^1.0
- phpunit/phpunit: ^10.0
- rector/rector: ^0
- symplify/easy-coding-standard: ^12.0
Suggests
- bitandblack/composer-helper: Helps finding the path to the words file
README
German Words
This library provides a huge dataset of 102,500 German words and their grammar rules.
The data is taken from gambolputty/german-nouns. More information about the different columns can be found here. The original source is WiktionaryDE, licensed under Creative Commons Attribution-ShareAlike 3.0 Unported.
Installation
This library is made for use with Composer. Add it to your project by running $ composer require bitandblack/german-words.
Usage
Set up a Words object and give it the file loaded by the CSV loader:
<?php
use BitAndBlack\File\CSV;
use BitAndBlack\Words;
$datasetFull = 'data/words.csv';
$fullLoader = new CSV($datasetFull, 0);
$words = new Words($fullLoader);
You can now access the words by calling get(), for example:
<?php
$word = $words->get('Hose')->getSingular(true);
var_dump($word);
This will dump die Hose.
Performance
Cache
The dataset is very large and takes a long time to load, which is why you can set up a cache. All loaded words are then stored in the cache file. The full dataset is only loaded when a requested word isn't found in the cached dataset file. To use the cache, set it up like this:
<?php
use BitAndBlack\Cache\Cache;
use BitAndBlack\File\CSV;
use BitAndBlack\Words;
$datasetFull = 'data/words.csv';
$datasetCached = 'data/words-cached.csv';
$fullLoader = new CSV($datasetFull, 0);
$cacheLoader = new CSV($datasetCached, 0);
$words = new Words(
    $fullLoader,
    new Cache($cacheLoader)
);
Ignore Words
When a word doesn't exist in the cache, the script always loads the full dataset. If the word doesn't exist there either, you can store it in a list of ignored words. Whenever a word appears on this list, has() returns false without loading the whole dataset.
Set it up like this:
<?php
use BitAndBlack\Cache\Cache;
use BitAndBlack\File\CSV;
use BitAndBlack\Words;
use BitAndBlack\IgnoredWords\IgnoredWords;
$datasetFull = 'data/words.csv';
$datasetCached = 'data/words-cached.csv';
$datasetIgnored = 'data/words-ignored.csv';
$fullLoader = new CSV($datasetFull, 0);
$cacheLoader = new CSV($datasetCached, 0);
$ignoredLoader = new CSV($datasetIgnored, 0);
$words = new Words(
    $fullLoader,
    new Cache($cacheLoader),
    new IgnoredWords($ignoredLoader)
);
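To make the lookup order described above easier to picture, here is a minimal, self-contained sketch of the idea. It is not the library's implementation: the class TieredLookupSketch, its find() method, and the fullLoads counter are hypothetical names, and plain arrays stand in for the three CSV files. It also assumes that a miss is recorded on the ignore list automatically, which the library may handle differently.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: plain arrays stand in for the three CSV files.
// None of these names belong to the bitandblack/german-words API.
final class TieredLookupSketch
{
    /** @var array<string, string> word => cached entry */
    private $cache = [];

    /** @var array<string, bool> words known to be missing everywhere */
    private $ignored = [];

    /** @var callable returns the full dataset as word => entry */
    private $loadFullDataset;

    /** @var int how often the expensive full load ran */
    public $fullLoads = 0;

    public function __construct(callable $loadFullDataset)
    {
        $this->loadFullDataset = $loadFullDataset;
    }

    public function find(string $word): ?string
    {
        // 1. Ignored words are answered without touching any dataset.
        if (isset($this->ignored[$word])) {
            return null;
        }

        // 2. Cached words are answered from the small cache.
        if (isset($this->cache[$word])) {
            return $this->cache[$word];
        }

        // 3. Only now pay the cost of loading the full dataset.
        ++$this->fullLoads;
        $full = ($this->loadFullDataset)();

        if (isset($full[$word])) {
            // Store the hit so the next request is served from the cache.
            $this->cache[$word] = $full[$word];
            return $full[$word];
        }

        // Assumption: remember the miss so the next request skips the full load.
        $this->ignored[$word] = true;
        return null;
    }
}

// A tiny stand-in for the full dataset.
$lookup = new TieredLookupSketch(function (): array {
    return ['Hose' => 'die Hose'];
});

$first = $lookup->find('Hose');        // triggers one full load
$second = $lookup->find('Hose');       // served from the cache
$missing = $lookup->find('Xyz');       // triggers a second full load, then is ignored
$missingAgain = $lookup->find('Xyz');  // answered from the ignore list
```

In this sketch the full dataset is loaded only twice for four requests, which is the saving the cache and the ignore list are meant to provide.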
Help
If you have any questions, feel free to contact us at hello@bitandblack.com.
More information about Bit&Black can be found at www.bitandblack.com.