bitandblack/german-words

This library provides a huge dataset of German words and their grammar rules.

0.9.0 2023-12-11 07:39 UTC

Last update: 2024-11-11 09:34:41 UTC



German Words

This library provides a huge dataset of 102,500 German words and their grammar rules.

The dataset is taken from gambolputty/german-nouns. More information about the different columns can be found in that project's documentation. The original source is WiktionaryDE, licensed under Creative Commons Attribution-ShareAlike 3.0 Unported.

Installation

This library is meant to be used with Composer. Add it to your project by running $ composer require bitandblack/german-words.

Usage

Set up a Words object and pass it a CSV loader that points to the dataset file:

<?php 

use BitAndBlack\File\CSV;
use BitAndBlack\Words;

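// Path to the CSV file that contains the full dataset
$datasetFull = 'data/words.csv';

// The CSV loader reads the dataset; the Words object looks words up in it
$fullLoader = new CSV($datasetFull, 0);
$words = new Words($fullLoader);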

You can now access the words by calling get(), for example:

<?php 

$word = $words->get('Hose')->getSingular(true);
var_dump($word);

This will dump "die Hose".

Performance

Cache

The dataset is very large and takes a long time to load, which is why you can set up a cache. All loaded words are then stored in the cache file, and the full dataset is only loaded when a requested word isn't found there. To use the cache, set it up like this:

<?php 

use BitAndBlack\Cache\Cache;
use BitAndBlack\File\CSV;
use BitAndBlack\Words;

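// Full dataset and the cache file that collects the words already loaded
$datasetFull = 'data/words.csv';
$datasetCached = 'data/words-cached.csv';

$fullLoader = new CSV($datasetFull, 0);
$cacheLoader = new CSV($datasetCached, 0);

// Pass the cache as second argument
$words = new Words(
    $fullLoader,
    new Cache($cacheLoader)
);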
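
The caching behaviour described above can be illustrated with a small, self-contained sketch. It does not use the library's classes: the arrays $fullDataset and $cachedDataset and the closure $lookup are hypothetical stand-ins for the two CSV files and the lookup, and the counter $fullLoads only exists to show how often the expensive full dataset would be consulted.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory stand-ins for the two CSV files (assumption, not library code).
$fullDataset = ['Hose' => 'die Hose', 'Haus' => 'das Haus'];
$cachedDataset = [];

// Counts how often the full dataset had to be consulted.
$fullLoads = 0;

// Looks a word up in the cache first and falls back to the full dataset on a miss.
// Words found in the full dataset are copied into the cache.
$lookup = function (string $word) use ($fullDataset, &$cachedDataset, &$fullLoads): ?string {
    if (array_key_exists($word, $cachedDataset)) {
        return $cachedDataset[$word];
    }

    ++$fullLoads;

    if (!array_key_exists($word, $fullDataset)) {
        return null;
    }

    return $cachedDataset[$word] = $fullDataset[$word];
};

$first = $lookup('Hose');  // cache miss: the full dataset is consulted
$second = $lookup('Hose'); // cache hit: the full dataset stays untouched
```

Both calls return the same entry, but only the first one touches the full dataset. This is the effect the cache file is meant to have across requests.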

Ignore Words

When a word doesn't exist in the cache, the script always loads the full dataset. If the word doesn't exist there either, you can store it in a list of ignored words. Whenever a word appears on this list, has() returns false without loading the whole dataset.

Set it up like this:

<?php 

use BitAndBlack\Cache\Cache;
use BitAndBlack\File\CSV;
use BitAndBlack\Words;
use BitAndBlack\IgnoredWords\IgnoredWords;

// Full dataset, cache file and the list of ignored words
$datasetFull = 'data/words.csv';
$datasetCached = 'data/words-cached.csv';
$datasetIgnored = 'data/words-ignored.csv';

$fullLoader = new CSV($datasetFull, 0);
$cacheLoader = new CSV($datasetCached, 0);
$ignoredLoader = new CSV($datasetIgnored, 0);

// Pass the cache as second and the ignored words as third argument
$words = new Words(
    $fullLoader,
    new Cache($cacheLoader),
    new IgnoredWords($ignoredLoader)
);
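
To make the lookup order easier to follow, here is a minimal sketch of how a has() check could pass through the ignore list, the cache and the full dataset. The class LookupSketch and its properties are hypothetical and written for PHP 8; it is not the library's implementation. In particular, the sketch adds unknown words to the ignore list automatically, which is an assumption made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical illustration of the lookup order, not the library's actual code.
 */
final class LookupSketch
{
    /** Counts how often the full dataset had to be consulted. */
    public int $fullLoads = 0;

    /**
     * @param array<string, string> $full    stand-in for the full dataset
     * @param array<string, string> $cached  stand-in for the cache file
     * @param array<string, true>   $ignored stand-in for the ignored words file
     */
    public function __construct(
        private array $full,
        private array $cached = [],
        private array $ignored = [],
    ) {
    }

    public function has(string $word): bool
    {
        // 1. Ignored words are rejected right away.
        if (isset($this->ignored[$word])) {
            return false;
        }

        // 2. Cached words are confirmed without touching the full dataset.
        if (isset($this->cached[$word])) {
            return true;
        }

        // 3. Only now is the full dataset consulted.
        ++$this->fullLoads;

        if (isset($this->full[$word])) {
            $this->cached[$word] = $this->full[$word];

            return true;
        }

        // Assumption: unknown words are remembered so the next check is cheap.
        $this->ignored[$word] = true;

        return false;
    }
}

$sketch = new LookupSketch(['Hose' => 'die Hose']);
$firstCheck = $sketch->has('Hoose');  // misspelled: consults the full dataset once
$secondCheck = $sketch->has('Hoose'); // now on the ignore list: no further load
```

Both checks return false, but the full dataset is only consulted for the first one. Repeated requests for words that don't exist, such as typos, therefore stay cheap.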

Help

If you have any questions, feel free to contact us at hello@bitandblack.com.

More information about Bit&Black can be found at www.bitandblack.com.