jity / tag-generator
Generate Tags from a given text.
v0.2.1
2012-11-24 10:38 UTC
Requires
- php: >=5.3.0
- doctrine/common: >=2.3.0
This package is not auto-updated.
Last update: 2025-02-09 05:47:40 UTC
README
About
This bundle is part of the Jity project. With the help of this generator you will be able to transform any text into a useful collection of tags.
Installation
Add JityTagGenerator to your composer.json:
{
    "require": {
        "jity/tag-generator": "dev-master"
    }
}
Download the bundle:
php composer.phar update
Register the JityTagGenerator bundle in your AppKernel.php:
public function registerBundles()
{
    $bundles = array(
        ...
        new Jity\TagGeneratorBundle\JityTagGeneratorBundle(),
        ...
    );
    ...
}
Usage
This is a simple example of how to use the TagGenerator.
use Jity\Tag\TagGenerator,
    Jity\Tag\Filter\Score,
    Jity\Tag\Filter\ScoreGroup,
    Jity\Tag\Filter\Length,
    Jity\Tag\Filter\Occurrence,
    Jity\Tag\Filter\Dictionary,
    Jity\Tag\Filter\Capitalized,
    Jity\Tag\Filter\Uppercase,
    Jity\Tag\Filter\Camelcase,
    Jity\Tag\Filter\Regex;
/* ------------------------------------------------------ */
/* - Configuration */
/* ------------------------------------------------------ */
// Instantiate a new Generator
$generator = new TagGenerator();
// Configure all Filters
$generator
    /* Remove words shorter than the configured minimum length */
    ->addFilter(
        new Length(1, true, array(
            'min' => 2
        ))
    )
    /* Remove the most useless words (stop-words) from the collection */
    ->addFilter(
        new Dictionary(1, true, array(
            'match' => true,
            'casesensitive' => false,
            'dictionaries' => array(
                'german' => array(
                    'adjektive',
                    'verben',
                    'klein',
                    'fixwords'
                )
            )
        ))
    )
    /* Score occurrence of remaining words */
    ->addFilter(
        new Occurrence(5)
    )
    /* Score uppercased words */
    ->addFilter(new Uppercase(15))
    /* Score camelcased words */
    ->addFilter(new Camelcase(15))
    /* Score capitalized words */
    ->addFilter(new Capitalized(5));
// Retrieve the collection of tags
$tags = $generator->getTags('Lorem ipsum etc');
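To make the scoring idea more concrete, here is a minimal, self-contained sketch in plain PHP of what a chain like the one above might do conceptually: drop short words and stop-words, score the rest by occurrence, and add a bonus for fully uppercase words. It does not use the TagGenerator classes at all. The function name `sketchTags`, the weights (5 per occurrence, 15 for uppercase) and the stop-word list are assumptions made for illustration, not the library's actual behaviour.

```php
<?php
// Hypothetical illustration only: not part of jity/tag-generator.
function sketchTags($text, array $stopWords, $minLength = 2)
{
    // Split on anything that is not a letter or digit (Unicode-aware).
    $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return array(); // e.g. the input was not valid UTF-8
    }

    // Lower-cased stop-words as array keys for fast lookups.
    $stop = array_flip(array_map('strtolower', $stopWords));

    // Count the occurrences of every word that survives the filters.
    $counts = array();
    foreach ($words as $word) {
        if (strlen($word) < $minLength || isset($stop[strtolower($word)])) {
            continue;
        }
        $counts[$word] = isset($counts[$word]) ? $counts[$word] + 1 : 1;
    }

    // Turn counts into scores (weights are assumed values).
    $scores = array();
    foreach ($counts as $word => $count) {
        $word = (string) $word; // numeric words become integer keys
        $score = 5 * $count;
        // Bonus for words written entirely in uppercase letters.
        if (strtoupper($word) === $word && strtolower($word) !== $word) {
            $score += 15;
        }
        $scores[$word] = $score;
    }

    arsort($scores); // highest score first
    return $scores;
}

$sketch = sketchTags('PHP is the best and PHP is fun', array('is', 'the', 'and'));
```

With this input, `PHP` ends up with a score of 25 (two occurrences plus the uppercase bonus), while `best` and `fun` score 5 each.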
Development
Write your own filters
To write your own filter, implement Jity\Tag\Filter\FilterInterface or extend Jity\Tag\Filter\AbstractFilter. A good, simple example is the Jity\Tag\Filter\Uppercase filter; have a look at its source.
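The methods required by Jity\Tag\Filter\FilterInterface are not shown in this README, so the following is only a sketch of the general shape such a filter could take. `SketchFilterInterface`, its `apply()` method and the `HyphenBonus` class are hypothetical stand-ins defined here so that the example runs on its own; check the real interface and the Uppercase filter in the package source before writing an actual filter.

```php
<?php
// Hypothetical stand-in; NOT the real Jity\Tag\Filter\FilterInterface.
interface SketchFilterInterface
{
    /**
     * @param array $scores map of word => current score
     * @return array map of word => adjusted score
     */
    public function apply(array $scores);
}

// Example filter: adds a fixed bonus to every word containing a hyphen.
class HyphenBonus implements SketchFilterInterface
{
    private $bonus;

    public function __construct($bonus)
    {
        $this->bonus = $bonus;
    }

    public function apply(array $scores)
    {
        foreach ($scores as $word => $score) {
            if (strpos((string) $word, '-') !== false) {
                $scores[$word] = $score + $this->bonus;
            }
        }
        return $scores;
    }
}

$filter = new HyphenBonus(10);
$result = $filter->apply(array('open-source' => 5, 'php' => 5));
```

Here `open-source` is raised to 15 while `php` keeps its score of 5. Keeping the weight as a constructor argument mirrors the way the filters in the usage example take a number as their first argument, although what that number means in the real classes is an assumption.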
Recompile a dictionary
Go to resources/dictionaries/LANG (the directory that contains the source folder) and run:
for i in stopwords fixwords adjektive verben compound klein worte; do cat source/${i}*.txt | ../compiler.sh "$i"; done