camspiers/silverstripe-classifierbridge

0.1.1 2014-07-23 01:59 UTC


This library helps integrate classification services within SilverStripe sites.

Installation (with composer)

$ composer require camspiers/silverstripe-classifierbridge:dev-master

Usage

Integration via DataList and DataObject

  1. Implement the Document interface on your DataObject

use Camspiers\StatisticalClassifier\SilverStripe\Document;

class MyDataObject extends DataObject implements Document
{

	private static $db = array(
		'Content' => 'Text',
		'Spam' => 'Boolean'
	);

	public function getCategories()
	{
		return array($this->Spam ? 'spam' : 'ham');
	}

	public function getDocument()
	{
		return $this->Content;
	}

}

  2. Use a DataList to retrieve the existing DataObjects and classify a new DataObject

use Camspiers\StatisticalClassifier\Classifier\ComplementNaiveBayes;
use Camspiers\StatisticalClassifier\SilverStripe\DataSource;
use Camspiers\StatisticalClassifier\SilverStripe\Document;

// This DataObject could have just been populated via a form (e.g. $form->saveInto($myDataObject))
$dataObjectToClassify = new MyDataObject(
	array(
		'Content' => 'Some content'
	)
);

try {
	// A DataList is passed into a DataSource and then passed into the classifier
	$classifier = new ComplementNaiveBayes(new DataSource(MyDataObject::get()));
	if ($classifier->is('spam', $dataObjectToClassify->getDocument())) {
		// The document is spam
		// Perhaps set Spam = true on the DataObject and save it
	} else {
		// The document isn't spam
	}
} catch (Exception $e) {
	// Do something with the exception
}
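
The two Document methods above give each record a training label (getCategories) and a training text (getDocument). The following is a minimal plain-PHP sketch of that pairing, runnable without SilverStripe. The names DocumentLike, FakeRecord and groupByCategory are hypothetical stand-ins invented for illustration; this is not the bridge's DataSource implementation.

```php
<?php
// Hypothetical stand-in for the bridge's Document interface,
// so the sketch runs without SilverStripe or the classifier library.
interface DocumentLike
{
    public function getCategories();
    public function getDocument();
}

// Hypothetical stand-in for MyDataObject with the same two methods.
class FakeRecord implements DocumentLike
{
    private $content;
    private $spam;

    public function __construct($content, $spam)
    {
        $this->content = $content;
        $this->spam = $spam;
    }

    public function getCategories()
    {
        return array($this->spam ? 'spam' : 'ham');
    }

    public function getDocument()
    {
        return $this->content;
    }
}

// Illustration only: collect each record's text under every category it reports.
function groupByCategory(array $records)
{
    $grouped = array();
    foreach ($records as $record) {
        foreach ($record->getCategories() as $category) {
            $grouped[$category][] = $record->getDocument();
        }
    }
    return $grouped;
}

// Made-up sample records.
$training = groupByCategory(array(
    new FakeRecord('Buy cheap pills now', true),
    new FakeRecord('Meeting moved to 3pm', false),
    new FakeRecord('Win a free prize', true),
));

print_r($training);
```

Because getCategories returns an array, a record could in principle report more than one category; the sketch handles that by looping over every returned value.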

Integration via SQLQuery

Using SQLQuery can reduce memory usage and execution time, because it avoids creating a DataObject for each record.

use Camspiers\StatisticalClassifier\Classifier\ComplementNaiveBayes;
use Camspiers\StatisticalClassifier\DataSource\Grouped;
use Camspiers\StatisticalClassifier\SilverStripe\SQLQueryDataSource;
use Camspiers\StatisticalClassifier\SilverStripe\Document;

$spamQuery = new SQLQuery("Content, Spam", "MyDataObject", "Spam = 1");
$hamQuery = new SQLQuery("Content, Spam", "MyDataObject", "Spam = 0");

try {
	// Create the classifier by using a Grouped data source
	$classifier = new ComplementNaiveBayes(
		new Grouped(
			array(
				new SQLQueryDataSource("spam", $spamQuery, "Content"),
				new SQLQueryDataSource("ham", $hamQuery, "Content")
			)
		)
	);

	if ($classifier->is('spam', "Some content to classify")) {
		// The document is spam
		// Perhaps set Spam = true on the DataObject and save it
	} else {
		// The document isn't spam
	}
} catch (Exception $e) {
	// Do something with the exception
}
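
As a rough illustration of working with raw rows instead of DataObjects, the sketch below pulls a single column out of plain associative arrays. It assumes the third argument to SQLQueryDataSource names the column holding the document text; the helper columnValues and the sample rows are hypothetical, and none of this is the library's own code.

```php
<?php
// Made-up rows shaped like the associative arrays a raw SQL query returns.
$rows = array(
    array('Content' => 'Buy cheap pills now', 'Spam' => 1),
    array('Content' => 'Win a free prize', 'Spam' => 1),
);

// Hypothetical helper: collect one column's values without building any objects.
function columnValues(array $rows, $column)
{
    $values = array();
    foreach ($rows as $row) {
        if (isset($row[$column])) {
            $values[] = $row[$column];
        }
    }
    return $values;
}

$spamDocuments = columnValues($rows, 'Content');

print_r($spamDocuments);
```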

See PHP Classifier for documentation on caching and more advanced topics.