torian257x/ai-php-rubix-wrap

AI PHP is a wrapper for Rubix ML

0.9.1.1 2021-08-11 04:22 UTC

Last update: 2024-03-11 09:31:31 UTC


README

A wrapper for Rubix ML that makes it very approachable.

Example:

    $report = RubixService::train($data, 'column_with_label');

Where column_with_label is the key, in each row of the multidimensional array $data, that holds the value you want to predict.
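To illustrate what "the key that holds the value you want to predict" means, here is a minimal sketch of how a label column can be separated from the remaining feature columns. The function separateLabel() is a hypothetical helper written for this illustration; it is not the wrapper's actual code and may differ from what RubixService::train() does internally.

```php
<?php
/**
 * Hypothetical helper (not part of the wrapper): separate the label
 * column from the remaining feature columns of each row.
 */
function separateLabel(array $data, string $labelKey): array
{
    $samples = [];
    $labels = [];
    foreach ($data as $row) {
        $labels[] = $row[$labelKey];   // the value to predict
        unset($row[$labelKey]);        // what is left are the features
        $samples[] = $row;
    }
    return [$samples, $labels];
}

[$samples, $labels] = separateLabel([
    ['space_m2' => 10, 'price' => 100],
    ['space_m2' => 20, 'price' => 200],
], 'price');

var_export($samples); // rows that now contain only 'space_m2'
var_export($labels);  // the two prices, 100 and 200
```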

Let's make a simple example:

    $apartment_data = [
        ['space_m2' =>  10, 'price' => 100],
        ['space_m2' =>  20, 'price' => 200],
        ['space_m2' =>  30, 'price' => 300],
        ['space_m2' =>  40, 'price' => 400],
        //...
        ['space_m2' => 280, 'price' => 2800],
        ['space_m2' => 290, 'price' => 2900],
        ['space_m2' => 300, 'price' => 3000],
    ];

    // Train on the rows above; 'price' is the column to predict.
    $report = RubixService::train($apartment_data, 'price');

    var_export($report);

    /*
      array (
        'mean absolute error' => 68.88888888888889,
        ...
        'r squared' => 0.9796739130434783,
        ...
      )
    */

    // Predict the price of an apartment with 250 m2.
    $prediction = RubixService::predict(['space_m2' => 250]);
    // $prediction ~2440
    

See full example of above code here

Reports / Errors / Accuracy

Mean absolute error is the error you can expect on average. So when predicting the price of an apartment from its space, you would be off by about $68.89 on average.

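r squared, on the other hand, gives a feeling for how good the algorithm is and reads roughly like a percentage: the 0.98 above means the model explains about 98% of the variation in price. A high r squared means it works well. For categorical labels such as cat or dog, a different report is returned.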
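To make the two numbers concrete, here is a minimal sketch that computes both by hand on four made-up prices. The functions meanAbsoluteError() and rSquared() are hypothetical names used only for this illustration; they are not part of this wrapper or of Rubix ML, which generates its reports itself.

```php
<?php
/** Mean absolute error: the average of |actual - predicted|. */
function meanAbsoluteError(array $actual, array $predicted): float
{
    $sum = 0.0;
    foreach ($actual as $i => $value) {
        $sum += abs($value - $predicted[$i]);
    }
    return $sum / count($actual);
}

/**
 * R squared: 1 - (residual sum of squares / total sum of squares).
 * Assumes the actual values are not all identical (total would be 0).
 */
function rSquared(array $actual, array $predicted): float
{
    $mean = array_sum($actual) / count($actual);
    $residual = 0.0;
    $total = 0.0;
    foreach ($actual as $i => $value) {
        $residual += ($value - $predicted[$i]) ** 2;
        $total += ($value - $mean) ** 2;
    }
    return 1.0 - $residual / $total;
}

// Made-up prices, for illustration only.
$actual    = [100, 200, 300, 400];
$predicted = [110, 190, 320, 380];

echo meanAbsoluteError($actual, $predicted), PHP_EOL; // 15
echo rSquared($actual, $predicted), PHP_EOL;          // 0.98
```

Here the predictions miss by 10, 10, 20 and 20, so the mean absolute error is 15. The squared misses add up to 1000 against a total spread of 50000, which gives an r squared of 0.98.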

Estimators / Machine Learning Algorithm

RubixService::train() uses a default estimator (machine learning algorithm) depending on the data. If you want to choose a different estimator, I recommend reading the Rubix ML guide on choosing an estimator.

Note: in Rubix ML, a neural network is called Multilayer Perceptron and linear regression is called Ridge.

By default it uses K-d Neighbors (for classification) or K-d Neighbors Regressor (for regression).

RubixService::train() also accepts transformers.

In detail, RubixService::train() does the following:

  1. shuffles $data
  2. trains against 70% of $data
  3. tests against 30% of $data

You can change that behaviour with the train_part_size argument. For example, to train on 80% and test on 20%, you would call RubixService::train(..., train_part_size: 0.8).
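The shuffle and split steps can be sketched in plain PHP as follows. This is only an illustration of the idea under stated assumptions: shuffleAndSplit() and its $trainPartSize parameter are hypothetical names, the rows are made up, and the wrapper's real implementation (including how it rounds the split point) may differ.

```php
<?php
/**
 * Hypothetical helper (not the wrapper's code): shuffle the rows, then
 * cut them into a training part and a testing part.
 */
function shuffleAndSplit(array $data, float $trainPartSize = 0.7): array
{
    shuffle($data);                                     // 1. shuffle
    $cut = (int) round(count($data) * $trainPartSize);  // size of the training part
    return [
        array_slice($data, 0, $cut),                    // 2. rows to train on
        array_slice($data, $cut),                       // 3. rows to test on
    ];
}

$rows = range(1, 30); // 30 made-up rows, for illustration only

[$train, $test] = shuffleAndSplit($rows);
echo count($train), ' / ', count($test), PHP_EOL; // 21 / 9

[$train, $test] = shuffleAndSplit($rows, 0.8);
echo count($train), ' / ', count($test), PHP_EOL; // 24 / 6
```

Testing on rows the estimator never saw during training is what makes the reported error a fair estimate of how it will do on new data.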