jotaelesalinas / php-mapreduce
A local implementation of the map-reduce strategy in PHP
pkg:composer/jotaelesalinas/php-mapreduce
Requires
- php: >=8.0
Requires (Dev)
- phpunit/phpunit: ^9.5
- squizlabs/php_codesniffer: ^3.7
This package is auto-updated.
Last update: 2025-10-12 20:50:25 UTC
README
PHP PSR-4 compliant library to easily do non-distributed local map-reduce.
Install
Via Composer
$ composer require jotaelesalinas/php-mapreduce
Basic usage
require_once __DIR__ . '/vendor/autoload.php';

$source = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
$mapper = fn($item) => $item * 2;
$reducer = fn($carry, $item) => ($carry ?? 0) + $item;

$result = MapReduce\MapReduce::create()
    ->setInput($source)
    ->setMapper($mapper)
    ->setReducer($reducer)
    ->run();

print_r($result);
The output is:
Array
(
[0] => 110
)
Filters
$even_numbers = fn($item) => $item % 2 === 0;
$greater_than_10 = fn($item) => $item > 10;

$result = MapReduce\MapReduce::create([
        "input" => $source,
        "mapper" => $mapper,
        "reducer" => $reducer,
    ])
    // only even numbers are passed to the mapper function
    ->setPreFilter($even_numbers)
    // only numbers greater than 10 are passed to the reducer function
    ->setPostFilter($greater_than_10)
    ->run();

print_r($result);
The output is:
Array
(
[0] => 48
)
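To make the order of the stages explicit, here is a plain-PHP sketch of what the filtered example computes, using only built-in array functions. It is not the library's implementation. It assumes the stages run as pre-filter, mapper, post-filter, reducer, and that the reducer's carry starts as null, as the reducers in this README suggest.

```php
<?php
$source = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

$even_numbers    = fn($item) => $item % 2 === 0;
$mapper          = fn($item) => $item * 2;
$greater_than_10 = fn($item) => $item > 10;
$reducer         = fn($carry, $item) => ($carry ?? 0) + $item;

$filtered = array_filter($source, $even_numbers);    // 2, 4, 6, 8, 10
$mapped   = array_map($mapper, $filtered);           // 4, 8, 12, 16, 20
$kept     = array_filter($mapped, $greater_than_10); // 12, 16, 20
$total    = array_reduce($kept, $reducer);           // carry starts as null

echo $total, PHP_EOL; // 48
```

Unlike this sketch, which builds three intermediate arrays, the library accepts any iterable, so it can be fed from a generator.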
Groups
Group by the value of a field (valid for arrays and objects):
$source = [
    [ "first_name" => "Susanna", "last_name" => "Connor",  "member" => "y", "age" => 20],
    [ "first_name" => "Adrian",  "last_name" => "Smith",   "member" => "n", "age" => 22],
    [ "first_name" => "Mike",    "last_name" => "Mendoza", "member" => "n", "age" => 24],
    [ "first_name" => "Linda",   "last_name" => "Duguin",  "member" => "y", "age" => 26],
    [ "first_name" => "Bob",     "last_name" => "Svenson", "member" => "n", "age" => 28],
    [ "first_name" => "Nancy",   "last_name" => "Potier",  "member" => "y", "age" => 30],
    [ "first_name" => "Pete",    "last_name" => "Adams",   "member" => "n", "age" => 32],
    [ "first_name" => "Susana",  "last_name" => "Zommers", "member" => "y", "age" => 34],
    [ "first_name" => "Adrian",  "last_name" => "Deville", "member" => "n", "age" => 36],
    [ "first_name" => "Mike",    "last_name" => "Cole",    "member" => "n", "age" => 38],
    [ "first_name" => "Mike",    "last_name" => "Angus",   "member" => "n", "age" => 40],
];

// mapper does nothing
$mapper = fn($x) => $x;

// number of persons and sum of ages
$reduceAgeSum = function ($carry, $item) {
    if (is_null($carry)) {
        return [
            'count' => 1,
            'age_sum' => $item['age'],
        ];
    }

    $count = $carry['count'] + 1;
    $age_sum = $carry['age_sum'] + $item['age'];

    return compact('count', 'age_sum');
};

$result = MapReduce\MapReduce::create([
        "input" => $source,
        "mapper" => $mapper,
        "reducer" => $reduceAgeSum,
    ])
    // group by field 'member'
    ->setGroupBy('member')
    ->run();

print_r($result);
The output is:
Array
(
[y] => Array
(
[count] => 4
[age_sum] => 110
)
[n] => Array
(
[count] => 7
[age_sum] => 220
)
)
Group by a custom value generated from each item:
$closestTen = fn($x) => floor($x['age'] / 10) * 10;

$result = MapReduce\MapReduce::create([
        "input" => $source,
        "mapper" => $mapper,
        "reducer" => $reduceAgeSum,
    ])
    // group by age ranges of 10
    ->setGroupBy($closestTen)
    ->run();

print_r($result);
The output is:
Array
(
[20] => Array
(
[count] => 5
[age_sum] => 120
)
[30] => Array
(
[count] => 5
[age_sum] => 170
)
[40] => Array
(
[count] => 1
[age_sum] => 40
)
)
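The two grouping forms above (a field name or a callable) can be pictured with a small plain-PHP sketch. The helper `groupAndReduce()` is hypothetical and is not part of php-mapreduce; it only illustrates the idea of keeping one carry per group key, under the assumption that each group's carry starts as null. It uses `intdiv()` rather than `floor()` so that the group keys are integers.

```php
<?php
// Hypothetical helper, not part of php-mapreduce: groups items by a field
// name or by a callable, and keeps a separate reducer carry per group.
function groupAndReduce(iterable $items, string|callable $groupBy, callable $reducer): array
{
    $result = [];
    foreach ($items as $item) {
        // A string is treated as a field name (array key or object property).
        $key = is_string($groupBy)
            ? (is_array($item) ? $item[$groupBy] : $item->$groupBy)
            : $groupBy($item);
        // The first item of each group sees a null carry.
        $result[$key] = $reducer($result[$key] ?? null, $item);
    }
    return $result;
}

$people = [
    ['name' => 'Susanna', 'member' => 'y', 'age' => 20],
    ['name' => 'Adrian',  'member' => 'n', 'age' => 22],
    ['name' => 'Nancy',   'member' => 'y', 'age' => 30],
];

$countAndSum = fn($carry, $item) => [
    'count'   => ($carry['count'] ?? 0) + 1,
    'age_sum' => ($carry['age_sum'] ?? 0) + $item['age'],
];

$decade = fn($person) => intdiv($person['age'], 10) * 10;

$byMember = groupAndReduce($people, 'member', $countAndSum); // keys: y, n
$byDecade = groupAndReduce($people, $decade, $countAndSum);  // keys: 20, 30

print_r($byMember);
print_r($byDecade);
```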
Input
MapReduce accepts any iterable as input: arrays and Traversable objects such as generators.
This is very handy when reading from big files that do not fit in memory, because a generator yields one item at a time instead of loading the whole file.
$result = MapReduce\MapReduce::create([
        "mapper" => $mapper,
        "reducer" => $reducer,
    ])
    ->setInput(csvReadGenerator('myfile.csv'))
    ->run();
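The README does not define `csvReadGenerator()`. A minimal sketch of what such a function might look like follows, assuming a comma-separated file whose first line is a header row. The name, signature, and row format here are illustrative assumptions, not the library's API.

```php
<?php
// Hypothetical implementation: csvReadGenerator() is not defined in this README.
// Yields one associative array per data row, keyed by the header row,
// so only one row is held in memory at a time.
function csvReadGenerator(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // CSV options are passed explicitly to avoid relying on defaults.
        $header = fgetcsv($handle, null, ',', '"', '\\');
        if ($header === false) {
            return; // empty file: nothing to yield
        }
        while (($row = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv() returns [null] for a blank line
            }
            // array_combine() throws a ValueError if the field counts differ.
            yield array_combine($header, $row);
        }
    } finally {
        fclose($handle); // also runs if the consumer stops iterating early
    }
}
```

Every value read this way is a string, so a mapper would typically cast numeric fields before reducing.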
Multiple inputs can be specified, passing several arguments to setInput(), as long as all of them are iterable:
$result = MapReduce\MapReduce::create([
        "mapper" => $mapper,
        "reducer" => $reducer,
    ])
    ->setInput($arrayData, csvReadGenerator('myfile.csv'))
    ->run();
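How the library combines several inputs internally is not documented here. One plausible reading is that they are consumed one after another as a single stream, which can be sketched in plain PHP as follows. The helper `chainInputs()` is hypothetical.

```php
<?php
// Hypothetical helper, not part of php-mapreduce: yields the items of each
// iterable in turn, so arrays and generators can be mixed freely.
function chainInputs(iterable ...$inputs): Generator
{
    foreach ($inputs as $input) {
        foreach ($input as $item) {
            // Yielding without a key assigns fresh integer keys,
            // so keys from different inputs never collide.
            yield $item;
        }
    }
}

$numbers = (function () {
    yield 4;
    yield 5;
})();

$all = iterator_to_array(chainInputs([1, 2, 3], $numbers), false);
// $all is [1, 2, 3, 4, 5]
```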
Output
MapReduce can be configured to write the final data to one or more destinations.
Each destination has to be a Generator:
$result = MapReduce\MapReduce::create([
        "mapper" => $mapper,
        "reducer" => $reducer,
    ])
    ->setOutput(csvWriteGenerator('results.csv'))
    ->run();
Multiple outputs can be specified as well:
$result = MapReduce\MapReduce::create([
        "mapper" => $mapper,
        "reducer" => $reducer,
    ])
    ->setOutput(csvWriteGenerator('results.csv'), consoleGenerator())
    ->run();
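The README does not define `csvWriteGenerator()` either, nor does it say how results reach an output generator. The sketch below assumes the usual PHP pattern for a "sink" generator: the caller pushes each value in with `send()`. The function body, the use of null as an end-of-data signal, and the row format (a flat array of fields) are all assumptions made for illustration.

```php
<?php
// Hypothetical implementation: csvWriteGenerator() is not defined in this README.
// Assumes the caller pushes each row into the generator with send().
function csvWriteGenerator(string $path): Generator
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    try {
        while (true) {
            $row = yield; // suspend until the caller sends a value
            if ($row === null) {
                break; // null is used here as an end-of-data signal
            }
            fputcsv($handle, $row, ',', '"', '\\');
        }
    } finally {
        fclose($handle);
    }
}

// Usage: send rows one by one, then null to finish and close the file.
$path = tempnam(sys_get_temp_dir(), 'mr_');
$out = csvWriteGenerator($path);
$out->send(['member', 'count']);
$out->send(['y', 4]);
$out->send(null);
```

The file is opened lazily: a generator body does not start running until the first `send()` (or other iteration call).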
To make working with input and output generators easier, the package jotaelesalinas/php-generators is recommended, but it is not mandatory.
You can find more elaborate examples in the examples folder.
Change log
Please see CHANGELOG for more information on what has changed recently.
Testing
$ composer test
Contributing
Please see CONTRIBUTING and CONDUCT for details.
Security
If you discover any security-related issues, please DM me at @jotaelesalinas instead of using the issue tracker.
To do
- Add events to help see progress in large batches
- Add docs
- Insurance example
  - adapt to new library
  - add insured values
  - improve kml output (info, markers)
Credits
License
The MIT License (MIT). Please see License File for more information.