simsoft / data-flow
A simple ETL pipeline data flow.
1.0.10
2025-03-06 22:06 UTC
Requires
- php: >=8.2
- league/flysystem: ^3.29
- phpoffice/phpspreadsheet: *
- symfony/cache: ^7.2
Requires (Dev)
- phpmd/phpmd: >=2
- phpstan/phpstan: >=2
- phpunit/phpunit: >=11
This package is auto-updated.
Last update: 2025-03-06 22:07:59 UTC
README
Simple ETL Pipeline data flow.
Install
composer require simsoft/data-flow
Basic Usage
Example using extract, transform and load.
require "vendor/autoload.php";

use Simsoft\DataFlow\DataFlow;

(new DataFlow())
    ->from([1, 2, 3])
    ->transform(function($num) {
        return $num * 2;
    })
    ->load(function($num) {
        echo $num . PHP_EOL;
    })
    ->run();

// Output:
// 2
// 4
// 6
Limit
Limit the number of items that reach the output.
require "vendor/autoload.php";

use Simsoft\DataFlow\DataFlow;

(new DataFlow())
    ->from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    ->transform(function($num) {
        return $num * 2;
    })
    ->limit(5) // output only the first 5 items.
    ->load(function($num) {
        echo $num . PHP_EOL;
    })
    ->run();

// Output:
// 2
// 4
// 6
// 8
// 10
Filter
The filter method keeps only the items for which the callback returns true.
require "vendor/autoload.php";

use Simsoft\DataFlow\DataFlow;

(new DataFlow())
    ->from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    ->filter(function($num) {
        // The callback should return a bool.
        // In this case, keep even numbers only.
        return $num % 2 === 0;
    })
    ->load(function($num) {
        echo $num . PHP_EOL;
    })
    ->run();

// Output:
// 2
// 4
// 6
// 8
// 10
Chunk
Split the data into smaller, manageable parts of a fixed size. The last chunk may hold fewer items.
(new DataFlow())
    ->from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    ->chunk(3) // set chunk size
    ->load(function(array $chunk, $key) {
        echo $key . '=' . json_encode($chunk, JSON_THROW_ON_ERROR) . PHP_EOL;
    })
    ->run();

// Output:
// 0=[1,2,3]
// 1=[4,5,6]
// 2=[7,8,9]
// 3=[10]
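To make the chunking behaviour concrete, here is a minimal plain-PHP sketch of the same grouping using a generator. It is only an illustration of the idea and is not the library's implementation; `chunkIterable` is a hypothetical helper name that does not exist in simsoft/data-flow.

```php
<?php

declare(strict_types=1);

/**
 * Lazily group any iterable into arrays of at most $size items.
 * Hypothetical helper for illustration; not part of simsoft/data-flow.
 */
function chunkIterable(iterable $items, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $chunk = [];
    foreach ($items as $item) {
        $chunk[] = $item;
        if (count($chunk) === $size) {
            yield $chunk; // a full chunk is ready
            $chunk = [];
        }
    }

    // Emit the final, possibly shorter, chunk.
    if ($chunk !== []) {
        yield $chunk;
    }
}

foreach (chunkIterable(range(1, 10), 3) as $key => $chunk) {
    echo $key . '=' . json_encode($chunk, JSON_THROW_ON_ERROR) . PHP_EOL;
}

// Output:
// 0=[1,2,3]
// 1=[4,5,6]
// 2=[7,8,9]
// 3=[10]
```

Because the generator yields one chunk at a time, only a single chunk is held in memory, which is the usual reason to chunk a large data source before loading it.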
Mapping
The map method converts each record to another format: it can rename keys and compute new fields via callbacks.
(new DataFlow())
    ->from([
        ['First Name' => 'John', 'Last Name' => 'Doe', 'age' => 20],
        ['First Name' => 'Jane', 'Last Name' => 'Doe', 'age' => 30],
        ['First Name' => 'John', 'Last Name' => 'Smith', 'age' => 50],
        ['First Name' => 'Jane', 'Last Name' => 'Smith', 'age' => 60],
    ])
    ->map([
        // rename the key
        'first_name' => 'First Name',
        'last_name' => 'Last Name',

        // customise data via callback method.
        'full_name' => fn($data) => $data['first_name'] . ' ' . $data['last_name'],
        'senior' => fn($data) => $data['age'] > 30 ? 'Yes' : 'No',
    ])
    ->load(function($data) {
        echo $data['full_name'] . ' is ' . $data['age'] . ' years old. ' . $data['senior'] . PHP_EOL;
    })
    ->run();

// Output:
// John Doe is 20 years old. No
// Jane Doe is 30 years old. No
// John Smith is 50 years old. Yes
// Jane Smith is 60 years old. Yes
Flow Continuation
Connect flows into a chain by passing one flow as the source of another.
$flow1 = (new DataFlow())
    ->from([1, 2, 3])
    ->transform(function($num) {
        return $num * 2;
    });

(new DataFlow())
    ->from($flow1) // connect flow1 to flow2.
    ->transform(function($num) {
        return $num * 3;
    })
    ->load(function($num) {
        echo $num . PHP_EOL;
    })
    ->run();

// Output:
// 6
// 12
// 18
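The chaining idea can be sketched in plain PHP with generators, where each stage consumes the previous one as its source. This is an assumption-laden illustration of the concept, not how the library is necessarily implemented; `transformStage` is a hypothetical helper name.

```php
<?php

declare(strict_types=1);

/**
 * Lazily apply $fn to every item of $source, preserving keys.
 * Hypothetical helper for illustration; not part of simsoft/data-flow.
 */
function transformStage(iterable $source, callable $fn): Generator
{
    foreach ($source as $key => $item) {
        yield $key => $fn($item);
    }
}

// The first stage doubles each number.
$stage1 = transformStage([1, 2, 3], fn (int $n): int => $n * 2);

// The second stage uses the first as its source and triples each value.
$stage2 = transformStage($stage1, fn (int $n): int => $n * 3);

foreach ($stage2 as $num) {
    echo $num . PHP_EOL;
}

// Output:
// 6
// 12
// 18
```

Nothing is computed until the final loop pulls items through, so each value passes through both stages one at a time.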
Advanced Usage
- Using Closure
- Useful Processors
- Customized ETL Processor
- Create Reusable Data Flow
- Using Payload
- Macro & Mixin
License
Simsoft DataFlow is licensed under the MIT License. See the LICENSE file for details.