gbelsalvador / data-transformer
Professional PHP library for transforming data between multiple formats
Requires
- php: >=8.0
- phpoffice/phpspreadsheet: ^5.4
Requires (Dev)
- phpunit/phpunit: ^9.5
README
gbelsalvador/data-transformer is a PHP library for reading, transforming, validating, and exporting structured data across multiple formats.
It is designed for practical data workflows:
- import from
CSV,JSON,XML,SQL, orXLSX - normalize data into a consistent PHP array structure
- filter, map, and validate rows through a fluent pipeline
- export the result to another format
The project aims to offer a clean developer experience, predictable behavior, and a solid base for production-oriented data conversion workflows.
Documentation
- French README:
README.fr.md - HTML documentation:
docs/index.html
Why This Library
Most conversion utilities stop at file-to-file export. Data Transformer goes further by adding a transformation pipeline between the reader and writer.
That means you can:
- convert formats
- rename and reshape fields
- filter rows
- validate data before export
- get an execution report with counts and validation errors
Features
- Multi-format support:
CSV,JSON,XML,SQL,XLSX - Fluent transformation pipeline with
read(),filter(),map(),validate(),write() - Structured SQL filters for safer queries
- Execution reporting via
ExecutionResult - Spreadsheet export hardening for CSV/XLSX formula injection
- XML parsing hardened with
LIBXML_NONET - PHPUnit test suite for pipeline behavior and file integrations
- PSR-4 autoloading and PSR-12-friendly code style
Installation
composer require gbelsalvador/data-transformer
Requirements
- PHP
>= 8.0 phpoffice/phpspreadsheet^5.4
Supported Formats
| Format | Read | Write |
|---|---|---|
| CSV | Yes | Yes |
| JSON | Yes | Yes |
| XML | Yes | Yes |
| SQL | Yes | Yes |
| XLSX | Yes | Yes |
Quick Start
use Gbelsalvador\DataTransformer\Core\Transformer; use Gbelsalvador\DataTransformer\Readers\CsvReader; use Gbelsalvador\DataTransformer\Writers\JsonWriter; $transformer = new Transformer(); $result = $transformer->transform( new CsvReader('input.csv'), new JsonWriter('output.json') ); echo $result->rowsRead(); echo $result->rowsWritten();
Fluent Pipeline
The fluent API is the recommended approach for non-trivial workflows.
use Gbelsalvador\DataTransformer\Core\Transformer; use Gbelsalvador\DataTransformer\Readers\CsvReader; use Gbelsalvador\DataTransformer\Writers\JsonWriter; $result = (new Transformer()) ->read(new CsvReader('users.csv')) ->filter(fn (array $row) => ($row['active'] ?? null) === '1') ->map([ 'id' => 'user_id', 'full_name' => fn (array $row) => trim(($row['first_name'] ?? '') . ' ' . ($row['last_name'] ?? '')), 'email' => 'email', 'country' => 'country', ]) ->validate([ 'id' => 'required|integer', 'full_name' => 'required|max:120', 'email' => 'required|email', 'country' => 'in:FR,BE,CH,CA', ]) ->write(new JsonWriter('clean-users.json')); echo $result->rowsRead(); echo $result->rowsWritten(); echo $result->rowsInvalid(); print_r($result->validationErrors());
Core Concepts
Readers
Readers load data from a source and return a normalized PHP array:
[
['column1' => 'value1', 'column2' => 'value2'],
['column1' => 'value3', 'column2' => 'value4'],
]
Available readers:
CsvReaderJsonReaderXmlReaderSqlReaderXlsxReader
Writers
Writers receive normalized rows and persist them to a target format.
Available writers:
CsvWriterJsonWriterXmlWriterSqlWriterXlsxWriter
ExecutionResult
Every completed transformation returns an ExecutionResult object with runtime information:
rowsRead()rowsWritten()rowsFilteredOut()rowsInvalid()errorCount()duration()validationErrors()
Usage Examples
CSV to JSON
use Gbelsalvador\DataTransformer\Core\Transformer; use Gbelsalvador\DataTransformer\Readers\CsvReader; use Gbelsalvador\DataTransformer\Writers\JsonWriter; $result = (new Transformer())->transform( new CsvReader('products.csv', delimiter: ';'), new JsonWriter('products.json') );
SQL to XLSX
use Gbelsalvador\DataTransformer\Core\Transformer; use Gbelsalvador\DataTransformer\Readers\SqlReader; use Gbelsalvador\DataTransformer\Writers\XlsxWriter; $pdo = new PDO('mysql:host=localhost;dbname=my_database', 'user', 'password'); $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); $result = (new Transformer())->transform( new SqlReader( pdo: $pdo, tableName: 'users', columns: ['id', 'name', 'email', 'created_at'], filters: [ 'active' => 1, 'created_at' => [ 'operator' => '>=', 'value' => '2026-01-01', ], ] ), new XlsxWriter('active-users.xlsx', sheetName: 'Users') );
JSON to XML
use Gbelsalvador\DataTransformer\Core\Transformer; use Gbelsalvador\DataTransformer\Readers\JsonReader; use Gbelsalvador\DataTransformer\Writers\XmlWriter; $result = (new Transformer())->transform( new JsonReader('catalog.json'), new XmlWriter('catalog.xml', rootElement: 'catalog', rowElement: 'item') );
Validation Rules
The current validation layer supports:
requiredemailnumericintegerbooleandatemax:<length>in:value1,value2,value3same:other_field
Example:
$transformer->validate([ 'email' => 'required|email', 'status' => 'required|in:active,inactive,pending', 'name' => 'max:120', ]);
SQL Safety
SqlReader supports two approaches:
filterswhereClause+whereParams
The recommended approach is filters, because it builds parameterized conditions from a structured array.
new SqlReader( pdo: $pdo, tableName: 'users', filters: [ 'active' => 1, 'role' => [ 'operator' => 'in', 'value' => ['admin', 'editor'], ], ] );
Supported structured operators:
=!=>>=<<=likeinis_nullis_not_null
Security Notes
The library includes several basic hardening measures:
- SQL identifiers are validated before query construction
- CSV and XLSX exports neutralize formula-like values
- XML reading uses
LIBXML_NONETto reduce external entity/network risks
These protections help reduce common issues, but you should still treat file paths, SQL configuration, and user-provided inputs as untrusted data in your application layer.
API Overview
Transformer
$transformer = new Transformer(); $transformer->read(ReaderInterface $reader); $transformer->filter(callable $callback); $transformer->map(callable|array $mapping); $transformer->validate(array $rules); $result = $transformer->write(WriterInterface $writer);
Reader Signatures
new CsvReader( string $filePath, string $delimiter = ',', bool $hasHeader = true, string $enclosure = '"', string $escape = '\\' ); new JsonReader( string $filePath, bool $assoc = true ); new XmlReader( string $filePath, string $rootElement = 'root' ); new SqlReader( PDO $pdo, string $tableName, array $columns = ['*'], ?string $whereClause = null, array $whereParams = [], array $filters = [] ); new XlsxReader( string $filePath, ?string $sheetName = null, bool $hasHeader = true );
Writer Signatures
new CsvWriter( string $filePath, string $delimiter = ',', string $enclosure = '"', bool $includeHeader = true ); new JsonWriter( string $filePath, int $flags = JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE ); new XmlWriter( string $filePath, string $rootElement = 'root', string $rowElement = 'row' ); new SqlWriter( PDO $pdo, string $tableName, bool $truncateFirst = false ); new XlsxWriter( string $filePath, ?string $sheetName = null );
Testing
Install development dependencies:
composer install
Run the test suite:
composer test
Current coverage includes:
- pipeline behavior
- validation behavior
- CSV/JSON/XML integration flows
- SQL reader/writer tests when
pdo_sqliteis available
Note: SQL integration tests are skipped automatically if the pdo_sqlite driver is not enabled in your PHP environment.
Project Structure
src/
Contracts/
Core/
Exceptions/
Readers/
Writers/
Tests/
FileIntegrationTest.php
SqlReadWriteTest.php
TransformerPipelineTest.php
Roadmap
Planned improvements for upcoming versions:
- streaming for large datasets
- richer schema validation
- more transformation primitives
- append/merge output strategies
- additional input sources such as HTTP or YAML
Contributing
Contributions, bug reports, and improvement ideas are welcome.
If you want to contribute:
- Fork the repository
- Create a feature branch
- Add or update tests
- Open a pull request with a clear description
License
This project is released under the MIT License.