tbpixel / xml-streamer
Stream XML data and cast types to a provided classmap.
Requires
- php: ^7.2
- psr/http-message: ^1.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^2.14
- infection/infection: ^0.12.2
- phpstan/phpstan: ^0.11.1
- phpunit/phpunit: ^8.0
- symfony/var-dumper: ^4.2
README
Content
Installation
You can install this package via composer:
composer require tbpixel/xml-streamer
Purpose
I found myself in need of a way to work with large XML data efficiently. The built in XMLReader
PHP provides is fast and efficient, but can be a pain to work with at times. I wanted a dependency-free way to stream XML data and work with it using my provided classmap.
This package attempts to alleviate some of the headache of working with XMLReader, while also providing a collection of PSR-7 compatible XML streams. It offers a convenient way to iterate large XML data sets for reduced memory usage.
The optional client is also provided to allow for casting XML Strings to classmap objects.
Examples
Say we had an XML file called users.xml
with the following data:
<?xml version='1.0' encoding='UTF-8'?> <users> <user> <id>1</id> <name>John Doe</name> </user> <user> <id>2</id> <name>Theodor</name> </user> </users>
With this package, we can simply create a new Client, pass it a PSR-7 compatible stream, and work with our data using our types.
$stream = new \TBPixel\XMLStreamer\Streams\FileReaderStream('users.xml', 1); $client = new \TBPixel\XMLStreamer\Client($stream); foreach ($client->iterate() as $simpleXMLElement) { // Do something with the SimpleXMLElement } $client->close(); // Closes the client's provided stream
We can also loop the client directly, as it implements the IteratorAggregate
interface.
// Same as above loop foreach ($client as $simpleXMLElement) { // Do something with the SimpleXMLElement }
Iteration Depth
ReaderStream, the underlying XMLReader wrapper for the stream implementations found in this package, contains a final constructor argument called $depth
. This is an integer (default to 0) or string which represents a depth to start iterating records found in an XML document.
The string can be the name of an XML tag, allowing the stream to iterate until it finds the tag rather than having to know the depth in advance.
Given the same data as above, we could rewrite our code snippet as follows:
$stream = new \TBPixel\XMLStreamer\Streams\FileReaderStream('users.xml', 'users'); $client = new \TBPixel\XMLStreamer\Client($stream); foreach ($client->iterate() as $simpleXMLElement) { // Do something with the SimpleXMLElement } $client->close(); // Closes the client's provided stream
The previous value of 1 was acceptable for iterating this result set, but if the data was wrapped an arbitrary number of levels deep then this tag-name approach becomes convenient.
Automatic Casting
If we had a user which implemented the required CreateFromSimpleXML
interface, we could also cast the SimpleXMLElement
as we iterate for easier access.
use TBPixel\XMLStreamer\CreateFromSimpleXML; class User implements CreateFromSimpleXML { /** @var int */ public $id; /** @var string */ public $name; public function __construct(int $id, string $name) { $this->id = $id; $this->name = $name; } /** * Create a new instance from a Simple XML Element. * * @return static */ public static function fromSimpleXML(\SimpleXMLElement $element) { return new static( (int) $element->id, (string) $element->name ); } }
Now when we create our client, lets just pass the FQCN to the clients classmap and casting will be automated.
$stream = new \TBPixel\XMLStreamer\Streams\FileReaderStream('users.xml'); $client = new \TBPixel\XMLStreamer\Client($stream, [ 'user' => User::class, ]); foreach ($client->iterate() as $user) { // Work with the User object. } $client->close();
The clients second argument is an array of key value pairs mapping the XML element names to the FQCN.
Extending
Both the client and the streams are built ontop of PSR-7's Psr\Http\Message\StreamInterface
, meaning it should be possible for you to swap out the stream with your own implementation. If you plan to use XMLReader
but want a different resource handler, the TBPixel\XMLStreamer\ReaderStream
is an abstract implementation which expects to process XML streams via XMLReader and handles most functionality well.
Contributing
Please see CONTRIBUTING for details.
Changelog
Please see CHANGELOG for more information on what has changed recently.
Support Me
Hi! I'm a developer living in Vancouver, BC and boy is the housing market tough. If you wanna support me, consider following me on Twitter @TBPixel, or consider buying me a coffee.
License
The MIT License (MIT). Please see License File for more information.