webdevcraft/jsonreader

Streaming JSON reader (memory-safe parser) with an optional chunking mode

0.2.0 2016-10-29 14:58 UTC

This package is not auto-updated.

Last update: 2024-04-13 17:13:44 UTC


README


Purpose and Advantages

  1. Memory-safe reading of huge JSON files: solves the "out of RAM" problem. Successfully tested with 1.5 GB JSON files
  2. Ability to split a large JSON file into smaller JSON chunks
  3. Custom traversal of the JSON tree
  4. Reads any JSON source: file, string, resource, character iterator
  5. Installation with Composer; no extra PHP extensions needed

Requirements

PHP 5.6, 7.0

Installation

With Composer:

composer require webdevcraft/jsonreader

Basic Usage

JSON Reader iterates through the elements of a JSON structure.

Each call to $reader->read() advances to the next element and returns false once the input is exhausted.

So the common usage is:

while ($reader->read()) {
    // next element iteration
}

The reader provides the following information about each element:

  1. DEPTH in JSON tree starting from 1: $reader->getDepth()

  2. STATE (type of element): $reader->getState(). Each state also has a syntactic-sugar "is" method. The state is one of:

    • Object Start: JsonReaderInterface::STATE_OBJECT_START (or $reader->isObjectStartState())

    • Object Key: JsonReaderInterface::STATE_OBJECT_KEY (or $reader->isObjectKeyState())

    • Object End: JsonReaderInterface::STATE_OBJECT_END (or $reader->isObjectEndState())

    • Array Start: JsonReaderInterface::STATE_ARRAY_START (or $reader->isArrayStartState())

    • Array End: JsonReaderInterface::STATE_ARRAY_END (or $reader->isArrayEndState())

    • Value: JsonReaderInterface::STATE_VALUE (or $reader->isValueState())

  3. VALUE of element: $reader->getValue(). Types:

    1. When $reader->getState() === JsonReaderInterface::STATE_VALUE, the value is automatically cast to a PHP type: string, int, float, boolean, null

    2. For the JsonReaderInterface::STATE_OBJECT_KEY state, the value is always a string

    3. For all other states, the value is always null

An example of the element sequence for test.json is given in JsonReaderTest.php::testReadTokens()
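The read loop and the accessors described above can be combined into a small helper. The following is only a sketch, not part of the library: collectValuesForKey and its parameter names are hypothetical, and it relies solely on read(), getValue(), isObjectKeyState() and isValueState() behaving as described in this section.

```php
<?php
/**
 * Hypothetical helper (not part of the library): collects every scalar value
 * that directly follows the given object key, at any depth.
 *
 * $reader is assumed to be any object exposing the reader methods
 * named in this README.
 */
function collectValuesForKey($reader, $wantedKey)
{
    $values = [];
    $lastKey = null;

    while ($reader->read()) {
        if ($reader->isObjectKeyState()) {
            // In the key state, getValue() is the key name (always a string).
            $lastKey = $reader->getValue();
        } elseif ($reader->isValueState()) {
            // In the value state, getValue() is already cast to a PHP type.
            if ($lastKey === $wantedKey) {
                $values[] = $reader->getValue();
            }
            $lastKey = null;
        } else {
            // Object/array start or end: the pending key does not name a scalar.
            $lastKey = null;
        }
    }

    return $values;
}
```

Because only one element is inspected at a time, the helper keeps nothing in memory beyond the values it collects.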

Chunks Usage

If your code already depends on denormalizing JSON strings, you can use the JsonReader library to split a huge file into smaller JSON strings.

Start buffering a chunk at the element position of your choice with $reader->startWriteChunk(), then retrieve it with $reader->finishWriteChunk(), which returns the JSON chunk string and flushes the buffer.

Because the chunk is buffered in memory, make sure your JSON schema guarantees an acceptable chunk size.

It is good practice to wrap chunk retrieval in an iterator.

An example of chunking test.json is given in JsonReaderTest.php::testChunks()
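Wrapping chunk retrieval in an iterator could look roughly like the generator below. This is a sketch under assumptions that this README does not confirm: that the start and end of the same object report the same depth, and that the returned chunk covers the object from its start element to its end element. iterateChunks and $chunkDepth are hypothetical names; check the behaviour against testChunks() before relying on it.

```php
<?php
/**
 * Hypothetical generator (not part of the library): yields one JSON string
 * per object found at the given depth.
 *
 * Assumption: an object's start and end elements report the same depth,
 * and the chunk spans from startWriteChunk() to finishWriteChunk().
 */
function iterateChunks($reader, $chunkDepth)
{
    while ($reader->read()) {
        if ($reader->getDepth() !== $chunkDepth) {
            continue; // Shallower or deeper elements never start or end a chunk.
        }

        if ($reader->isObjectStartState()) {
            // Begin buffering at the opening of the object.
            $reader->startWriteChunk();
        } elseif ($reader->isObjectEndState()) {
            // Hand the buffered JSON string to the caller and flush the buffer.
            yield $reader->finishWriteChunk();
        }
    }
}
```

With such a generator, calling code can run foreach over iterateChunks($reader, 2) and pass each chunk to json_decode(), so that only one chunk is held in memory at a time.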

Factory Usage

To create a reader, use JsonReaderFactory.

Example of creating a JsonReader:

$factory = new JsonReaderFactory();
$reader = $factory->createByFilePath('/tmp/test.json');

The factory supports the following sources through these methods:

  1. createByFilePath: a file path in the local filesystem
  2. createByString: a JSON string
  3. createByResource: an opened resource, for example one returned by fopen()
  4. createByCharacterTraversable: if the options above do not suit your needs, feel free to write a custom character iterator over whatever source you have to deal with. For example, you could call PSR-7 Message\StreamInterface::read($acceptableSize) repeatedly and iterate over the characters of each read() result

Example of character iterators
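As an illustration of the custom iterator idea, the generator below reads a stream resource in fixed-size buffers and yields it one byte at a time. It is a sketch only: iterateCharacters and $bufferSize are hypothetical names, and it yields single bytes, which may or may not match what the reader expects for multibyte input.

```php
<?php
/**
 * Hypothetical character iterator (not part of the library).
 * Reads $stream in buffers of $bufferSize bytes and yields single bytes,
 * so at most one buffer is held in memory at a time.
 */
function iterateCharacters($stream, $bufferSize = 8192)
{
    while (!feof($stream)) {
        $buffer = fread($stream, $bufferSize);
        if ($buffer === false) {
            break; // Read error: stop iterating.
        }

        $length = strlen($buffer);
        for ($i = 0; $i < $length; $i++) {
            yield $buffer[$i];
        }
    }
}
```

Since a generator is a Traversable, the result could presumably be passed to $factory->createByCharacterTraversable(); the exact parameter type that method accepts is an assumption here.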