ark4ne/xl-reader

High performance Excel Reader (CSV, TSV, XLSX)

v1.0.2 2024-04-22 12:21 UTC

This package is auto-updated.

Last update: 2025-01-22 14:07:06 UTC


README


High-performance Excel reader with very low memory consumption.

Installation

$ composer require ark4ne/xl-reader

File support

  • xlsx: fastest xlsx reader ever.
  • tsv: tsv reader.
  • csv: configurable csv reader (auto-detects comma or semicolon delimiters).

Usage

Read file

$file = "my-calc.xlsx";

$reader = \Ark4ne\XlReader\Factory::createReader($file);

$reader->load();

foreach ($reader->read() as $row){
    // do stuff
}

Each $row contains data indexed by column key (A, B, C, ...).

/*
my-calc.xlsx
| A     | B     | C    |
| abc   | 123   | some |
*/
foreach ($reader->read() as $row){
    $row === [
        'A' => 'abc',
        'B' => '123',
        'C' => 'some',
    ];
}

With files generated by Excel, empty cells are not reported:

/*
my-calc.xlsx
| A     | B     | C    |
| abc   |       | some |
*/

foreach ($reader->read() as $row){
    // do stuff
    $row === [
        'A' => 'abc',
        'C' => 'some',
    ];
}

With Numbers (macOS) and many other xlsx generators, empty cells are reported as null:

/*
my-calc.xlsx
| A     | B     | C    |
| abc   |       | some |
*/

foreach ($reader->read() as $row){
    // do stuff
    $row === [
        'A' => 'abc',
        'B' => null,
        'C' => 'some',
    ];
}
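
Because the two behaviours above produce different arrays for the same sheet, it can help to normalize each row before using it. The following is a minimal sketch, not part of the library: normalizeRow() and the $columns list are hypothetical names introduced here for illustration, and the expected column keys are assumed to be known in advance.

```php
<?php
/**
 * Hypothetical helper (not part of ark4ne/xl-reader).
 * Returns $row with every key listed in $columns present, using null for
 * cells the reader did not report. Keys are emitted in $columns order and
 * keys that are not listed in $columns are dropped.
 */
function normalizeRow(array $row, array $columns): array
{
    $normalized = [];
    foreach ($columns as $column) {
        // ?? covers both a missing key and an explicit null value.
        $normalized[$column] = $row[$column] ?? null;
    }
    return $normalized;
}

$columns = ['A', 'B', 'C'];

// Row shape described above for files generated by Excel.
$excelRow = ['A' => 'abc', 'C' => 'some'];
// Row shape described above for Numbers and other generators.
$numbersRow = ['A' => 'abc', 'B' => null, 'C' => 'some'];

// Both shapes normalize to the same array.
var_dump(normalizeRow($excelRow, $columns) === normalizeRow($numbersRow, $columns));
```

Inside the foreach ($reader->read() as $row) loop, each $row could be passed through such a helper so that later code does not need to check whether a key exists.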

Work with sheets (XLSX Reader)

By default, the first sheet is read.

You can retrieve all worksheets with getWorksheets().

$worksheets = $reader->getWorksheets();
/*
[
    ['id' => 1, 'name' => 'sheet 1'],
    ['id' => 2, 'name' => 'sheet 2'],
]
*/

You can retrieve the selected worksheet with getSelectedWorksheet().

$worksheet = $reader->getSelectedWorksheet();
/*
['id' => 1, 'name' => 'sheet 1'],
*/

There are three ways to select the sheet to work with:

  • by index : selectSheetByIndex(int $index)
  • by id : selectSheetById(int $id)
  • by name : selectSheetByName(string $name)
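
As an illustration of how the worksheet list and the selection methods fit together, the sketch below looks up a worksheet id by name in an array shaped like the getWorksheets() output shown earlier. findWorksheetId() is a hypothetical helper written for this example, not a library function, and it assumes an exact, case-sensitive name match.

```php
<?php
/**
 * Hypothetical helper (not part of ark4ne/xl-reader).
 * Searches a list shaped like the getWorksheets() output, i.e.
 * [['id' => 1, 'name' => 'sheet 1'], ...], and returns the id of the
 * first worksheet whose name matches exactly, or null if none matches.
 */
function findWorksheetId(array $worksheets, string $name): ?int
{
    foreach ($worksheets as $worksheet) {
        if ($worksheet['name'] === $name) {
            return (int) $worksheet['id'];
        }
    }
    return null;
}

// Sample data copied from the getWorksheets() example above.
$worksheets = [
    ['id' => 1, 'name' => 'sheet 1'],
    ['id' => 2, 'name' => 'sheet 2'],
];

var_dump(findWorksheetId($worksheets, 'sheet 2'));
var_dump(findWorksheetId($worksheets, 'missing'));
```

A non-null result could then be passed to selectSheetById(). When the name is already known to exist, selectSheetByName() is the more direct route.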

Performance

Memory

Memory usage is affected only by the number of strings contained in the file being read.

The load() method is directly affected by this: the more strings there are to load, the longer loading takes. (This is to be expected; nothing is magic.)

Once the strings are loaded, memory usage has reached its peak.

Reading the data consumes almost no memory.
Only the current line will be loaded into memory.
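
To benefit from this, rows are best consumed one at a time rather than collected into an array. The sketch below shows that pattern with a stand-in generator: fakeRows() and sumColumn() are hypothetical names used only for this example, and fakeRows() merely imitates the row shape that read() is described as yielding.

```php
<?php
/**
 * Stand-in for $reader->read(): yields rows indexed by column key.
 * Hypothetical, for illustration only.
 */
function fakeRows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield ['A' => "row $i", 'B' => (string) $i];
    }
}

/**
 * Sums one column while holding only the current row in memory.
 * Missing or null cells count as 0.
 */
function sumColumn(iterable $rows, string $column): int
{
    $total = 0;
    foreach ($rows as $row) {
        $total += (int) ($row[$column] ?? 0);
    }
    return $total;
}

$total = sumColumn(fakeRows(1000), 'B');

// memory_get_peak_usage() reports the peak memory allocated by the script.
printf("total=%d peak=%d bytes\n", $total, memory_get_peak_usage(true));
```

With a real reader, sumColumn($reader->read(), 'B') would follow the same pattern. Calling iterator_to_array() on the rows instead would keep every row in memory and lose the advantage described above.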

Read

Benchmark with 1024 strings:

The reading speed depends on the number of cells to be read. The more cells there are to read, the longer the reading time will be.

However, the number of cells to be read has no effect on the memory used.

All benchmark results