girgias/csv

A new and improved CSV file PHP extension which follows RFC 4180 instead of using a custom escape mechanism. Supports multibyte delimiters and enclosures, and a custom EOL sequence.

Type: php-ext

0.4.3 2025-02-22 22:44 UTC

This package is not auto-updated.

Last update: 2025-03-22 23:09:29 UTC


README

A small PHP extension that adds to and improves PHP's handling of CSV strings by following RFC 4180.

This extension is considered to be in an alpha/beta state and is subject to major changes between versions.

Sponsor this project

Via GitHub Sponsors

Requirement

PHP 8.0 is required as of version 0.4.0 of this extension due to the usage of new Zend APIs.

Installation

The best way to install this extension is to use PIE with the following command:

pie install girgias/csv

It is also available on PECL, but may have delayed releases.

Functionality

This extension provides five functions:

  • Csv\array_to_row()
  • Csv\row_to_array()
  • Csv\collection_to_buffer()
  • Csv\buffer_to_collection()
  • Csv\buffer_to_collection_lax()

And one class, which allows for lazy iteration:

namespace Csv;

final class LazyLaxCollection implements \IteratorAggregate {
    private function __construct() {}

    public function getIterator(): \InternalIterator {}

    public static function createFromBuffer(
        string $buffer,
        string $delimiter = ',',
        string $enclosure = '"',
        string $eolSequence = "\r\n"
    ): LazyLaxCollection {}
}

Functions

These functions are inspired by the existing fputcsv() and fgetcsv() functions.

They are binary safe (i.e. they accept NUL bytes) and accept multibyte strings as the delimiter, the enclosure, and the EOL sequence.

Being able to set these three arguments makes it possible to support character encodings that are not ASCII compatible (e.g. UTF-16). By providing the multibyte sequence for each component in the target encoding, a buffer in that encoding can be parsed, or a buffer destined for a file in that encoding can be produced.
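As a rough illustration of the encoding point, the sketch below spells out the default delimiter, enclosure, and EOL sequence as UTF-16LE byte strings and passes them to Csv\row_to_array(). The variable names and the sample row are made up for this example. The call is guarded so the script also runs when the extension is not loaded. Whether a row must end with the EOL sequence is not stated in this README, so including it here is an assumption.

```php
<?php
// UTF-16LE byte sequences for the default components (2 bytes per code unit).
$delimiter   = ",\x00";        // U+002C COMMA
$enclosure   = "\"\x00";       // U+0022 QUOTATION MARK
$eolSequence = "\r\x00\n\x00"; // U+000D U+000A (CRLF)

// Hypothetical sample row: "a,b" followed by CRLF, already encoded as UTF-16LE.
$row = "a\x00" . $delimiter . "b\x00" . $eolSequence;

// Guarded so the script is a no-op when the extension is not loaded.
if (function_exists('Csv\row_to_array')) {
    $fields = \Csv\row_to_array($row, $delimiter, $enclosure, $eolSequence);
    // Assumption: the returned fields are still UTF-16LE encoded byte strings.
    var_dump($fields);
}
```

The extension works on bytes, so any conversion of the resulting fields to another encoding is left to the caller.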

function Csv\array_to_row(array $fields, string $delimiter = ',', string $enclosure = '"', string $eolSequence = "\r\n"): string

Formats a CSV row with the given delimiter, enclosure (the field escape sequence), and the EOL sequence.

function Csv\collection_to_buffer(iterable $collection, string $delimiter = ',', string $enclosure = '"', string $eolSequence = "\r\n"): string

Formats a collection (an iterable of arrays) into a valid RFC 4180 CSV buffer which can be written to a file. It checks that each subarray has the same number of fields as the previous ones. If not, it throws a ValueError.
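A minimal sketch of the writing side, assuming the extension is loaded. The sample data and the rows_to_csv() helper are hypothetical and only wrap the two functions described above. The helper returns null when the extension is missing or when a ValueError is thrown for rows of differing length.

```php
<?php
// Hypothetical sample data; every value is a string to keep assumptions minimal.
$rows = [
    ['id', 'name', 'note'],
    ['1', 'Ada', 'said "hello"'],           // field contains the enclosure
    ['2', 'Grace', "line one\r\nline two"], // field contains the EOL sequence
];

/**
 * Returns the CSV buffer for $rows, or null when the extension is not
 * loaded or the rows do not all have the same number of fields.
 */
function rows_to_csv(array $rows): ?string
{
    if (!function_exists('Csv\collection_to_buffer')) {
        return null; // extension not loaded
    }
    try {
        return \Csv\collection_to_buffer($rows, ',', '"', "\r\n");
    } catch (\ValueError $e) {
        // Thrown when a row's field count differs from the previous rows.
        error_log($e->getMessage());
        return null;
    }
}

$buffer = rows_to_csv($rows);
if ($buffer !== null) {
    echo $buffer;
}

// A single row can also be formatted on its own.
if (function_exists('Csv\array_to_row')) {
    echo \Csv\array_to_row(['3', 'Linus', 'plain']);
}
```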

function Csv\row_to_array(string $row, string $delimiter = ',', string $enclosure = '"', string $eolSequence = "\r\n"): array

Returns an array containing the fields parsed from the $row string. This string should follow RFC 4180.

function Csv\buffer_to_collection(string $buffer, string $delimiter = ',', string $enclosure = '"', string $eolSequence = "\r\n"): array

Returns a collection (an array of arrays) containing the fields of each row from the string $buffer. This buffer should follow RFC 4180. This function checks that each row has the same number of fields; if not, it throws a ValueError indicating the offending line.

To parse a buffer whose rows may have differing numbers of fields, where an error is not suitable, the Csv\buffer_to_collection_lax() function can be used instead.
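The strict and lax parsers could be combined along the following lines. This is only a sketch: the parse_csv() helper and the sample buffer are hypothetical, and since the signature of Csv\buffer_to_collection_lax() is not listed in this README, the code assumes it accepts the buffer as its first argument and returns an array like the strict variant.

```php
<?php
// Hypothetical buffer whose second row is one field short.
$buffer = "a,b,c\r\n1,2\r\n";

/**
 * Parses $buffer strictly first and falls back to the lax parser when the
 * field counts differ. Returns null when the extension is not loaded.
 */
function parse_csv(string $buffer): ?array
{
    if (!function_exists('Csv\buffer_to_collection')) {
        return null; // extension not loaded
    }
    try {
        return \Csv\buffer_to_collection($buffer);
    } catch (\ValueError $e) {
        // The message is documented to indicate the offending line.
        error_log('Strict parse failed: ' . $e->getMessage());
        // Assumption: same first argument and array return as the strict variant.
        return \Csv\buffer_to_collection_lax($buffer);
    }
}

$rows = parse_csv($buffer);
if ($rows !== null) {
    foreach ($rows as $fields) {
        echo count($fields), " field(s)\n";
    }
}
```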

LazyLaxCollection

This class allows iterating over a CSV buffer one row at a time, without storing the whole CSV collection. It can be iterated with foreach, or manually by obtaining the InternalIterator from the LazyLaxCollection::getIterator() instance method.

Currently, it only supports iterating a CSV file represented as a string buffer, by creating a new instance using the LazyLaxCollection::createFromBuffer() static constructor.

This class cannot be instantiated using new.
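A short usage sketch based on the class stub above, assuming the extension is loaded. The sample buffer is made up, and the shape of each iterated value (presumably an array of the row's fields) is an assumption, so the example only dumps it.

```php
<?php
// Hypothetical sample buffer using the default delimiter, enclosure, and EOL.
$buffer = "id,name\r\n1,Ada\r\n2,Grace\r\n";

$seen = 0;
// Guarded so the script also runs when the extension is not loaded.
if (class_exists('Csv\LazyLaxCollection')) {
    // The constructor is private, so the static constructor is the entry point.
    $collection = \Csv\LazyLaxCollection::createFromBuffer($buffer);

    foreach ($collection as $row) {
        // Assumption: each $row holds the fields of one CSV row.
        var_dump($row);
        $seen++;
    }
}

echo "rows seen: $seen\n";
```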

Future scope

Improvements and feature requests can be proposed on the GitLab issue tracker.

The plan is to propose an RFC to add this extension to PHP core, with a timeline for deprecating the non-compliant str_getcsv(), fputcsv(), and fgetcsv() functions.

Bug reports

To report a bug, please provide a reproducible test script, preferably as a PHPT test file, and open an issue on GitLab at https://gitlab.com/Girgias/csv-php-extension/issues

For security issues, please email me directly at girgias@php.net