mhitza / file-enumerators
File streaming and preprocessing library (via generators)
Requires
- php: >=5.5.0
Requires (Dev)
- codeclimate/php-test-reporter: dev-master
- phpunit/phpunit: ~4.4.0
README
File streaming library (via generators), for line by line readers and CSV parsing (other specializations may come up at some point).
It's important to remember that generators are forward-only iterators. For that you should take note that in the example
code I'm calling enumerate()
inside the foreach
construct instead of assigning it to a variable, and iterating over that
variable. That is the safe way of iterating over a generator, since enumerate()
is the Generator
builder, UNLESS you want
to constrain single passes over the streams, in which case binding the generator to a variable is prefered.
Install
Available as a composer package, requires PHP >=5.5.0
$ composer.phar require mhitza/file-enumerators
Example usage
Line by line reader
a.k.a. how the classic fgets function usage translates in this library
<?php use FileEnumerators\Reader\Line as LineReader; $enumerator = new FileEnumerators\Enumerator('testfile.txt', new LineReader); foreach($enumerator->enumerate() as $line) { echo $line; }
CSV reader - simple
<?php use FileEnumerators\Reader\CSV as CSVReader; $enumerator = new FileEnumerators\Enumerator('datafile.csv', new CSVReader); foreach($enumerator->enumerate() as $row) { echo "ROW\n"; foreach($row as $column) { echo "\t$column"; } }
CSV reader - more complex
Consider a CSV file that has 5 columns, yet we are only interested in the first, third and fifth column. Also we want to have semantically adequate keys for those columns instead of numbers. And maybe our fifth has a set of numbers separated by a dash, that we want to sum up.
<?php use FileEnumerators\Reader\CSV as CSVReader; use FileEnumerators\Reader\Transformer\CSV as CSVTransformer; $transformer = new CSVTransformer(); $transformer->onlyColumns(0,2,4) ->columnsToNames([ 0 => "title", 2 => "something-relevant", 4 => "user-ratings" ]) ->mapColumn(4, function($value){ return array_sum(array_map('intval', str_split('-', $value))); }); $reader = new CSVReader( CSVReader::COMMA_DELIMITED, $transformer ); $enumerator = new FileEnumerators\Enumerator('datafile.csv', $reader); foreach($enumerator->enumerate() as $row) { printf("%s %s %d", $row['title'], $row['something-relevant'], $row['user-ratings'] ); }
Or the personally prefered variant where everything is bundled up in a single builder set.
<?php use FileEnumerators\Reader\CSV as CSVReader; use FileEnumerators\Reader\Transformer\CSV as CSVTransformer; $enumerator = new FileEnumerators\Enumerator( 'datafile.csv', new CSVReader( CSVReader::COMMA_DELIMITED, (new CSVTransformer) ->onlyColumns(0,2,4) ->columnsToNames([ 0 => "Title", 2 => "Something relevant", 4 => "User ratings" ]) ->mapColumn(4, function($value){ return array_sum(array_map(str_split('-', $value), 'intval')); }) ) ); foreach($enumerator->enumerate() as $row) { printf("%s %s %d", $row['title'], $row['something-relevant'], $row['user-ratings'] ); }
Directory listing
A very small wrapper around DirectoryIterator to list files in a given directory (ignores directories and dot files)
<?php use FileEnumerators\Reader\Directory as DirectoryReader; $enumerator = new FileEnumerators\Enumerator(new DirectoryReader('.')); # @var $file \DirectoryIterator foreach($enumerator->enumerate() as $file) { echo $file->getFilename(); }