antevenio / stream-regex-iterator
PHP iterator of regex matches over streams
Requires
- php: >=5.6
Requires (Dev)
- phpunit/phpunit: ^5.7
This package is auto-updated.
Last update: 2024-11-17 03:54:17 UTC
README
Find regular expression matches on seekable text streams and return them inside an iterator.
Description
This iterator is a solution for running complex multiline regular expressions on big files without exhausting memory.
The iterator will read chunks of data from the stream and run preg_match_all() on each one of them.
The chunks are read in a way that ensures no matches are lost at chunk boundaries, i.e. matches that span the point where one chunk ends and the next begins.
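One way to avoid losing matches at chunk boundaries is to read overlapping windows. The following is only a minimal sketch of that idea, not the package's actual implementation: the function name findInChunks() and its window strategy are assumptions made for illustration. It reads windows of twice the buffer size, advances by one buffer size, and only accepts matches that start in the first half of a window, since those are guaranteed to fit as long as no match is longer than the buffer.

```php
<?php
/**
 * Hypothetical helper (not part of this package): find regex matches on a
 * seekable stream by scanning overlapping windows of 2 * $bufferSize bytes.
 * Assumes no single match is longer than $bufferSize bytes.
 *
 * @return array list of [matchedText, offsetInStream] pairs
 */
function findInChunks($stream, $regex, $bufferSize)
{
    $found = [];
    $base = 0;        // stream offset where the current window starts
    $nextAllowed = 0; // stream offset just past the last accepted match

    while (true) {
        fseek($stream, $base);
        $window = (string) stream_get_contents($stream, 2 * $bufferSize);
        if ($window === '') {
            break;
        }
        // A short window means the end of the stream was reached.
        $isLast = strlen($window) < 2 * $bufferSize;

        preg_match_all($regex, $window, $sets, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
        foreach ($sets as $set) {
            list($text, $start) = $set[0];
            if (!$isLast && $start >= $bufferSize) {
                continue; // may be cut off here; the next window sees it whole
            }
            $global = $base + $start;
            if ($global < $nextAllowed) {
                continue; // overlaps a match that was already accepted
            }
            $found[] = [$text, $global];
            $nextAllowed = $global + max(1, strlen($text));
        }

        if ($isLast) {
            break;
        }
        $base += $bufferSize; // step by half a window so windows overlap
    }

    return $found;
}

$stream = fopen('php://memory', 'r+');
fwrite($stream, "line1\nline2\nline3\nline4\nline5\nstart\nline6\nline7\nend");

// The match is 21 bytes long, so a 24 byte buffer is large enough.
$found = findInChunks($stream, '/^start.*?end$/sm', 24);
print_r($found);
```

This sketch ignores some subtleties: anchors such as ^ also match at the start of a window, which is not necessarily the start of a line in the stream, and greedy patterns may match differently when they only see part of the data.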
The iterator will return matches as preg_match_all() would do when using the PREG_SET_ORDER | PREG_OFFSET_CAPTURE flags.
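As a reminder of what that shape looks like, here is a small standalone example using plain preg_match_all() with the same flags. The subject string and pattern are made up for illustration; with these flags each match is an array with one entry per group, and each entry is a [text, byte offset] pair.

```php
<?php
$subject = "a1 b22 c333";
$flags = PREG_SET_ORDER | PREG_OFFSET_CAPTURE;

preg_match_all('/([a-z])(\d+)/', $subject, $matches, $flags);

foreach ($matches as $match) {
    // $match[0] is the full match, $match[1] and $match[2] are the groups.
    list($text, $offset) = $match[0];
    echo $text, ' @ ', $offset, PHP_EOL;
}
// Prints:
// a1 @ 0
// b22 @ 3
// c333 @ 7
```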
Limitations
The specified stream must be fully seekable (back and forward).
The specified buffer size must be able to store the longest possible full match of the regexp.
The iterator will require approximately twice the specified buffer size in memory.
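Whether a stream is seekable can be checked up front with stream_get_meta_data(). The sketch below uses two hypothetical helpers, isSeekableStream() and toSeekable(), that are not part of this package. Copying a non-seekable stream into php://temp is just one possible workaround, and it costs time and temporary storage proportional to the stream size.

```php
<?php
/** Hypothetical helper: true if $stream is a stream resource that supports seeking. */
function isSeekableStream($stream)
{
    if (!is_resource($stream)) {
        return false;
    }
    $meta = stream_get_meta_data($stream);
    return !empty($meta['seekable']);
}

/** Hypothetical helper: return $stream itself, or a seekable copy of its contents. */
function toSeekable($stream)
{
    if (isSeekableStream($stream)) {
        return $stream;
    }
    // php://temp keeps data in memory and spills to a temporary file when it grows.
    $temp = fopen('php://temp', 'w+');
    stream_copy_to_stream($stream, $temp);
    rewind($temp);
    return $temp;
}

$memory = fopen('php://memory', 'r+');
var_dump(isSeekableStream($memory)); // bool(true)
```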
Requirements
The following versions of PHP are supported.
- PHP 5.6
- PHP 7.0
- PHP 7.1
- PHP 7.2
- PHP 7.3
Installation
composer require antevenio/stream-regex-iterator
Usage
$inputString = "line1\nline2\nline3\nline4\nline5\nstart\nline6\nline7\nend";
$stream = fopen("data://text/plain," . $inputString, "r");
$matches = new Antevenio\StreamRegexIterator\Iterator(
    "/^start.*?end$/sm",
    $stream,
    32
);
foreach ($matches as $match) {
    print_r($match);
}
Would output:
Array
(
    [0] => Array
        (
            [0] => start
line6
line7
end
            [1] => 30
        )
)