vend / chunky
A constant-time, variable-workload chunking tool with Doctrine2 integration
Requires
- php: >=5.4.0
- psr/log: ~1.0.0
Requires (Dev)
- doctrine/dbal: >=2.4.0
- phpunit/phpunit: 4.1.*
- squizlabs/php_codesniffer: 1.*
Suggests
- doctrine/dbal: To use ReplicatedChunk
This package is not auto-updated.
Last update: 2021-04-26 10:03:00 UTC
README
A small library for dynamic chunking of large operations against an external system, like a database.
A 'chunk' is a unit of work with a target execution time. If chunks start taking longer than the target to process, the size of future chunks is reduced.
This library also includes utilities for monitoring slave lag on a set of Doctrine2 connections, and can pause chunk processing to wait for replication to catch up. This sort of strategy is taken by tools like pt-online-schema-change in order to complete a process as fast as possible, but without impacting systems under production load.
Usage
Basic Usage
    use Chunky\Chunk;

    $options = [];
    $chunk = new Chunk(
        500,  // Initial chunk size
        0.2,  // Target wallclock execution time in seconds
        $options
    );

    for (/* ... */) {
        $size = $chunk->getEstimatedSize();

        $chunk->begin();
        // Process $size records
        $chunk->end();
    }
Options
- int min: The minimum chunk size to ever return (default 0.01 * initial estimate)
- int max: The maximum chunk size to ever return (default 2 * initial estimate)
- float smoothing: The exponential smoothing factor, 0 < s < 1 (default 0.3)
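To make the interaction between the smoothing factor and the min/max bounds more concrete, here is a minimal, self-contained sketch. It is not Chunky's actual implementation: the function name `nextChunkSize`, its parameters, and the exact update rule (smooth the observed seconds-per-record, divide the target time by it, then clamp) are assumptions made for illustration, and the bounds of 5 and 1000 are illustrative values.

```php
<?php

/**
 * Hypothetical helper, not part of Chunky: estimate the next chunk size
 * from the previous smoothed per-record time and the latest observation.
 *
 * Returns array(next size, updated smoothed seconds-per-record).
 */
function nextChunkSize(
    $smoothedPerRecord, // previous smoothed seconds per record
    $lastSize,          // records processed in the last chunk
    $lastElapsed,       // wallclock seconds the last chunk took
    $target,            // target wallclock seconds per chunk
    $min,               // never return fewer records than this
    $max,               // never return more records than this
    $smoothing = 0.3    // exponential smoothing factor, 0 < s < 1
) {
    // Observed cost of one record in the chunk that just finished
    $observed = $lastElapsed / max(1, $lastSize);

    // Exponential smoothing: new = s * observed + (1 - s) * previous
    $smoothed = $smoothing * $observed + (1 - $smoothing) * $smoothedPerRecord;

    if ($smoothed <= 0) {
        // No measurable cost yet: fall back to the largest allowed chunk
        return [$max, $smoothed];
    }

    // How many records fit into the target time at the smoothed rate
    $size = (int) floor($target / $smoothed);

    // Clamp to the configured bounds
    $size = max($min, min($max, $size));

    return [$size, $smoothed];
}

// Example: a 500-record chunk took 0.4s against a 0.2s target,
// so the next estimate shrinks.
list($size, $perRecord) = nextChunkSize(0.2 / 500, 500, 0.4, 0.2, 5, 1000);
echo $size, PHP_EOL; // 384
```

A smaller smoothing factor makes the estimate react more slowly to a single slow chunk, while a factor close to 1 follows the most recent measurement almost directly.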
Monitoring Replication Lag
A Chunk class, ReplicatedChunk, is provided for monitoring MySQL slave lag on a set of slave database servers. This class is MySQL-specific, because getting the current slave lag is not implemented for other drivers.
    use Chunky\ReplicatedChunk;

    /* @var Doctrine\DBAL\Connection $conn */
    /* @var Doctrine\DBAL\Connection $conn2 */
    $chunk = new ReplicatedChunk(500, 0.2, $options);
    $chunk->setSlaves([$conn, $conn2]);
Options
- int max_lag: When replication lag reaches this many seconds, the slave is considered lagged
- int pause: The number of microseconds to pause for when slave lag is detected (before rechecking lag)
- int max_pause: The total number of microseconds the chunk will pause for before continuing or throwing an exception
- boolean continue: Whether to continue if max_pause is reached; the default is to throw an exception and not continue
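The way these four options fit together can be sketched as a simple wait loop. This is an illustrative sketch only, not the code ReplicatedChunk runs: the function `waitForReplication`, the `$getLag` callable (assumed to return the current worst lag in seconds) and the injectable `$sleep` callable are all hypothetical names introduced here so the example runs without a database.

```php
<?php

/**
 * Hypothetical helper, not part of Chunky: wait until replication lag
 * drops below max_lag, or until max_pause microseconds have been spent.
 *
 * $getLag  callable returning the current lag in seconds (assumption)
 * $options array with max_lag, pause, max_pause and continue keys
 * $sleep   callable receiving microseconds; defaults to usleep()
 *
 * Returns true if replication caught up, false if max_pause was reached
 * and the 'continue' option allowed processing to go on anyway.
 */
function waitForReplication(callable $getLag, array $options, $sleep = null)
{
    $sleep  = $sleep ?: 'usleep';
    $paused = 0; // total microseconds spent waiting so far

    // A slave counts as lagged once its lag reaches max_lag seconds
    while ($getLag() >= $options['max_lag']) {
        if ($paused >= $options['max_pause']) {
            if (!empty($options['continue'])) {
                return false; // give up waiting, but keep processing
            }
            throw new RuntimeException(
                'Replication lag did not recover within max_pause'
            );
        }

        // Pause, then recheck the lag on the next loop iteration
        call_user_func($sleep, $options['pause']);
        $paused += $options['pause'];
    }

    return true;
}
```

Passing the sleep function in as a parameter is a design choice made for this sketch: it lets the loop be exercised with a stub instead of actually sleeping.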
Installation
You can autoload this library yourself with PSR-4, but you'd usually just install it with Composer. The package name is vend/chunky.