mehr-it / lara-worker-heartbeat
Adds a heartbeat and observer for Laravel workers, allowing detection of hanging or stuck worker processes
Requires
- php: >=7.1.0
- laravel/framework: ^6.0|^7.0|^8.0
Requires (Dev)
- ext-pcntl: *
- ext-posix: *
- orchestra/testbench: ^4.0|^5.0|^6.0
- phpunit/phpunit: ^7.4|^8.5
This package is auto-updated.
Last update: 2024-10-30 01:21:32 UTC
README
This package implements a queue worker heartbeat which is observed by another process to detect stuck or hanging queue worker processes.
Why is a heartbeat necessary?
Laravel has built-in timeout handling for worker processes, which uses SIGALRM to let worker processes terminate themselves when they reach a given timeout. However, this cannot handle edge cases where the whole process is stuck in a way that signals are no longer processed, or where the worker gets stuck before the signal handler has been registered.
Even supervisord is of no help in such cases, because the worker process may still be running without doing anything.
How does it work?
This package extends the queue:work command with the ability to fork an observer process that monitors a worker heartbeat. Laravel's queue worker is extended to send regular heartbeat signals and status information to the observer process. When the observer process does not receive a heartbeat signal within the expected period, it kills the worker process.
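The fork-and-observe idea described above can be sketched in a few lines of PHP. This is only an illustration under stated assumptions, not the package's actual implementation: the function names startObserver and sendHeartbeat, the use of a socket pair as the heartbeat channel, and the choice of SIGKILL are all hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package's API): forks an observer
 * that kills the calling worker if no heartbeat arrives within $timeout
 * seconds. Returns the stream the worker writes its heartbeats to.
 * Requires ext-pcntl and ext-posix.
 */
function startObserver(int $timeout)
{
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        throw new RuntimeException('Could not create socket pair');
    }
    $workerPid = posix_getpid();

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('Could not fork observer process');
    }

    if ($pid > 0) {
        // Worker (parent): keep only the writing end.
        fclose($pair[1]);
        return $pair[0];
    }

    // Observer (child): keep only the reading end.
    fclose($pair[0]);
    while (true) {
        $read = [$pair[1]];
        $write = null;
        $except = null;
        // Blocks until a heartbeat arrives or $timeout seconds have passed.
        $ready = stream_select($read, $write, $except, $timeout);
        if ($ready === false) {
            continue; // interrupted system call, wait again
        }
        if ($ready === 0) {
            // No heartbeat in time: assume the worker hangs and kill it.
            // SIGKILL is an assumption; it cannot be blocked by the worker.
            posix_kill($workerPid, SIGKILL);
            exit(1);
        }
        $data = fread($pair[1], 1024);
        if ($data === '' || $data === false) {
            exit(0); // worker closed its end, so it is no longer running
        }
    }
}

/** Hypothetical helper: the worker calls this once per loop iteration. */
function sendHeartbeat($channel): void
{
    fwrite($channel, "\n");
}
```

In this sketch a worker would call startObserver(5) once at startup and sendHeartbeat($channel) on every loop iteration. Because the observer is a separate process, it can still act when the worker no longer handles signals.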
Installation
composer require mehr-it/lara-worker-heartbeat
This package uses Laravel's package auto-discovery, so the service provider will be loaded automatically.
Make sure the PHP extensions posix and pcntl are loaded. Otherwise queue workers cannot fork the required observer process and will throw an error.
Usage
You don't have to make any changes to your application code to use this package. You only have to pass the --heartbeat-timeout option to the queue:work command:
artisan queue:work default --heartbeat-timeout=5
This will start an observer process that expects a heartbeat signal at least every 5 seconds. No heartbeat is expected while the worker process is sleeping or handling a job, even if the job takes longer than 5 seconds.
Choose the heartbeat timeout based on the expected duration of one worker cycle (excluding any sleep time). Usually this is a little more than the time it takes to query the queue for new jobs, so 5 seconds is a safe value in most cases. When the pop operation takes longer (e.g. when using AWS SQS with long polling), you have to increase the timeout.
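As a worked example of this rule of thumb, the following sketch adds a safety margin to the longest expected pop duration. The function name suggestHeartbeatTimeout and the default margin of 5 seconds are assumptions made for illustration; the package does not provide such a helper.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derives a heartbeat timeout (in whole seconds) from
 * the longest time a single queue pop is expected to take.
 */
function suggestHeartbeatTimeout(float $maxPopSeconds, float $marginSeconds = 5.0): int
{
    // Round up so the timeout is never shorter than pop time plus margin.
    return (int) ceil($maxPopSeconds + $marginSeconds);
}

// SQS long polling can wait up to 20 seconds for a message.
echo suggestHeartbeatTimeout(20.0), "\n"; // prints 25
```

With that result, the worker would be started with --heartbeat-timeout=25 instead of 5.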
Implementation details
Heartbeat signals are sent on each iteration of the worker loop, which is looking for new jobs in the queue.
When the worker goes to sleep or starts processing a job, it does not send any heartbeat signals for some time. The worker notifies the observer of the period during which it cannot send heartbeats, and the observer respects that.
If no timeout is set for either the worker process or the currently processed job, the observer process does not expect any heartbeat until the job is finished. In this situation the observer cannot detect hung or stuck processes, because it does not know when to expect the next heartbeat signal.
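The rules above amount to keeping track of a single deadline on the observer side. The class below is a minimal sketch of that bookkeeping, assuming the observer adds the heartbeat timeout on top of any announced pause. HeartbeatDeadline and its methods are hypothetical names and do not reflect the package's internals.

```php
<?php
declare(strict_types=1);

/** Hypothetical model of when the observer expects the next heartbeat. */
final class HeartbeatDeadline
{
    /** @var float heartbeat timeout in seconds */
    private $timeout;

    /** @var float|null time of the next expected heartbeat, null = none expected */
    private $deadline;

    public function __construct(float $timeout, float $now)
    {
        $this->timeout = $timeout;
        $this->deadline = $now + $timeout;
    }

    /** A heartbeat arrived: expect the next one within the timeout. */
    public function heartbeat(float $now): void
    {
        $this->deadline = $now + $this->timeout;
    }

    /**
     * The worker announced a sleep or a job. $seconds is the sleep time or
     * the job/worker timeout; null means no timeout is known, so no
     * heartbeat can be expected until the next one actually arrives.
     */
    public function pause(?float $seconds, float $now): void
    {
        $this->deadline = $seconds === null
            ? null
            : $now + $seconds + $this->timeout;
    }

    /** True when the worker should be considered stuck. */
    public function isOverdue(float $now): bool
    {
        return $this->deadline !== null && $now > $this->deadline;
    }
}

$deadline = new HeartbeatDeadline(5.0, 0.0);
$deadline->pause(60.0, 2.0); // job with a 60 s timeout starts at t = 2
var_dump($deadline->isOverdue(66.0)); // bool(false): still within 2 + 60 + 5
var_dump($deadline->isOverdue(68.0)); // bool(true): heartbeat is overdue
```

Passing null to pause() models the case without any timeout: isOverdue() then always returns false until the next heartbeat, which is why stuck processes cannot be detected in that situation.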
The observer process automatically stops when the observed worker process is not running anymore.