benmorel / apache-log-parser
PHP library to parse Apache log files
Requires
- php: ^7.2 || ^8.0
Requires (Dev)
- php-coveralls/php-coveralls: ^2.4
- phpunit/phpunit: ^8.0
This package is auto-updated.
Last update: 2024-11-03 22:29:28 UTC
README
A PHP library to parse Apache logs.
Installation
This library is installable via Composer. Just run:
composer require benmorel/apache-log-parser
Requirements
This library requires PHP 7.2 or later.
Project status & release process
This library is under development.
The current releases are numbered 0.x.y. When a non-breaking change is introduced (adding new methods, optimizing existing code, etc.), y is incremented.
When a breaking change is introduced, a new 0.x version cycle is always started.
It is therefore safe to lock your project to a given release cycle, such as 0.1.*.
If you need to upgrade to a newer release cycle, check the release history for a list of changes introduced by each further 0.x.0 version.
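As an illustration of locking to a release cycle, the constraint can be written in your project's composer.json. This is a sketch only: 0.1.* is the example cycle used above, so substitute whichever cycle you actually depend on.

```json
{
    "require": {
        "benmorel/apache-log-parser": "0.1.*"
    }
}
```

With this constraint, Composer would pick up new 0.1.y releases but never move to a later 0.x cycle on its own.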
Package contents
This library provides a single class, Parser.
Quick start
First construct a Parser object with the LogFormat defined in the httpd.conf file of the server that generated the log file:
use BenMorel\ApacheLogParser\Parser;

$logFormat = "%h %l %u %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"";
$parser = new Parser($logFormat);
The library converts every format string of your log format to a field name;
the list of fields can be accessed through the getFieldNames() method:
var_export( $parser->getFieldNames() );
array (
  0 => 'remoteHostname',
  1 => 'remoteLogname',
  2 => 'remoteUser',
  3 => 'time',
  4 => 'requestHeader:Host',
  5 => 'firstRequestLine',
  6 => 'status',
  7 => 'responseSize',
  8 => 'requestHeader:Referer',
  9 => 'requestHeader:User-Agent',
)
You're then ready to parse a single line of your log file. The parse() method accepts the log line and a boolean indicating whether to return the result as an associative array. Pass false to get a numeric array, whose keys match those of the field names array:
$line = '1.2.3.4 - - [30/May/2018:15:00:23 +0200] "www.example.com" "GET / HTTP/1.0" 200 1234 "-" "Mozilla/5.0"';

var_export( $parser->parse($line, false) );
array (
  0 => '1.2.3.4',
  1 => '-',
  2 => '-',
  3 => '30/May/2018:15:00:23 +0200',
  4 => 'www.example.com',
  5 => 'GET / HTTP/1.0',
  6 => '200',
  7 => '1234',
  8 => '-',
  9 => 'Mozilla/5.0',
)
Or as an associative array, with the field names as keys:
var_export( $parser->parse($line, true) );
array (
  'remoteHostname' => '1.2.3.4',
  'remoteLogname' => '-',
  'remoteUser' => '-',
  'time' => '30/May/2018:15:00:23 +0200',
  'requestHeader:Host' => 'www.example.com',
  'firstRequestLine' => 'GET / HTTP/1.0',
  'status' => '200',
  'responseSize' => '1234',
  'requestHeader:Referer' => '-',
  'requestHeader:User-Agent' => 'Mozilla/5.0',
)
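All values come back as strings, including the time field. If you need a date object, one possible approach is PHP's built-in DateTimeImmutable::createFromFormat(). This is a minimal sketch that is not part of the library. It assumes the log uses Apache's default %t layout, as in the example record; a custom %{format}t would need a different pattern.

```php
<?php

declare(strict_types=1);

// Value of the 'time' field from the example record above.
$time = '30/May/2018:15:00:23 +0200';

// d/M/Y:H:i:s matches the default %t layout; O matches the +0200 offset.
$date = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', $time);

if ($date === false) {
    // createFromFormat() returns false when the string does not match.
    throw new UnexpectedValueException("Unrecognized time value: $time");
}

echo $date->format(DATE_ATOM), PHP_EOL; // 2018-05-30T15:00:23+02:00
```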
If a line cannot be parsed, an InvalidArgumentException is thrown. Be sure to wrap your parse() calls in a try-catch block:
try {
    $parser->parse($line, true);
} catch (\InvalidArgumentException $e) {
    // ...
}
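When processing a whole log file, the same try-catch pattern can be applied line by line so that one malformed line does not abort the run. The sketch below is an illustration, not part of the library: parseLines() is a hypothetical helper that takes any iterable of lines plus a callable wrapping parse(), and collects unparsable lines separately.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): applies $parse to every line
 * and separates parsed records from lines that raised InvalidArgumentException.
 *
 * @param iterable $lines Raw log lines, with or without trailing newlines.
 * @param callable $parse Receives one line and returns the parsed record.
 *
 * @return array Two keys: 'records' (parsed results) and 'failed' (raw lines).
 */
function parseLines(iterable $lines, callable $parse): array
{
    $records = [];
    $failed = [];

    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n"); // drop the newline left by file readers

        if ($line === '') {
            continue; // ignore blank lines
        }

        try {
            $records[] = $parse($line);
        } catch (\InvalidArgumentException $e) {
            $failed[] = $line; // keep the raw line for later inspection
        }
    }

    return ['records' => $records, 'failed' => $failed];
}
```

With the library installed, $lines could be a new SplFileObject('access.log') (the file name is a placeholder) and $parse a closure such as function (string $line) use ($parser) { return $parser->parse($line, true); }.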
Field names returned by the library
This table shows how format strings are mapped to field names by the library:
If two or more format strings yield the same field name, the second one will get a :2 suffix, the third one a :3 suffix, etc.
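The suffix rule can be illustrated with a short standalone function. This is a re-implementation of the behavior described above for explanation only; suffixDuplicates() is a hypothetical name and does not reflect the library's actual code.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative sketch of the suffix rule (not the library's implementation).
 *
 * @param string[] $names Field names in log-format order, possibly repeated.
 *
 * @return string[] The same names, made unique with :2, :3, ... suffixes.
 */
function suffixDuplicates(array $names): array
{
    $seen = [];
    $unique = [];

    foreach ($names as $name) {
        $count = ($seen[$name] ?? 0) + 1;
        $seen[$name] = $count;

        // The first occurrence is kept as-is; later ones get ":<n>".
        $unique[] = $count === 1 ? $name : $name . ':' . $count;
    }

    return $unique;
}

// Yields ['status', 'time', 'status:2', 'status:3'].
var_export(suffixDuplicates(['status', 'time', 'status', 'status']));
```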
Performance notes
You can expect to parse more than 250,000 records per second (> 50 MiB/s) when reading logs from a file on a modern server with an SSD.
Returning records as an associative array comes with a small performance penalty of about 6%.