souplette / fusbup
A fast & memory-efficient interface to the Mozilla Public Suffix List.
Requires
- php: >=8.1
- ext-intl: *
Requires (Dev)
- ju1ius/luigi: ^1.0
- symfony/stopwatch: ^6.2
This package is auto-updated.
Last update: 2024-11-11 01:42:26 UTC
README
A fast and memory-efficient PHP library to query the Mozilla public suffix list.
Installation
composer require souplette/fusbup
Basic usage
Querying effective top-level domains (eTLD)
use Souplette\FusBup\PublicSuffixList; $psl = new PublicSuffixList(); // get the eTLD (short for Effective Top-Level Domain) of a domain assert($psl->getEffectiveTLD('foo.co.uk') === 'co.uk'); // check if a domain is an eTLD assert($psl->isEffectiveTLD('fukushima.jp')); // split a domain into it's private and eTLD parts assert($psl->splitEffectiveTLD('www.foo.co.uk') === ['www.foo', 'co.uk']);
Querying registrable domains (AKA eTLD+1)
use Souplette\FusBup\PublicSuffixList; $psl = new PublicSuffixList(); // get the registrable part (eTLD+1) of a domain assert($psl->getRegistrableDomain('www.foo.co.uk') === 'foo.co.uk'); // split a domain into it's private and registrable parts. assert($psl->splitRegistrableDomain('www.foo.co.uk') === ['www', 'foo.co.uk']);
Checking the applicability of a cookie domain
The PublicSuffixList
class implements the
RFC6265 algorithm
for matching a cookie domain against a request domain.
use Souplette\FusBup\PublicSuffixList; $psl = new PublicSuffixList(); // check if a cookie domain is applicable to a hostname $requestDomain = 'my.domain.com' $cookieDomain = '.domain.com'; assert($psl->isCookieDomainAcceptable($requestDomain, $cookieDomain)); // cookie are rejected if their domain is an eTLD: assert(false === $psl->isCookieDomainAcceptable('foo.com', '.com'))
Internationalized domain names
All PublicSuffixList
methods that return domains
return them in their normalized ASCII form.
use Souplette\FusBup\PublicSuffixList; use Souplette\FusBup\Utils\Idn; $psl = new PublicSuffixList(); assert($psl->getRegistrableDomain('â.example') === 'xn--53h.example'); // use Idn::toUnicode() to convert them back to unicode if needed: assert(Idn::toUnicode('xn--53h.example') === 'â.example');
Performance
The public suffix list contains about 10 000 rules as of 2023.
In order to be maximally efficient for all uses cases,
the PublicSuffixList
class can use two search algorithms
with different performance characteristics.
The first one (and the default) uses a DAFSA compiled to a binary string (this is the algorithm used in the Gecko and Chromium engines). The second one uses a compressed suffix tree compiled to PHP code.
Here is a summary of their respective pros and cons:
- DAFSA
- ð more memory efficient (this is just a 50Kb string in memory)
- ð faster to load (around 20Ξs on a SSD)
- ð slower to search (in the order of 100 000 ops/sec)
- Suffix tree
- ð less memory efficient (about 4Mb in memory)
- ð slower to load (around 4ms without opcache, 500Ξs when using opcache preloading)
- ð faster to search (in the order of 1 000 000 ops/sec)
Note that in both cases, the database will be lazily loaded.
Which search algorithm should I use?
Well, it depends on your use case but based on the aforementioned characteristics I would say: stick to the default (DAFSA) algorithm unless your app is going to make more than a few hundreds searches per seconds.
Tell me how can I use them?
Both algorithm can be used by passing the appropriate loader to the PublicSuffixList
constructor.
DAFSA
use Souplette\FusBup\Loader\DafsaLoader; use Souplette\FusBup\PublicSuffixList; $psl = new PublicSuffixList(new DafsaLoader()); // since DafsaLoader is the default, the following is equivalent: $psl = new PublicSuffixList();
Suffix Tree
use Souplette\FusBup\Loader\SuffixTreeLoader; use Souplette\FusBup\PublicSuffixList; $psl = new PublicSuffixList(new SuffixTreeLoader());
You should also configure opcache to preload the database:
In your php.ini
:
opcache.enabled=1 opcache.preload=/path/to/my/preload-script.php
In your preload script:
opcache_compile_file('/path/to/vendor/ju1ius/fusbup/Resources/psl.php');