Swiss Army knife for urls.

v1.0.0-beta 2018-10-24 10:45 UTC


This package is for you when PHP's parse_url() is not enough.

Key Features:

  • Parse a url and access or modify all its components separately.
  • Resolve any relative url found in an HTML document to an absolute url, based on the document's url.
  • Get not only the full host of a url, but also the registrable domain, the domain suffix and the subdomain parts of the host separately (thanks to the Mozilla Public Suffix List).
  • Compare components of different urls (e.g. check whether different urls point to the same host or domain).
  • Parse internationalized domain names (IDN), thanks to true/punycode.
  • Implements PSR-7 UriInterface.
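
To make the gap to parse_url() concrete, here is a small sketch that uses only PHP's built-in function. The url is the example from the component list in this README; the snippet itself is plain PHP and not part of this package.

```php
<?php
// Built-in parsing only; no package code involved.
$parts = parse_url('https://john:123@subdomain.example.com:8080/foo?bar=baz#anchor');

// parse_url() returns the basic components ...
echo $parts['scheme'], "\n";   // https
echo $parts['host'], "\n";     // subdomain.example.com
echo $parts['port'], "\n";     // 8080

// ... but the host is one opaque string: there are no keys for the
// registrable domain, the domain suffix or the subdomain.
var_dump(array_keys($parts));
```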


Install the latest version with:

composer require crwlr/url


Including the package

require 'vendor/autoload.php';

use Crwlr\Url\Url;

To start using the library, include Composer's autoload file and import the Url class so you don't have to write the fully qualified class name again and again. The following code examples omit these lines.

Parsing urls

$url = Url::parse('');
// Accessing url components via method calls
$port = $url->port();                   // => 8080
$domainSuffix = $url->domainSuffix();   // => "com"
$path = $url->path();                   // => "/foo"
$fragment = $url->fragment();           // => NULL
// Or as properties
$scheme = $url->scheme;                 // => "https"
$user = $url->user;                     // => "john"
$host = $url->host;                     // => ""
$domain = $url->domain;                 // => ""

Available url components

Below is a list of all components the Url class takes care of. Each entry shows what the component returns for the example url https://john:123@subdomain.example.com:8080/foo?bar=baz#anchor.

  • scheme: https
  • user: john
  • pass or password (alias): 123
  • host: subdomain.example.com
  • domain: example.com
  • domainLabel: example
  • domainSuffix: com
  • subdomain: subdomain
  • port: 8080
  • path: /foo
  • query: bar=baz
  • fragment: anchor

When a component is not present in a url (e.g. it doesn't contain user and password) the corresponding properties will return NULL.

Combinations of components


Sometimes it is helpful to get what this package calls the root: everything that comes before the path component.

$url = Url::parse('');
$root = $url->root();   // => ""

Complementary to the root, you can retrieve everything from the path onward (path, query and fragment) combined via relative(). It's called relative because the result is like a relative url (without scheme and host information).

$url = Url::parse('');
$relative = $url->relative();   // => "/foo?bar=baz#anchor"

Parsing a query string

If you're after the query of a url you may want to get it as an array. Don't worry, nothing easier than that:

$url = Url::parse('');


array(2) {
  string(3) "baz"
  string(5) "value"
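
For comparison, the following sketch shows how a query string turns into an array with PHP's built-in parse_str(). The query string is a made-up example (an assumption, not taken from this README), and the snippet does not use this package.

```php
<?php
// Hypothetical query string, chosen only for illustration.
$query = 'bar=baz&key=value';

// parse_str() writes the decoded key/value pairs into $params.
parse_str($query, $params);

var_dump($params);
```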

PSR-7 UriInterface methods

The component methods of the Url class combine getting and setting a component in a single method and therefore have short names (->scheme() instead of ->getScheme()). To be compatible with other libraries, the class also implements the PSR-7 UriInterface and provides these methods:

$url = '';
$url = \Crwlr\Url\Url::parse($url);
var_dump($url->getScheme());        // => 'https'
var_dump($url->getAuthority());     // => ''
var_dump($url->getUserInfo());      // => 'user:password'
var_dump($url->getHost());          // => ''
var_dump($url->getPort());          // => 1234
var_dump($url->getPath());          // => '/foo/bar'
var_dump($url->getQuery());         // => 'some=query'
var_dump($url->getFragment());      // => 'fragment'

var_dump($url->withScheme('http')->getScheme());        // => 'http'
var_dump($url->withUserInfo('u', 'p')->getUserInfo());  // => 'u:p'
var_dump($url->withHost('')->getHost());     // => ''
var_dump($url->withPort(666)->getPort());               // => 666
var_dump($url->withPath('/path')->getPath());           // => '/path'
var_dump($url->withQuery('foo=bar')->getQuery());       // => 'foo=bar'
var_dump($url->withFragment('baz')->getFragment());     // => 'baz'
var_dump($url->__toString()); // => ''

Modifying urls

All methods that are used to get a component's value can also be used to replace or set a value. So for example if you have an array of urls and you want to be sure that they are all on https, you can achieve that like this:

$urls = [
foreach ($urls as $key => $url) {
    $urls[$key] = Url::parse($url)->scheme('https')->toString();
}

array(4) {
  string(24) ""
  string(33) ""
  string(30) ""
  string(27) ""

Another example: most websites can be reached with or without the www subdomain. If you have an array of urls and want to ensure that they all point to the version with www:

$urls = [
$urls = array_map(function($url) {
    return Url::parse($url)->host('')->toString();
}, $urls);


array(4) {
  string(29) ""
  string(28) ""
  string(32) ""
  string(31) ""

The same works for all components listed under "Available url components". For the query string you can also provide an array:

$url = Url::parse('');
$url->queryArray(['param' => 'value', 'marco' => 'polo']);
echo $url;


By the way: as the example above shows, you can use a Url object like a string thanks to its __toString() method.
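
As a rough illustration of how such an array maps to a query string, this sketch uses PHP's built-in http_build_query(). Whether the package encodes the array in exactly the same way is an assumption.

```php
<?php
// Same array as in the example above, serialized with the PHP built-in.
$query = http_build_query(['param' => 'value', 'marco' => 'polo']);

echo $query, "\n";   // param=value&marco=polo
```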

Resolving relative urls

When you scrape urls from a website you will come across relative urls like /path/to/page, ../path/to/page, ?param=value, #anchor and the like. This package makes it a breeze to resolve them to absolute urls, using the url of the page on which they were found.

$documentUrl = Url::parse('');
$relativeLinks = [
$absoluteLinks = array_map(function($relativeLink) use ($documentUrl) {
    return $documentUrl->resolve($relativeLink)->toString();
}, $relativeLinks);



array(4) {
  string(36) ""
  string(40) ""
  string(47) ""
  string(42) ""

If you pass an absolute url to resolve() it will just return that absolute url.
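
To show what resolving involves, here is a simplified sketch in plain PHP, loosely following RFC 3986, section 5.2. It is not this package's implementation: the function names removeDotSegments and resolveSketch, the document url and the links are hypothetical, and cases such as user info in the base url or scheme-relative references are ignored.

```php
<?php
// Simplified sketch of relative reference resolution (RFC 3986, section 5.2).
// NOT this package's implementation: user info in the base url and
// scheme-relative references ("//host/path") are ignored.

function removeDotSegments(string $path): string
{
    $segments = explode('/', $path);
    $output = [];
    foreach ($segments as $segment) {
        if ($segment === '..') {
            // Go up one level, but never above the root
            // ($output[0] is the empty string before the leading slash).
            if (count($output) > 1) {
                array_pop($output);
            }
        } elseif ($segment !== '.') {
            $output[] = $segment;
        }
    }
    // A trailing "." or ".." refers to a directory, so keep a trailing slash.
    if (in_array(end($segments), ['.', '..'], true)) {
        $output[] = '';
    }
    return implode('/', $output);
}

function resolveSketch(string $base, string $relative): string
{
    // A reference that starts with a scheme is already absolute.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $relative) === 1) {
        return $relative;
    }

    $b = parse_url($base);
    $root = $b['scheme'] . '://' . $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';
    $baseQuery = isset($b['query']) ? '?' . $b['query'] : '';

    // Split the reference into its path and the "?query#fragment" remainder.
    $pathLength = strcspn($relative, '?#');
    $path = substr($relative, 0, $pathLength);
    $rest = substr($relative, $pathLength);

    if ($path === '') {
        // Only a query and/or fragment: keep the base path. A reference
        // without its own query also keeps the base query.
        $keepQuery = ($rest === '' || $rest[0] === '#') ? $baseQuery : '';
        return $root . $basePath . $keepQuery . $rest;
    }
    if ($path[0] !== '/') {
        // Merge: replace everything after the last slash of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $path;
    }
    return $root . removeDotSegments($path) . $rest;
}

// Hypothetical document url and links, chosen only for illustration.
$base = 'https://www.example.com/foo/bar/baz';
foreach (['/path/to/page', '../path/to/page', '?param=value', '#anchor'] as $link) {
    echo resolveSketch($base, $link), "\n";
}
```

The sketch handles the four kinds of relative urls mentioned above and returns absolute urls unchanged, which mirrors the behaviour described for resolve().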

Comparing url components

Comparing components of two different urls is easy:

$url1 = Url::parse('');
$url2 = Url::parse('');
if ($url1->compare($url2, 'host')) {
    echo "Urls 1 and 2 ARE on the same host.\n";
} else {
    echo "Urls 1 and 2 ARE NOT on the same host.\n";
}
if ($url1->compare($url2, 'subdomain')) {
    echo "Urls 1 and 2 ARE on the same subdomain.\n";
} else {
    echo "Urls 1 and 2 ARE NOT on the same subdomain.\n";
}
if ($url1->compare($url2, 'query')) {
    echo "Urls 1 and 2 HAVE the same query.\n";
} else {
    echo "Urls 1 and 2 DO NOT HAVE the same query.\n";
}


Urls 1 and 2 ARE NOT on the same host.
Urls 1 and 2 ARE on the same subdomain.
Urls 1 and 2 DO NOT HAVE the same query.

And again, this can be done with all components listed under the available url components. Instead of a Url object ($url2 in the example above) you can also just provide a url as a string.

$url1 = Url::parse('');
$url2 = '';
if ($url1->compare($url2, 'path')) {
    echo "Urls 1 and 2 HAVE the same path.\n";
} else {
    echo "Urls 1 and 2 DO NOT HAVE the same path.\n";
}


Urls 1 and 2 HAVE the same path.
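
For simple components, a comparison boils down to parsing both urls and comparing the same part. The sketch below does that with PHP's built-in parse_url(); the helper name sameComponent and the urls are hypothetical. Components like domain or subdomain can't be compared this way, because parse_url() knows nothing about the Public Suffix List.

```php
<?php
// Hypothetical helper: true if both urls have an identical value for the
// given component (one of PHP's PHP_URL_* constants).
function sameComponent(string $url1, string $url2, int $component): bool
{
    return parse_url($url1, $component) === parse_url($url2, $component);
}

$a = 'https://www.example.com/foo';
$b = 'https://www.example.com/bar';

var_dump(sameComponent($a, $b, PHP_URL_HOST));   // bool(true)
var_dump(sameComponent($a, $b, PHP_URL_PATH));   // bool(false)
```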

Internationalized domain names (IDN)

echo Url::parse('https://www.пример.онлайн/hello/world')->toString();



Behind the scenes, true/punycode is used to parse internationalized domain names.

Updating Mozilla's Public Suffix List

Mozilla's Public Suffix List is parsed and stored in a file in this package to be able to extract the domain suffix from a url's host component. It should be updated with every new release of this package. If you need to get the latest version of the list immediately, because a particular new suffix isn't included in the list in this repository, you can update it using the following composer command:

composer update-suffixes

Note: Please don't overuse this, as Mozilla states on their page:

If you wish to make your app download an updated list periodically, please use this URL and have your app download the list no more than once per day. (The list usually changes a few times per week; more frequent downloading is pointless and hammers our servers.)
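
One way to respect that limit is to check the age of a local copy before triggering an update. The following sketch is an assumption about how an application might do this; the helper name suffixListIsStale and the file path are hypothetical and not part of this package.

```php
<?php
// Hypothetical helper: true if the cached file is missing or older than
// $maxAgeSeconds (default: one day).
function suffixListIsStale(string $file, int $maxAgeSeconds = 86400): bool
{
    // Make sure we read the current modification time, not a cached one.
    clearstatcache(true, $file);
    if (!is_file($file)) {
        return true;
    }
    return (time() - filemtime($file)) > $maxAgeSeconds;
}

// Hypothetical location of a locally cached copy of the list.
$cacheFile = sys_get_temp_dir() . '/public_suffix_list.dat';

if (suffixListIsStale($cacheFile)) {
    echo "List is missing or older than a day: an update is reasonable.\n";
} else {
    echo "List was refreshed within the last day: skip the update.\n";
}
```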