johnroyer / url-normalizer
Syntax-based normalization of URLs
2.1.1
2024-02-07 04:49 UTC
Requires
- php: >=8.0
- ext-mbstring: *
Requires (Dev)
- phpunit/phpunit: ^9.6.0
- squizlabs/php_codesniffer: ^3.7
This package is auto-updated.
Last update: 2025-03-07 07:15:52 UTC
README
This URL normalizer is a fork of glenscott/url-normalizer with some changes:
- upgrade PHPUnit to v9.x
- add removal of tracking parameters (utm_source, fbclid, etc.) as an option
Syntax-based normalization of URIs
This package normalizes URIs according to RFC 3986: https://tools.ietf.org/html/rfc3986
Example usage
    require_once 'vendor/autoload.php';

    $url = 'eXAMPLE://a/./b/../b/%63/%7bfoo%7d';
    $un = new URL\Normalizer( $url );
    echo $un->normalize();

    // Result: 'example://a/b/c/%7Bfoo%7D'
The normalization process preserves semantics
For example, the two URLs in each of the following pairs are equivalent:
- HTTP://www.Example.com/ and http://www.example.com/
- http://www.example.com/a%c2%b1b and http://www.example.com/a%C2%B1b
- http://www.example.com/%7Eusername/ and http://www.example.com/~username/
- http://www.example.com and http://www.example.com/
- http://www.example.com:80/bar.html and http://www.example.com/bar.html
- http://www.example.com/../a/b/../c/./d.html and http://www.example.com/a/c/d.html
- http://www.example.com/?array[key]=value and http://www.example.com/?array%5Bkey%5D=value
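Three of the pairs above (case, default port, empty path) can be reproduced with a small self-contained sketch. The function name sketchNormalizeSchemeHostPort and the default-port table are assumptions made for illustration; this is not the package's own code, and it ignores userinfo for brevity.

```php
<?php
// Hypothetical helper, not part of url-normalizer.
// Lower-cases scheme and host, drops the scheme's default port,
// and uses "/" when the path is empty. Userinfo is ignored for brevity.
function sketchNormalizeSchemeHostPort(string $url): string
{
    // Assumed table of default ports; extend as needed.
    $defaultPorts = ['http' => 80, 'https' => 443, 'ftp' => 21];

    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    // Keep the port only when it differs from the scheme's default.
    $port = '';
    if (isset($parts['port']) && ($defaultPorts[$scheme] ?? null) !== $parts['port']) {
        $port = ':' . $parts['port'];
    }

    $path = $parts['path'] ?? '';
    if ($path === '') {
        $path = '/';
    }

    $query    = isset($parts['query']) ? '?' . $parts['query'] : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $scheme . '://' . $host . $port . $path . $query . $fragment;
}

echo sketchNormalizeSchemeHostPort('HTTP://www.Example.com:80/bar.html'), "\n";
```

Two URLs can then be compared by normalizing both and checking the results for string equality.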
Normalizations performed
- Converting the scheme and host to lower case
- Capitalizing letters in escape sequences
- Decoding percent-encoded octets of unreserved characters
- Adding trailing /
- Removing the default port
- Removing dot-segments
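As a rough illustration of two of the steps listed above, the sketch below normalizes percent-encoding and removes dot-segments following the algorithm in RFC 3986, section 5.2.4. The function names are hypothetical and the code is written for this README only; the package's actual implementation may differ.

```php
<?php
// Illustrative only: hypothetical function names, not the package's code.

// Upper-case the hex digits of escapes and decode escapes of unreserved
// characters (ALPHA / DIGIT / "-" / "." / "_" / "~", RFC 3986 section 2.3).
function sketchNormalizePercentEncoding(string $s): string
{
    return preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',
        static function (array $m): string {
            $char = chr(hexdec($m[1]));
            if (preg_match('/^[A-Za-z0-9\-._~]$/', $char) === 1) {
                return $char;              // unreserved: store decoded
            }
            return '%' . strtoupper($m[1]); // otherwise keep escaped, upper-case
        },
        $s
    ) ?? $s;
}

// Remove "." and ".." segments from a path (RFC 3986 section 5.2.4).
function sketchRemoveDotSegments(string $path): string
{
    $input  = $path;
    $output = '';

    // Drop the last segment of $output together with its leading "/".
    $dropLast = static function (string $out): string {
        $pos = strrpos($out, '/');
        return $pos === false ? '' : substr($out, 0, $pos);
    };

    while ($input !== '') {
        if (strncmp($input, '../', 3) === 0) {
            $input = substr($input, 3);
        } elseif (strncmp($input, './', 2) === 0) {
            $input = substr($input, 2);
        } elseif (strncmp($input, '/./', 3) === 0) {
            $input = substr($input, 2);          // "/./x" becomes "/x"
        } elseif ($input === '/.') {
            $input = '/';
        } elseif (strncmp($input, '/../', 4) === 0) {
            $input  = substr($input, 3);         // "/../x" becomes "/x"
            $output = $dropLast($output);
        } elseif ($input === '/..') {
            $input  = '/';
            $output = $dropLast($output);
        } elseif ($input === '.' || $input === '..') {
            $input = '';
        } else {
            // Move the first segment (with its leading "/", if any) to the output.
            $next = strpos($input, '/', 1);
            if ($next === false) {
                $output .= $input;
                $input = '';
            } else {
                $output .= substr($input, 0, $next);
                $input = substr($input, $next);
            }
        }
    }

    return $output;
}

echo sketchNormalizePercentEncoding(sketchRemoveDotSegments('/a/./b/../b/%63/%7bfoo%7d')), "\n";
```

Dot-segments are removed before percent-decoding here so that an escaped dot such as %2E is never treated as a path segment; that ordering is a choice made for this sketch.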
For more information about these normalizations, please see the following Wikipedia article:
http://en.wikipedia.org/wiki/URL_normalization#Normalizations_that_Preserve_Semantics
For license information, please see the LICENSE file.
Options
The following options are available when normalizing URLs and are disabled by default:
- Remove empty delimiters. Enabling this option would normalize http://www.example.com/? to http://www.example.com/. Currently, only the query string delimiter (?) is supported by this option.
- Sort query parameters. Enabling this option sorts the query parameters alphabetically by key. For example, http://www.example.com/?c=3&b=2&a=1 becomes http://www.example.com/?a=1&b=2&c=3
- Remove tracking parameters. For example, https://example.com/?fbclid=xxxxx becomes https://example.com/?
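The effect of these options can be sketched in a few lines of plain PHP. Everything named here (sketchApplyQueryOptions, the option keys, and the short tracking-key list) is an assumption made for illustration and does not describe how the package exposes or implements its options.

```php
<?php
// Hypothetical sketch, not the package's API. Option keys are assumptions:
// 'removeTracking', 'sort', 'removeEmptyDelimiter'.
function sketchApplyQueryOptions(string $url, array $options = []): string
{
    // Assumed tracking keys, taken from the examples in this README.
    $trackingKeys = ['utm_source', 'fbclid'];

    // Split off the fragment first, then the query string.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }
    $queryPos = strpos($url, '?');
    if ($queryPos === false) {
        return $url . $fragment; // no query delimiter, nothing to do
    }
    $base  = substr($url, 0, $queryPos);
    $query = substr($url, $queryPos + 1);

    $pairs = $query === '' ? [] : explode('&', $query);

    if (!empty($options['removeTracking'])) {
        $pairs = array_values(array_filter(
            $pairs,
            static function (string $pair) use ($trackingKeys): bool {
                $key = explode('=', $pair, 2)[0];
                return !in_array($key, $trackingKeys, true);
            }
        ));
    }

    if (!empty($options['sort'])) {
        // Sort by key only; usort is stable on PHP >= 8.0.
        usort($pairs, static function (string $a, string $b): int {
            return strcmp(explode('=', $a, 2)[0], explode('=', $b, 2)[0]);
        });
    }

    $query = implode('&', $pairs);
    if ($query === '' && !empty($options['removeEmptyDelimiter'])) {
        return $base . $fragment; // drop the dangling "?"
    }

    return $base . '?' . $query . $fragment;
}

echo sketchApplyQueryOptions('http://www.example.com/?c=3&b=2&a=1', ['sort' => true]), "\n";
```

In this sketch, removing the only parameter leaves a bare ?, matching the fbclid example above; combining it with the empty-delimiter option removes the ? as well.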