youthweb/urllinker

Autolink URLs in text or html

2.0.0 2022-12-14 13:25 UTC

This package is auto-updated.

Last update: 2024-11-14 17:44:11 UTC


README

UrlLinker converts any web addresses in plain text into HTML hyperlinks.

This is a maintained fork of the great work of Kwi\UrlLinker, formerly on Bitbucket.

Install

Via Composer

$ composer require youthweb/urllinker

Usage

$urlLinker = new Youthweb\UrlLinker\UrlLinker();

$linkedText = $urlLinker->linkUrlsAndEscapeHtml($text);

$linkedText = $urlLinker->linkUrlsInTrustedHtml($html);
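
The two calls above differ in what they expect as input: linkUrlsAndEscapeHtml() is meant for plain text, linkUrlsInTrustedHtml() for markup that is already valid HTML. A minimal sketch of the first case follows. The helper name renderComment(), the autoloader path, and the fallback to plain escaping are assumptions made for illustration and are not part of the library.

```php
<?php
declare(strict_types=1);

// Assumption: Composer's autoloader sits in ./vendor next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

/**
 * Hypothetical helper: turns an untrusted comment into HTML.
 * Links URLs when UrlLinker is installed, otherwise only escapes the text.
 */
function renderComment(string $text): string
{
    if (class_exists(\Youthweb\UrlLinker\UrlLinker::class)) {
        $urlLinker = new \Youthweb\UrlLinker\UrlLinker();

        // Plain-text input: HTML is escaped and URLs become links.
        return $urlLinker->linkUrlsAndEscapeHtml($text);
    }

    // Fallback for this sketch only: escape, but do not link anything.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo renderComment('<b>Hi</b>, see example.com/docs?a=1&b=2'), PHP_EOL;
```

Text that comes from users should go through linkUrlsAndEscapeHtml(); linkUrlsInTrustedHtml() is better reserved for markup you produced or sanitized yourself.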

You can optionally configure how URLs are parsed and linked by passing an options array to UrlLinker::__construct():

$config = [
    // FTP addresses like "ftp://example.com" will be allowed (default: false):
    'allowFtpAddresses' => true,

    // Uppercase URL schemes like "HTTP://example.com" will be allowed:
    'allowUpperCaseUrlSchemes' => true,

    // Add a Closure to modify the way the urls will be linked:
    'htmlLinkCreator' => function(string $url, string $content): string
    {
        return '<a href="' . $url . '" target="_blank">' . $content . '</a>';
    },

    // ... or add a callable as a Closure to modify the way the urls will be linked:
    'htmlLinkCreator' => [$class, 'linkCreator'](...),

    // Add a Closure to modify the way the emails will be linked:
    'emailLinkCreator' => function(string $email, string $content): string
    {
        return '<a href="mailto:' . $email . '" class="email">' . $content . '</a>';
    },

    // ... or add a callable as a Closure to modify the way the emails will be linked:
    'emailLinkCreator' => \Closure::fromCallable('callableFunction'),

    // ... or you can also disable the links for email with a closure:
    'emailLinkCreator' => fn (string $email, string $content): string => $email,

    // You can customize the recognizable Top Level Domains:
    'validTlds' => ['.localhost' => true],
];

$urlLinker = new Youthweb\UrlLinker\UrlLinker($config);
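
The listing above repeats the htmlLinkCreator and emailLinkCreator keys to show alternatives. In a real PHP array each key can appear only once, and a later duplicate overwrites the earlier one. Below is a sketch of a single, complete htmlLinkCreator. The helper makeExternalLinkCreator() is hypothetical. The sketch also assumes the closure receives unescaped values and must escape them itself; check this against the library source, because escaping here would double-escape if the library already did it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of UrlLinker): builds a link creator that
 * opens links in a new tab and escapes both arguments itself.
 */
function makeExternalLinkCreator(): \Closure
{
    return static function (string $url, string $content): string {
        $flags = ENT_QUOTES | ENT_SUBSTITUTE;

        return sprintf(
            '<a href="%s" target="_blank" rel="noopener noreferrer">%s</a>',
            htmlspecialchars($url, $flags, 'UTF-8'),
            htmlspecialchars($content, $flags, 'UTF-8')
        );
    };
}

// One entry per option key; the option names are taken from the listing above.
$config = [
    'allowFtpAddresses' => true,
    'htmlLinkCreator'   => makeExternalLinkCreator(),
];

// Calling the closure directly shows what it produces for one URL.
echo $config['htmlLinkCreator']('http://example.com/?a=1&b=2', 'example.com'), PHP_EOL;
```

The resulting $config array can be passed to the constructor in the same way as the example above.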

Recognized addresses

  • Web addresses
    • Recognized URL schemes: "http" and "https"
      • The http:// prefix is optional.
      • The "ftp" scheme is also recognized when allowFtpAddresses is set to true.
      • The scheme must be written in lower case. This requirement can be lifted by setting allowUpperCaseUrlSchemes to true.
    • Hosts may be specified using domain names or IPv4 addresses.
      • IPv6 addresses are not supported.
    • Port numbers are allowed.
    • Internationalized Resource Identifiers (IRIs) are allowed. Note that the job of converting IRIs to URIs is left to the user's browser.
    • To reduce false positives, UrlLinker verifies that the top-level domain is on the official IANA list of valid TLDs.
      • UrlLinker is updated from time to time as the TLD list is expanded.
      • In the future, this approach may collapse under ICANN's ill-advised new policy of selling arbitrary TLDs for large amounts of cash, but for now it is an effective method of rejecting invalid URLs.
      • Internationalized top-level domain names must be written in Punycode in order to be recognized.
      • If you want to support only specific TLDs, you can list them in validTlds, e.g. ['.com' => true, '.net' => true].
      • If you need to support unqualified domain names, such as localhost, you can also set them with ['.localhost' => true] in validTlds.
  • Email addresses
    • Supports the full range of commonly used address formats, including "plus addresses" (as popularized by Gmail).
    • Does not recognize the more obscure address variants that are allowed by the RFCs but never seen in practice.
    • Simplistic spam protection: The at-sign is converted to an HTML entity, foiling naive email address harvesters.
    • If you don't want to link emails, set emailLinkCreator to a closure that simply returns the raw email: function($email, $content) { return $email; }.
  • Addresses are recognized correctly in normal sentence contexts. For instance, in "Visit stackoverflow.com.", the final period is not part of the URL.
  • User input is properly sanitized to prevent cross-site scripting (XSS), and ampersands in URLs are correctly escaped as &amp; (this does not apply to the linkUrlsInTrustedHtml() function, which assumes its input to be valid HTML).
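
The TLD and email points in the list above can be combined into one configuration. The following is a sketch, not the library's own implementation. The creator name is hypothetical. The README does not say whether a custom emailLinkCreator still gets the built-in at-sign obfuscation, so the sketch applies it explicitly, using the numeric entity &#64;.

```php
<?php
declare(strict_types=1);

// Hypothetical creator: links the address and hides "@" behind a numeric entity.
$obfuscatingEmailLinkCreator = static function (string $email, string $content): string {
    $flags = ENT_QUOTES | ENT_SUBSTITUTE;

    $link = sprintf(
        '<a href="mailto:%s">%s</a>',
        htmlspecialchars($email, $flags, 'UTF-8'),
        htmlspecialchars($content, $flags, 'UTF-8')
    );

    // "&#64;" is the numeric character reference for the at-sign.
    return str_replace('@', '&#64;', $link);
};

$config = [
    // Only these TLDs are accepted; ".localhost" covers the unqualified host name.
    'validTlds'        => ['.com' => true, '.net' => true, '.localhost' => true],
    'emailLinkCreator' => $obfuscatingEmailLinkCreator,
];

// Calling the closure directly shows the markup for a "plus address".
echo $config['emailLinkCreator']('jane+news@example.com', 'jane+news@example.com'), PHP_EOL;
```

Browsers render the entity as a normal at-sign, so the link still works for visitors while the raw "@" no longer appears in the page source.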

Changelog

Please see CHANGELOG for more information on what has changed recently.

Tests

Unit tests are written using PHPUnit.

$ phpunit

Contributing

Please feel free to submit bug reports, or to fork the project and send pull requests. This project follows Semantic Versioning 2 and PSR-2.

License

GPL3. Please see License File for more information.