dilneiss/purify

An HTML Purifier / Sanitizer for Laravel

v4.0.1 2022-02-09 01:05 UTC

Last update: 2024-04-09 06:18:14 UTC


Purify is a Laravel wrapper around HTMLPurifier by ezyang.

Requirements

  • PHP >= 7.1
  • Laravel >= 5.5

Installation

To install Purify, run the following in the root of your project:

composer require stevebauman/purify

Then, publish the configuration file using:

php artisan vendor:publish --provider="Stevebauman\Purify\PurifyServiceProvider"

If you are using Lumen, you should copy the config file purify.php by hand, and add this line to your bootstrap/app.php:

$app->register(Stevebauman\Purify\PurifyServiceProvider::class);

Usage

Cleaning a String

To clean a user's input, use the clean method:

$input = '<script>alert("Harmful Script");</script> <p style="border:1px solid black" class="text-gray-700">Test</p>';

// Returns '<p class="text-gray-700">Test</p>'
$cleaned = Purify::clean($input);

Cleaning an Array

Need to purify an array of user input? Just pass in an array:

$array = [
    '<script>alert("Harmful Script");</script> <p style="border:1px solid black" class="text-gray-700">Test</p>',
    '<script>alert("Harmful Script");</script> <p style="border:1px solid black" class="text-gray-700">Test</p>',
];

$cleaned = Purify::clean($array);

// array [
//  '<p class="text-gray-700">Test</p>',
//  '<p class="text-gray-700">Test</p>',
// ]
var_dump($cleaned);

Dynamic Configuration

Need a different configuration for a single input? Pass a configuration array as the second parameter:

$config = ['HTML.Allowed' => 'div,b,a[href]'];

$cleaned = Purify::clean($input, $config);

Note: Configuration passed into the second parameter is not merged with your current configuration.
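
Because the array passed as the second parameter replaces your published settings instead of extending them, one way to keep your defaults is to merge the two arrays by hand before calling clean(). The sketch below is only an illustration: mergePurifySettings() is a hypothetical helper, not part of Purify, and the $base array stands in for whatever your config file holds (assumed here to be available through config('purify.settings')).

```php
<?php

/**
 * Hypothetical helper (not part of Purify): overlay per-call overrides
 * on top of a base settings array. Keys in $overrides win.
 */
function mergePurifySettings(array $base, array $overrides): array
{
    return array_merge($base, $overrides);
}

// Stand-in for your published settings. In a Laravel app this might be
// config('purify.settings') (assumed key name).
$base = [
    'Core.Encoding' => 'utf-8',
    'HTML.Allowed'  => 'p,b,i',
];

$merged = mergePurifySettings($base, ['HTML.Allowed' => 'div,b,a[href]']);

// In a Laravel app you would then call:
// $cleaned = Purify::clean($input, $merged);
```

Here $merged keeps 'Core.Encoding' from the base array while 'HTML.Allowed' takes the per-call value.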

Replacing the HTML Purifier instance

Need to replace the HTML Purifier instance with your own? Call the setPurifier() method:

$purifier = new HTMLPurifier();

Purify::setPurifier($purifier);
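
A bare new HTMLPurifier() uses HTML Purifier's defaults, so in practice you will probably want to hand it a configuration object. The following is a minimal sketch, assuming a Laravel application where storage/app/purify exists and is writable; the directive values are illustrative only.

```php
<?php

use Stevebauman\Purify\Facades\Purify;

// Start from HTML Purifier's default configuration.
$config = HTMLPurifier_Config::createDefault();

// Illustrative values: allow a small set of tags and point the
// definition cache at a writable directory (assumed to exist).
$config->set('HTML.Allowed', 'p,b,a[href]');
$config->set('Cache.SerializerPath', storage_path('app/purify'));

// Swap the instance that Purify::clean() will use from now on.
Purify::setPurifier(new HTMLPurifier($config));
```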

Practices

If you're looking into sanitization, you most likely want to sanitize user-submitted HTML that is stored in your database and later rendered in your application.

In this scenario, it's generally best practice to sanitize on the way out instead of on the way in. Remember, the database doesn't care what text it contains.

This way you can allow anything to be inserted into the database and apply strong sanitization rules on the way out.

This helps tremendously if you change your sanitization requirements later: all rendered content will automatically follow the new rules.
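
One way to sanitize on the way out is an Eloquent accessor that purifies at read time. This is a sketch, not something the package prescribes: the Post model, its body column, and the safe_body attribute name are all hypothetical.

```php
<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Stevebauman\Purify\Facades\Purify;

// Hypothetical model with a raw, unsanitized `body` column.
class Post extends Model
{
    /**
     * Virtual `safe_body` attribute: purified every time it is read,
     * while the stored `body` value is left untouched.
     */
    public function getSafeBodyAttribute(): string
    {
        return Purify::clean($this->body ?? '');
    }
}
```

A Blade view could then render {!! $post->safe_body !!}, and any later change to your rules applies to existing rows without a data migration.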

Configuration

Inside the configuration file, the entire settings array is passed directly to the HTML Purifier configuration, so feel free to customize it however you wish. For the configuration documentation, please visit the HTML Purifier Website:

http://htmlpurifier.org/live/configdoc/plain.html
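
As a rough illustration, a published config/purify.php might look like the sketch below. The 'settings' key name and all values are assumptions made for this example; check them against your own published file. The directive names themselves are standard HTML Purifier directives.

```php
<?php

// config/purify.php (sketch; key name and values are assumptions)
return [
    'settings' => [
        // Character encoding of the input and output.
        'Core.Encoding' => 'utf-8',

        // Where HTML Purifier caches its serialized definitions.
        'Cache.SerializerPath' => storage_path('app/purify'),

        // Doctype the output should conform to.
        'HTML.Doctype' => 'XHTML 1.0 Strict',

        // Whitelist of elements and attributes.
        'HTML.Allowed' => 'p,b,strong,i,em,a[href|title],ul,ol,li,br',

        // Keep empty elements instead of stripping them.
        'AutoFormat.RemoveEmpty' => false,
    ],
];
```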

Custom Configuration Rules

There are multiple ways of creating custom rules on the HTML Purifier instance.

Below is an example service provider you can use as a starting point to add rules to the instance. This provider gives compatibility with Basecamp's Trix WYSIWYG editor:

Credit to Antonio Primera for resolving some HTML Purifier configuration issues with Trix.

<?php

namespace App\Providers;

use HTMLPurifier_HTMLDefinition;
use Stevebauman\Purify\Facades\Purify;
use Illuminate\Support\ServiceProvider;

class PurifySetupProvider extends ServiceProvider
{
    const DEFINITION_ID = 'trix-editor';
    const DEFINITION_REV = 1;

    /**
     * Bootstrap the application services.
     *
     * @return void
     */
    public function boot()
    {
        /** @var \HTMLPurifier $purifier */
        $purifier = Purify::getPurifier();

        /** @var \HTMLPurifier_Config $config */
        $config = $purifier->config;

        $config->set('HTML.DefinitionID', static::DEFINITION_ID);
        $config->set('HTML.DefinitionRev', static::DEFINITION_REV);

        if ($def = $config->maybeGetRawHTMLDefinition()) {
            $this->setupDefinitions($def);
        }

        $purifier->config = $config;
    }

    /**
     * Register the application services.
     *
     * @return void
     */
    public function register()
    {
        //
    }

    /**
     * Adds elements and attributes to the HTML purifier
     * definition required by the trix editor.
     *
     * @param HTMLPurifier_HTMLDefinition $def
     */
    protected function setupDefinitions(HTMLPurifier_HTMLDefinition $def)
    {
        $def->addElement('figure', 'Inline', 'Inline', 'Common');
        $def->addAttribute('figure', 'class', 'Text');

        $def->addElement('figcaption', 'Inline', 'Inline', 'Common');
        $def->addAttribute('figcaption', 'class', 'Text');
        $def->addAttribute('figcaption', 'data-trix-placeholder', 'Text');

        $def->addAttribute('a', 'rel', 'Text');
        $def->addAttribute('a', 'tabindex', 'Text');
        $def->addAttribute('a', 'contenteditable', 'Enum#true,false');
        $def->addAttribute('a', 'data-trix-attachment', 'Text');
        $def->addAttribute('a', 'data-trix-content-type', 'Text');
        $def->addAttribute('a', 'data-trix-id', 'Number');

        $def->addElement('span', 'Block', 'Flow', 'Common');
        $def->addAttribute('span', 'data-trix-cursor-target', 'Enum#right,left');
        $def->addAttribute('span', 'data-trix-serialize', 'Enum#true,false');

        $def->addAttribute('img', 'data-trix-mutable', 'Enum#true,false');
        $def->addAttribute('img', 'data-trix-store-key', 'Text');
    }
}

After this service provider is created, make sure you insert it into your providers array in the config/app.php file, and update your HTML.Allowed string in the config/purify.php file.

Note: Remember that after this definition is created and you have run Purify::clean(), the definition will be cached. To change the definition, you must either clear the cache from your storage/app/purify folder or change the definition revision number or ID so that it is re-cached.
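
If you prefer clearing the cache over bumping DEFINITION_REV, a small helper can do it. The function below is a hypothetical, framework-free sketch: it deletes every file under the directory you pass in and leaves the folders in place. In a Laravel app the directory would presumably be storage_path('app/purify').

```php
<?php

/**
 * Hypothetical helper: delete every file under the given cache
 * directory, keeping the directory tree itself.
 * Returns the number of files removed.
 */
function clearDefinitionCache(string $dir): int
{
    if (!is_dir($dir)) {
        return 0;
    }

    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );

    // Collect paths first so nothing is deleted while iterating.
    $paths = [];
    foreach ($files as $file) {
        if ($file->isFile()) {
            $paths[] = $file->getPathname();
        }
    }

    $removed = 0;
    foreach ($paths as $path) {
        if (unlink($path)) {
            $removed++;
        }
    }

    return $removed;
}
```

The next call to Purify::clean() should then rebuild the definition from your provider.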