html5/htmlreader

Html5 stream tokenizer/reader (not using libxml)

v2.0.1 2017-11-14 11:36 UTC

Last update: 2020-09-06 00:08:37 UTC


README

HTMLReader

HtmlReader is a very simple HTML parser that is NOT built on libxml. It is intended as a replacement for XMLReader, which does not parse HTML5 input properly. It is faster than DOM and does not change a single whitespace character.

It does not check whether elements are properly closed, so you can (and have to) handle that yourself.
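
Because the reader does not verify that elements are closed, one way to do it yourself is to keep a stack of open tag names. The following is only a minimal sketch under stated assumptions: `findUnbalancedTags()` and the `[type, tagName]` event format are hypothetical and not part of html5/htmlreader. It assumes your own handler has recorded each opening and closing tag as such an event.

```php
<?php
// Hypothetical helper, NOT part of html5/htmlreader.
// Takes a flat list of [type, tagName] events ('open' or 'close'),
// as your own handler might record them, and reports balance problems.
function findUnbalancedTags(array $events): array
{
    // HTML void elements never have a closing tag, so they are skipped.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'];
    $stack = [];     // names of currently open elements
    $problems = [];  // human-readable findings

    foreach ($events as [$type, $name]) {
        $name = strtolower($name);
        if (in_array($name, $void, true)) {
            continue;
        }
        if ($type === 'open') {
            $stack[] = $name;
        } elseif ($type === 'close') {
            if ($stack !== [] && end($stack) === $name) {
                array_pop($stack);  // matches the innermost open element
            } else {
                $problems[] = "unexpected </$name>";
            }
        }
    }

    // Whatever is left on the stack was never closed (innermost first).
    foreach (array_reverse($stack) as $name) {
        $problems[] = "unclosed <$name>";
    }
    return $problems;
}
```

An empty result means every non-void element was closed in the right order. This sketch deliberately ignores HTML5's optional end tags (for example `</p>` or `</li>`), so treat its findings as hints rather than as validation errors.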

Installation

Use Composer to install the package from Packagist:

composer require html5/htmlreader

Usage

$reader = new HtmlReader();
$reader->loadHtml("input.html");                // load the input from a file
// $reader->loadHtmlString("<html></html>");    // or load it from a string

$reader->setHandler(new HtmlCallback());        // <-- write your own HtmlCallback
$reader->parse();

Debugging

The package ships with a DebugHtmlCallback handler for debugging.

New in Version 1.1.0

  • Added support for namespaces

Credits

Written by Matthias Leuffen http://leuffen.de