zenthangplus / html-dom-parser
A simple HTML DOM parser written in PHP that lets you manipulate HTML easily with CSS selectors.
Requires
- php: >=5.6
Requires (Dev)
- phpunit/phpunit: ^5
This package is auto-updated.
Last update: 2025-07-14 08:16:48 UTC
README
A simple HTML DOM parser written in PHP that lets you manipulate HTML easily, using selectors just like CSS or jQuery.
This is a modern version of Simple HTML DOM. You can install it with Composer and import it into your project as a package.
Features
- Parse and modify HTML documents.
- Find tags (elements) in HTML with selectors, just like jQuery.
- Extract contents from HTML in a single line.
- Export elements or a specific node to a single file.
- Supports HTML documents with invalid structure.
Installation
You can use Composer to install this package in your project by running the following command:
composer require zenthangplus/html-dom-parser
The minimum PHP version requirement is 5.6. If you are using PHP < 5.6, please use the original version.
Usage
The following example shows basic usage of this package:
<?php
$dom = \HTMLDomParser\DomFactory::load('<div class="container"><div class="anchor"><a href="#">Test</a></div></div>');
$a = $dom->findOne('.container a');
echo $a->text();
// Output: Test
DOM
Dom is the root Node of the HTML document.
You can load the DOM from a string or from a file:
<?php $dom = \HTMLDomParser\DomFactory::load('<div>Test</div>');
<?php $dom = \HTMLDomParser\DomFactory::loadFile('document.html');
NODE
A Node is simply an HTML element represented as an object.
You can also load any Node (similar to Dom):
<?php $node = \HTMLDomParser\NodeFactory::load('<div><a href="#">Test</a></div>');
<?php $node = \HTMLDomParser\NodeFactory::loadFile('document.html');
Traversing the DOM
By using selectors like those in jQuery or CSS, you can easily traverse the Dom or even a single Node.
Example:
<?php
$dom = \HTMLDomParser\DomFactory::loadFile('document.html');
$dom->find('div');
$dom->find('#container');
$dom->find('#container .content ul>li a.external-link');
$dom->find('#container .content ul>li a[data-id=link-1]');
Like the Dom, a Node is also traversable:
<?php
$dom = \HTMLDomParser\DomFactory::loadFile('document.html');
$node = $dom->findOne('#container .content ul>li');
$anchorNode = $node->findOne('a.external-link');
// Even traverse in a separate Node
$node = \HTMLDomParser\NodeFactory::load('<ul class="list"><li>Item 1</li><li>Item 2</li></ul>');
$node->find('ul.list li');
List of supported selectors:
Selector example | Description |
---|---|
div | Find elements with the div tag |
#container | Find elements with the container id |
.wrapper | Find elements with the wrapper class |
[data-id] | Find elements with the data-id attribute |
[data-id=12] | Find elements with the attribute data-id=12 |
a[data-id=12] | Find anchor tags with the attribute data-id=12 |
*[class] | Find all elements with a class attribute |
a, img | Find all anchors and images |
a[title], img[title] | Find all anchors and images with the title attribute |
#container ul | A space between selectors finds nested elements |
#container>ul | A > between selectors finds direct children |
#container, #side | A , between selectors finds elements matching any of several selectors in one query |
#container div.content ul>li, #side div[role=main] ul li | You can combine selectors in one query |
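As an illustration of the table above, here is a minimal sketch that runs a few of the listed selector forms against a small document. The sample markup is hypothetical, and the code assumes that find() returns a collection that can be iterated with foreach and that the package was installed with Composer.

```php
<?php
// Assumes this script sits next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

// Hypothetical sample markup, used only for this illustration.
$html = <<<'HTML'
<div id="container">
  <div class="content">
    <ul>
      <li><a class="external-link" data-id="link-1" href="https://example.com">First</a></li>
      <li><a data-id="link-2" href="/local">Second</a></li>
    </ul>
  </div>
</div>
<div id="side"><ul><li>Aside</li></ul></div>
HTML;

$dom = \HTMLDomParser\DomFactory::load($html);

// Descendant selector: every <li> nested anywhere under #container.
$items = $dom->find('#container li');

// Child selector combined with an attribute filter.
$first = $dom->findOne('#container ul>li a[data-id=link-1]');

// Grouped selectors: matches from either branch in one query.
$lists = $dom->find('#container ul, #side ul');

// Assumption: find() returns something iterable with foreach.
foreach ($items as $item) {
    echo $item->text(), PHP_EOL;
}
echo $first->text(), PHP_EOL;
```

Using findOne() where a single match is expected avoids indexing into the result of find().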
List of functions you can use with the above selectors:
Specific find functions:
- getElementById(): Get an element by ID
- getElementByTagName(): Get an element by tag name
- getElementsByTagName(): Get elements by tag name
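The functions above could be called roughly as follows. This is only a sketch: the list gives the function names but not their signatures, so passing the ID or tag name as a single string argument is an assumption, as is the plural form returning an iterable collection.

```php
<?php
// Assumes this script sits next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

$dom = \HTMLDomParser\DomFactory::load(
    '<div id="container"><p>One</p><p>Two</p></div>'
);

// Assumption: each function takes the ID or tag name as a string.
$container = $dom->getElementById('container');
$firstParagraph = $dom->getElementByTagName('p');
$allParagraphs = $dom->getElementsByTagName('p');

echo $firstParagraph->text(), PHP_EOL;

// Assumption: the plural form returns an iterable collection of nodes.
foreach ($allParagraphs as $paragraph) {
    echo $paragraph->text(), PHP_EOL;
}
```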
Accessing the node's data:
- text(): Get the text contents
- getAttributes(): Get attributes
- getAttribute(): Get an attribute
- hasAttribute(): Check whether the element has an attribute
- hasChild(): Check whether the element has a child
- innerHtml(): Get inner HTML
- outerHtml(): Get outer HTML
- innerXml(): Get inner XML
- Get the node's HTML
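A short sketch of reading data from a node with these accessors. The attribute-name argument to hasAttribute() and getAttribute() is an assumption, since the list does not show signatures; the text() call mirrors the usage example earlier in this README.

```php
<?php
// Assumes this script sits next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

$dom = \HTMLDomParser\DomFactory::load(
    '<div class="anchor"><a href="#" title="Home">Test</a></div>'
);

$a = $dom->findOne('a');
echo $a->text(), PHP_EOL; // Test

// Assumption: both functions take the attribute name as a string.
if ($a->hasAttribute('title')) {
    echo $a->getAttribute('title'), PHP_EOL;
}

// Inner HTML excludes the element's own tag; outer HTML includes it.
$wrapper = $dom->findOne('div.anchor');
echo $wrapper->innerHtml(), PHP_EOL;
echo $wrapper->outerHtml(), PHP_EOL;
```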
Modifying the Node's data
- setAttribute(): Set an attribute
- removeAttribute(): Remove an attribute
- appendChild(): Append a child
- save(): Save the DOM or even a single node
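The modifying functions might be combined as in the sketch below. Every argument shown here is an assumption, because the list names the functions without their signatures: setAttribute() is assumed to take a name and a value, appendChild() a Node, and save() a destination file path. The output file name is hypothetical.

```php
<?php
// Assumes this script sits next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

$dom = \HTMLDomParser\DomFactory::load(
    '<div id="container"><a href="#" target="_blank">Test</a></div>'
);

$a = $dom->findOne('a');

// Assumption: setAttribute(name, value) and removeAttribute(name).
$a->setAttribute('href', 'https://example.com');
$a->removeAttribute('target');

// Assumption: appendChild() accepts a Node built by NodeFactory.
$container = $dom->findOne('#container');
$container->appendChild(\HTMLDomParser\NodeFactory::load('<span>New</span>'));

// Assumption: save() accepts a destination file path.
// The file name is hypothetical and used only for this illustration.
$target = sys_get_temp_dir() . '/html-dom-parser-example.html';
$dom->save($target);
```

Check the package source for the exact signatures before relying on this in production code.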