amsify42 / php-domfinder
A PHP package for searching the document object model (DOM) efficiently and in a more readable way.
Requires
- php: >=7.0.0
- amsify42/php-curl-http: dev-master
Requires (Dev)
- php: >=7.0.0
- amsify42/php-curl-http: dev-master
- phpunit/phpunit: ^7.0
This package is not auto-updated.
Last update: 2024-11-13 15:09:08 UTC
Installation
$ composer require amsify42/php-domfinder
Table of Contents
- Loading Source
- Important Notes
- Meta Tags
- Elements
- Element Class
- Element Id
- Element Attribute
- Regex Extraction
- Element Methods
- Multi Level Finder
1. Loading Source
File
$domFinder = new Amsify42\DOMFinder\DOMFinder('path/to/file.html');
// or
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->load('path/to/file.html');
HTML
$domFinder = new Amsify42\DOMFinder\DOMFinder('path/to/file.html', 'html');
// or
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML('path/to/file.html');
XML
$domFinder = new Amsify42\DOMFinder\DOMFinder('path/to/file.xml', 'xml');
// or
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadXML('path/to/file.xml');
URL
For HTML
$domFinder = new Amsify42\DOMFinder\DOMFinder('http://www.site.com/file.html', 'html', true);
// or
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML('http://www.site.com/file.html', true);
For XML
$domFinder = new Amsify42\DOMFinder\DOMFinder('http://www.site.com/file.xml', 'xml', true);
// or
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadXML('http://www.site.com/file.xml', true);
Using helper method
$domFinder = get_dom_finder('http://www.site.com/file.html', 'html', true);
Note: Make sure you pass true as the 3rd parameter to the constructor/helper method, or as the 2nd parameter to the load method, when loading content from a URL.
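For orientation, the file/HTML/XML split above resembles the loaders on PHP's built-in DOMDocument, which the package's Document class is stated to extend. The sketch below uses only built-in PHP. Whether DOMFinder delegates to these exact methods is an assumption, and the temporary file stands in for 'path/to/file.html'.

```php
<?php
// Plain-PHP sketch: none of this is php-domfinder code.
libxml_use_internal_errors(true); // collect parser warnings instead of printing them

// Create a throwaway HTML file to stand in for 'path/to/file.html'.
$path = tempnam(sys_get_temp_dir(), 'dom');
file_put_contents($path, '<html><body><p>From file</p></body></html>');

// Load HTML from a file path.
$fromFile = new DOMDocument();
$fromFile->loadHTMLFile($path);

// Load HTML from a string.
$fromString = new DOMDocument();
$fromString->loadHTML('<html><body><p>From string</p></body></html>');

// Load XML from a string; unlike the HTML loaders, loadXML() requires well-formed input.
$fromXml = new DOMDocument();
$fromXml->loadXML('<items><item>1</item></items>');

unlink($path); // remove the throwaway file

echo $fromFile->getElementsByTagName('p')->item(0)->textContent, PHP_EOL;
echo $fromString->getElementsByTagName('p')->item(0)->textContent, PHP_EOL;
echo $fromXml->getElementsByTagName('item')->length, PHP_EOL;
```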
2. Important Notes
1. DOMDocument
The Amsify42\DOMFinder\DOMFinder class uses Amsify42\DOMFinder\DOM\Document, which extends PHP's predefined DOMDocument class. You can use all the methods of DOMDocument through this instance:
$domFinder->dom();
Example:
$domFinder->dom()->getElementsByTagName('p');
2. DOMXPath
The Amsify42\DOMFinder\DOMFinder class uses PHP's predefined DOMXPath class for querying the document. To use the methods of DOMXPath directly, use this instance:
$domFinder->finder();
Example:
$domFinder->finder()->query("//div[@class='body-entry']");
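As a point of reference, here is how such a query behaves with PHP's built-in DOMDocument and DOMXPath on their own. The markup and variable names are made up for illustration. The sketch also shows why the leading slashes matter: a single "/" anchors the path at the document root, while "//" searches the whole document.

```php
<?php
// Plain-PHP sketch: uses only the built-in DOM extension, not php-domfinder.
libxml_use_internal_errors(true);

$html = '<html><body>'
      . '<div class="body-entry"><p>Hello</p></div>'
      . '<div class="other"></div>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// "/div" only matches a div that is the root element; here the root is <html>.
$rootOnly = $xpath->query("/div[@class='body-entry']");

// "//div" matches a div at any depth in the document.
$anywhere = $xpath->query("//div[@class='body-entry']");

echo $rootOnly->length, PHP_EOL; // 0
echo $anywhere->length, PHP_EOL; // 1
```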
3. DOMElement
All element results returned from a query are of type Amsify42\DOMFinder\DOM\Element, which extends PHP's predefined DOMElement class.
$anchors = $domFinder->find('a')->byClass('action-link')->all();
if($anchors->length) {
    foreach($anchors as $anchor) {
        // $anchor is of type Amsify42\DOMFinder\DOM\Element, which extends DOMElement
        var_dump($anchor);
    }
}
You can use all the methods of DOMElement on each element item.
Example:
foreach($anchors as $anchor) {
    $anchor->getAttribute('href');
}
Most importantly, whenever you fetch the first element, or a particular element by index, the result is either NULL or an element of type Amsify42\DOMFinder\DOM\Element.
Examples:
$para = $domFinder->getFirstElement('p');
// or
$para = $domFinder->getElement('p', 1);
// or
$para = $domFinder->findFirst('p');
// or
$para = $domFinder->find('p')->first();
// or
$para = $domFinder->find('p')->get(1);
3. Meta Tags
After the source has been loaded, you can use these meta tag methods.
$metaTags = $domFinder->metaTags();
To get a specific meta tag value:
<meta name="title" content="Amsify42">
$title = $domFinder->getMetaValue('name', 'title');
By default, the value is taken from the content attribute of the meta element. To read it from another attribute, pass that attribute's name as the 3rd parameter:
<meta name="title" myattr="Amsify42">
$title = $domFinder->getMetaValue('name', 'title', 'myattr');
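To make the three parameters concrete, the following sketch reproduces the described behaviour with built-in DOM classes only. The meta_value() helper is hypothetical and is not part of php-domfinder, and the second meta tag is invented sample data. How getMetaValue is actually implemented is not shown in this README.

```php
<?php
// Plain-PHP sketch: meta_value() is a hypothetical stand-in, not the package's code.
libxml_use_internal_errors(true);

$html = '<html><head>'
      . '<meta name="title" content="Amsify42">'
      . '<meta name="author" myattr="someone">'
      . '</head><body></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

/**
 * Find the first <meta> whose $attr equals $value and return the value of
 * its $from attribute, or null when no such tag or attribute exists.
 * The arguments are placed into the XPath string unescaped, which is fine
 * for an illustration but not for untrusted input.
 */
function meta_value(DOMXPath $xpath, string $attr, string $value, string $from = 'content')
{
    $meta = $xpath->query("//meta[@{$attr}='{$value}']")->item(0);
    if (!$meta instanceof DOMElement || !$meta->hasAttribute($from)) {
        return null;
    }
    return $meta->getAttribute($from);
}

var_dump(meta_value($xpath, 'name', 'title'));            // reads the content attribute
var_dump(meta_value($xpath, 'name', 'author', 'myattr')); // reads a custom attribute
var_dump(meta_value($xpath, 'name', 'missing'));          // no such tag
```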
4. Elements
To get specific elements from DOM
$paras = $domFinder->getElements('p');
To get first element
$para = $domFinder->getFirstElement('p');
To get the element by index position
$para = $domFinder->getElement('p', 1);
5. Element Class
Equals
Find all elements by class name
$elements = $domFinder->findByClass('section-items')->all();
Find the first element by class
$element = $domFinder->findByClass('section-items')->first();
// or
$element = $domFinder->findFirstByClass('section-items');
Find all div elements by class
$elements = $domFinder->find('div')->byClass('section-items')->all();
Find the first div element by class
$element = $domFinder->find('div')->byClass('section-items')->first();
To get an element by its key position
$element = $domFinder->find('div')->byClass('section-items')->get(1); // This will return the 2nd element
Like
Find all elements whose class contains the value
$elements = $domFinder->findClassLike('section-items')->all();
Find the first element whose class contains the value
$element = $domFinder->findClassLike('section-items')->first();
// or
$element = $domFinder->findFirstClassLike('section-items');
Find all div elements whose class contains the value
$divs = $domFinder->find('div')->classLike('section-items')->all();
Find the first div element whose class contains the value
$div = $domFinder->find('div')->classLike('section-items')->first();
To get an element by its key position
$div = $domFinder->find('div')->classLike('section-items')->get(1); // This will return the 2nd element
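One way to picture the difference between Equals and Like is with plain XPath: an exact attribute comparison versus contains(). Whether the package builds exactly these expressions is an assumption; the sketch uses only built-in DOM classes and invented markup. It also shows zero-based indexing, where item(1) is the second match, in the same way get(1) is described above.

```php
<?php
// Plain-PHP sketch of exact vs. substring class matching; not php-domfinder code.
libxml_use_internal_errors(true);

$html = '<html><body>'
      . '<div class="section-items">a</div>'
      . '<div class="section-items featured">b</div>'
      . '<div class="other">c</div>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// "Equals": the whole class attribute must be exactly this string.
$equals = $xpath->query("//div[@class='section-items']");

// "Like": the class attribute only needs to contain this string.
$like = $xpath->query("//div[contains(@class, 'section-items')]");

echo $equals->length, PHP_EOL;             // 1
echo $like->length, PHP_EOL;               // 2
echo $like->item(1)->textContent, PHP_EOL; // b (index 1 is the second match)
```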
6. Element Id
Equals
Find all elements by id
$elements = $domFinder->findById('body-entry')->all();
Find the first element by id
$element = $domFinder->findById('body-entry')->first();
// or
$element = $domFinder->findFirstById('body-entry');
Find all div elements by id
$divs = $domFinder->find('div')->byId('body-entry')->all();
Find the first div element by id
$div = $domFinder->find('div')->byId('body-entry')->first();
Like
Find all elements whose id contains the value
$elements = $domFinder->findIdLike('section-')->all();
Find the first element whose id contains the value
$element = $domFinder->findIdLike('section-')->first();
// or
$element = $domFinder->findFirstIdLike('section-');
Find all div elements whose id contains the value
$divs = $domFinder->find('div')->idLike('section-')->all();
Find the first div element whose id contains the value
$div = $domFinder->find('div')->idLike('section-')->first();
To get an element by its key position
$div = $domFinder->find('div')->idLike('section-')->get(1); // This will return the 2nd element
7. Element Attribute
Equals
Find all elements by attribute
$elements = $domFinder->findByAttr('data-section', 'paragraph')->all();
Find the first element by attribute
$element = $domFinder->findByAttr('data-section', 'paragraph')->first();
// or
$element = $domFinder->findFirstByAttr('data-section', 'paragraph');
Find all div elements by attribute
$divs = $domFinder->find('div')->byAttr('data-section', 'paragraph')->all();
Find the first div element by attribute
$div = $domFinder->find('div')->byAttr('data-section', 'paragraph')->first();
To get an element by its key position
$div = $domFinder->find('div')->byAttr('data-section', 'paragraph')->get(1); // This will return the 2nd element
Like
Find all elements whose attribute contains the value
$elements = $domFinder->findAttrLike('my-att', 'some-')->all();
Find the first element whose attribute contains the value
$element = $domFinder->findAttrLike('my-att', 'some-')->first();
// or
$element = $domFinder->findFirstAttrLike('my-att', 'some-');
Find all div elements whose attribute contains the value
$divs = $domFinder->find('div')->attrLike('my-att', 'some-')->all();
Find the first div element whose attribute contains the value
$div = $domFinder->find('div')->attrLike('my-att', 'some-')->first();
To get an element by its key position
$div = $domFinder->find('div')->attrLike('my-att', 'some-')->get(1); // This will return the 2nd element
8. Regex Extraction
To extract a particular item from HTML, consider this sample:
$html = '<div class="section"> <script>var data={"name": "my name", "id":12345};</script> </div>';
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML($html);
$section = $domFinder->findFirstByClass('section');
if($section) {
    // Here you will get the JS dictionary data
    $data = $section->extractByRegex("/data\=(.*?)\;</");
}
To extract multiple instances of data by regex, pass true as the 2nd parameter:
$html = '<div class="section">
    <some-element class="some-class">{"name": "name one", "id":1}</some-element>
    <some-element class="some-class">{"name": "name two", "id":2}</some-element>
    <some-element class="some-class">{"name": "name three", "id":3}</some-element>
</div>';
$domFinder = new Amsify42\DOMFinder\DOMFinder();
$domFinder->loadHTML($html);
$section = $domFinder->findFirstByClass('section');
if($section) {
    // Here you will get multiple JS dictionary data items as an array
    $data = $section->extractByRegex("/class=\"some-class\">(.*?)\<\//", true);
}
You can also pass multiple regexes as an array for multi-level checking and extraction:
$data = $section->extractByRegex(["/<some-element(.*?)some-element>/", "/class=\"some-class\">(.*?)\<\//"], true);
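To show what the captured strings look like and how they might be consumed, here is a sketch using PHP's own preg_match_all() and json_decode() on similar sample markup. It does not call extractByRegex, and it is an assumption that the package returns the first capture group in this form.

```php
<?php
// Plain-PHP sketch: preg_match_all() stands in for the multi-match extraction.
$html = '<div class="section">'
      . '<some-element class="some-class">{"name": "name one", "id":1}</some-element>'
      . '<some-element class="some-class">{"name": "name two", "id":2}</some-element>'
      . '</div>';

// Same idea as the pattern above; "#" is used as the delimiter so "/" needs no escaping.
preg_match_all('#class="some-class">(.*?)</#', $html, $matches);

// $matches[1] holds the first capture group of every match: the raw JSON strings.
$items = array_map(function ($json) {
    return json_decode($json, true); // decode each JSON string into an associative array
}, $matches[1]);

foreach ($items as $item) {
    echo $item['id'], ': ', $item['name'], PHP_EOL;
}
```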
9. Element Methods
These are the methods you can use at the element level.
<ul class="list-items">
    <li>Item one</li>
    <li>Item two</li>
    <li>Item three</li>
</ul>
$ul = $domFinder->getElement('ul');
// or
$ul = $domFinder->findFirst('ul');
To get the outer and inner HTML of an element, use these methods:
echo $ul->outerHTML();
The outer HTML prints:
<ul class="list-items">
    <li>Item one</li>
    <li>Item two</li>
    <li>Item three</li>
</ul>
echo $ul->innerHTML();
The inner HTML prints:
<li>Item one</li>
<li>Item two</li>
<li>Item three</li>
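For comparison, the same outer/inner distinction can be produced with the built-in DOMDocument::saveHTML(), which accepts a node argument. This is a sketch of the concept only; how the package's outerHTML() and innerHTML() are implemented is not stated here, and the exact whitespace of the serialized output may differ.

```php
<?php
// Plain-PHP sketch of outer vs. inner HTML; not php-domfinder code.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML(
    '<html><body><ul class="list-items"><li>Item one</li><li>Item two</li></ul></body></html>'
);

$ul = $dom->getElementsByTagName('ul')->item(0);

// Outer HTML: serialize the element itself, including its own tag.
$outer = $dom->saveHTML($ul);

// Inner HTML: serialize each child node and concatenate the results.
$inner = '';
foreach ($ul->childNodes as $child) {
    $inner .= $dom->saveHTML($child);
}

echo $outer, PHP_EOL;
echo $inner, PHP_EOL;
```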
10. Multi Level Finder
This section demonstrates how the DOM finder works across multiple levels.
<div class="parent-class">
    <div class="child-class">
        <ul class="list">
            <li class="item">one</li>
            <li class="item">two</li>
            <li class="item">three</li>
        </ul>
    </div>
    <div class="child-class">
        <ul class="list">
            <li class="item">one</li>
            <li class="item">two</li>
            <li class="item">three</li>
        </ul>
    </div>
</div>
Simple
$uls = $domFinder->find('div')->byClass('child-class')->find('ul')->all();
// or
$uls = $domFinder->find('div')->byClass('child-class')->findAll('ul');
The above query is equivalent to this DOMXPath query:
$uls = $domFinder->finder()->query("//div[@class='child-class']/ul");
You will get all the ul elements:
if($uls->length) {
    foreach($uls as $ul) {
        var_dump($ul);
    }
}
Element Level
This approach creates a DOMFinder instance at each element level when you run a query on that element.
$div = $domFinder->find('div')->byClass('parent-class')->first();
if($div) {
    // At this level a DOMFinder instance is created and assigned to this element
    $divs = $div->find('div')->byClass('child-class')->all();
    if($divs->length) {
        echo $divs->length;
    }
}
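The effect of querying at element level, where results are limited to descendants of one element, can be sketched with the built-in DOMXPath by passing a context node to query(). This illustrates scoping only; it is not how the package creates its per-element DOMFinder instance, and the extra "outside" div is invented to make the difference visible.

```php
<?php
// Plain-PHP sketch of a query scoped to one element; not php-domfinder code.
libxml_use_internal_errors(true);

$html = '<html><body>'
      . '<div class="parent-class">'
      . '<div class="child-class"><ul class="list"><li class="item">one</li></ul></div>'
      . '<div class="child-class"><ul class="list"><li class="item">two</li></ul></div>'
      . '</div>'
      . '<div class="child-class">outside</div>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Step 1: locate the parent element.
$parent = $xpath->query("//div[@class='parent-class']")->item(0);

// Step 2: query relative to it. The second argument is the context node,
// and the leading "." keeps the search inside that node.
$scoped = $xpath->query(".//div[@class='child-class']", $parent);

// Without a context node the whole document is searched.
$global = $xpath->query("//div[@class='child-class']");

echo $scoped->length, PHP_EOL; // 2
echo $global->length, PHP_EOL; // 3
```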