wildhoney / xpath-document
Friendlier XPath extension of DOMDocument for those fluent in beloved XPath!
Last update: 2024-03-25 13:15:16 UTC
README
Bower: bower install wildhoney/xpath-document
Getting Started
XPathDocument lets you chain query calls, delving deeper into the DOM hierarchy with each one.
$posts = $xpathDocument->query('//div[@class="posts"]');

foreach ($posts as $post) {
    $comments = $post->query('div[@class="comments"]');
}
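For comparison, here is a rough sketch of the same nested lookup using only PHP's built-in DOMDocument and DOMXPath, where each node has to be passed back in as the context argument by hand. The markup and variable names are invented for illustration, and this is not necessarily how XPathDocument works internally.

```php
<?php
// Sketch using only PHP's bundled DOM extension; the markup is made up.
$html = '<div class="posts">'
      . '<div class="comments"><p>First</p></div>'
      . '<div class="comments"><p>Second</p></div>'
      . '</div>';

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

$found = [];
foreach ($xpath->query('//div[@class="posts"]') as $post) {
    // The second argument makes $post the context node, so the relative
    // expression only matches div children of this particular post.
    foreach ($xpath->query('div[@class="comments"]', $post) as $comments) {
        $found[] = trim($comments->textContent);
    }
}

print_r($found);
```

Chaining query on the returned nodes saves you from carrying the DOMXPath object and the context node around separately.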
Each query returns an instance of XPathDocument_Dom_List. This class implements Iterator, ArrayAccess and Countable, so you can loop over the collection with foreach, read nodes by index, and count them.
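To make those three interfaces concrete, the sketch below defines a small stand-in class that implements them over a plain array. SketchList is a hypothetical name and is not the library's XPathDocument_Dom_List; it only shows which language features each interface unlocks. It assumes PHP 8.0 or later.

```php
<?php
// Hypothetical stand-in for a node list; NOT the library's own class.
// Requires PHP 8.0+ (constructor promotion and the `mixed` type).
final class SketchList implements Iterator, ArrayAccess, Countable
{
    private int $position = 0;

    public function __construct(private array $items) {}

    // Iterator: enables foreach ($list as $index => $item)
    public function current(): mixed { return $this->items[$this->position]; }
    public function key(): mixed { return $this->position; }
    public function next(): void { $this->position++; }
    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return isset($this->items[$this->position]); }

    // ArrayAccess: enables $list[0], isset($list[0]), unset($list[0])
    public function offsetExists(mixed $offset): bool { return isset($this->items[$offset]); }
    public function offsetGet(mixed $offset): mixed { return $this->items[$offset]; }
    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->items[] = $value;
        } else {
            $this->items[$offset] = $value;
        }
    }
    public function offsetUnset(mixed $offset): void
    {
        unset($this->items[$offset]);
        // Re-index so iteration does not stop at the gap.
        $this->items = array_values($this->items);
    }

    // Countable: enables count($list)
    public function count(): int { return count($this->items); }
}

$list = new SketchList(['a', 'b', 'c']);

$seen = [];
foreach ($list as $index => $item) {
    $seen[$index] = $item;
}
$first = $list[0];
$total = count($list);
```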
Typically XPathDocument_Dom_List will hold a collection of XPathDocument_Dom_Element instances – but other instances are possible:
XPathDocument_Dom_Element – generic elements with values and attributes;
XPathDocument_Dom_Attr – specific for node attributes;
XPathDocument_Dom_Text – specific for text values of nodes.
The latter two have a simple getText method for returning their values. However, XPathDocument_Dom_Element has the greatest flexibility.
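Which kind of node comes back depends on what the XPath expression selects. The sketch below uses PHP's built-in DOM classes (DOMElement, DOMAttr, DOMText) rather than the XPathDocument wrappers, on the assumption that the wrappers correspond to these node types; the markup is invented.

```php
<?php
// Sketch with PHP's bundled DOM extension; the markup is made up.
$document = new DOMDocument();
$document->loadHTML('<p><a href="/about">About us</a></p>');
$xpath = new DOMXPath($document);

$element   = $xpath->query('//a')->item(0);        // selects the element itself
$attribute = $xpath->query('//a/@href')->item(0);  // selects an attribute node
$text      = $xpath->query('//a/text()')->item(0); // selects a text node

$kinds = [get_class($element), get_class($attribute), get_class($text)];

// Attribute and text nodes carry little more than a single value.
$values = [$attribute->nodeValue, $text->nodeValue];
```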
Element Instance
With an instance of XPathDocument_Dom_Element you have the following methods:
getText – retrieve the value of the node;
getHtml – retrieve the HTML value of the node;
getName – retrieve the name of the node (span, div, etc.);
getAttribute – retrieve an attribute by its name;
query – use the node as the context for further querying.
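As a point of reference, the sketch below shows what appear to be the closest built-in DOM counterparts to these methods. The mapping is an assumption made for illustration, not taken from the library's source; in particular, whether getHtml returns the node's own tag or only its children is not stated here.

```php
<?php
// Sketch with PHP's bundled DOM extension; markup and mapping are assumptions.
$document = new DOMDocument();
$document->loadHTML('<div id="post" class="entry">Hello <b>world</b></div>');
$xpath = new DOMXPath($document);
$node = $xpath->query('//div[@id="post"]')->item(0);

$text  = $node->textContent;            // text of the node and its descendants
$html  = $document->saveHTML($node);    // serialized markup of the node
$name  = $node->nodeName;               // tag name
$class = $node->getAttribute('class');  // attribute looked up by name

// Further querying with the node as context.
$bold = $xpath->query('b', $node)->item(0)->textContent;
```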
Reddit Example
Please see the Reddit.com example in example/index.php, which demonstrates how simple it is to crawl websites with XPathDocument!