matecat / xml-dom-parser
Matecat XML DomDocument Loader component
v2.0.0
2026-02-16 15:01 UTC
Requires
- php: >=8.3
- ext-dom: *
- ext-libxml: *
- ext-xml: *
Requires (Dev)
- phpstan/phpstan: @stable
- phpunit/phpunit: ^12
Last update: 2026-02-16 15:11:27 UTC
Matecat XML DOM Parser
A lightweight PHP library for parsing XML and HTML documents into traversable object structures. Built on top of PHP's DOMDocument with enhanced error handling and validation support.
Requirements
- PHP 8.3+
- ext-dom
- ext-xml
- ext-libxml
Installation
composer require matecat/xml-dom-parser
Usage
Parsing XML

    use Matecat\XmlParser\XmlParser;

    $xml = '<?xml version="1.0"?>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <message>Hello!</message>
    </note>';

    $parsed = XmlParser::parse($xml);

    // Access elements
    echo $parsed[0]->tagName;               // "to"
    echo $parsed[0]->inner_html[0]->text;   // "Tove"

Parsing XML Fragments

    use Matecat\XmlParser\XmlParser;

    $fragment = '<tag id="1">Content</tag><tag id="2">More content</tag>';

    $parsed = XmlParser::parse($fragment, isXmlFragment: true);

    echo $parsed[0]->attributes['id'];      // "1"
    echo $parsed[1]->inner_html[0]->text;   // "More content"

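A fragment with several top-level elements is not a well-formed XML document on its own, which is why it needs special handling. The sketch below shows one common way to load such input with plain DOMDocument: wrap it in a synthetic root element first. It is an illustration only and not necessarily how this library implements `isXmlFragment`; the `loadFragment` helper and the `root` wrapper name are hypothetical.

```php
<?php
// Hypothetical helper, not part of matecat/xml-dom-parser.
function loadFragment(string $fragment, string $rootName = 'root'): DOMDocument
{
    $dom = new DOMDocument('1.0', 'UTF-8');

    // Wrap the fragment so the parser sees exactly one root element.
    $wrapped = "<{$rootName}>{$fragment}</{$rootName}>";

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $ok = $dom->loadXML($wrapped, LIBXML_NONET);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        $message = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new RuntimeException("Cannot parse fragment: {$message}");
    }

    return $dom;
}

$dom = loadFragment('<tag id="1">Content</tag><tag id="2">More content</tag>');
$tags = $dom->getElementsByTagName('tag');

echo $tags->item(0)->getAttribute('id'), PHP_EOL;   // 1
echo $tags->item(1)->textContent, PHP_EOL;          // More content
```

The wrapper element only exists to satisfy the parser; callers would normally iterate over its children and ignore the wrapper itself.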
Parsing HTML

    use Matecat\XmlParser\HtmlParser;

    $html = '<html><head><title>Test</title></head><body><div>Content</div></body></html>';

    $parsed = HtmlParser::parse($html);

    echo $parsed[0]->inner_html[0]->tagName;   // "head"
    echo $parsed[0]->inner_html[1]->tagName;   // "body"

Using XmlDomLoader Directly
For more control over XML loading and validation:

    use Matecat\XmlParser\XmlDomLoader;
    use Matecat\XmlParser\Config;

    // Basic loading
    $dom = XmlDomLoader::load($xmlContent);

    // With configuration
    $config = new Config(
        setRootElement: 'root',                   // Wrap content in a root element
        allowDocumentType: false,                 // Reject DOCTYPE declarations
        xmlOptions: LIBXML_NONET,                 // libxml options
        schemaOrCallable: '/path/to/schema.xsd'   // XSD validation
    );

    $dom = XmlDomLoader::load($xmlContent, $config);

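To make the `allowDocumentType` and `xmlOptions` settings more concrete, here is a minimal sketch of the same two safeguards written against plain DOMDocument. It is not the library's code: the `loadWithoutDoctype` function is hypothetical, and the library may detect DOCTYPE declarations differently.

```php
<?php
// Hypothetical helper, not part of matecat/xml-dom-parser.
function loadWithoutDoctype(string $xml): DOMDocument
{
    $dom = new DOMDocument();

    // Keep parse errors internal so malformed input does not emit warnings.
    $previous = libxml_use_internal_errors(true);
    try {
        // LIBXML_NONET forbids network access while the document is loaded.
        if (!$dom->loadXML($xml, LIBXML_NONET)) {
            throw new RuntimeException('Malformed XML');
        }
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }

    // A document that declared a DOCTYPE exposes it through $dom->doctype.
    if ($dom->doctype !== null) {
        throw new RuntimeException('DOCTYPE declarations are not allowed');
    }

    return $dom;
}

$dom = loadWithoutDoctype('<note><to>Tove</to></note>');
echo $dom->documentElement->tagName, PHP_EOL;   // note
```

Rejecting DOCTYPE declarations and disabling network access are typical precautions when the XML comes from an untrusted source, since both are involved in entity-expansion and external-entity attacks.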
Schema Validation
Validate XML against an XSD schema:

    use Matecat\XmlParser\XmlDomLoader;
    use Matecat\XmlParser\Config;

    $config = new Config(schemaOrCallable: '/path/to/schema.xsd');
    $dom = XmlDomLoader::load($xml, $config);

Or use a custom validation callable:

    use Matecat\XmlParser\Config;
    use Matecat\XmlParser\XmlDomLoader;
    use DOMDocument;

    $validator = function (DOMDocument $dom, bool $internalErrors): bool {
        // Custom validation logic
        return $dom->getElementsByTagName('required-element')->length > 0;
    };

    $config = new Config(schemaOrCallable: $validator);
    $dom = XmlDomLoader::load($xml, $config);

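For readers who want to see what XSD validation involves at the DOM level, the following sketch validates a document against an inline schema using PHP's built-in `DOMDocument::schemaValidateSource()` and collects the libxml error messages. The `validateAgainstXsd` function and the tiny `note` schema are made up for illustration; the library's own validation path may differ.

```php
<?php
// Hypothetical helper, not part of matecat/xml-dom-parser.
// Returns an empty array when the document is valid, else the error messages.
function validateAgainstXsd(DOMDocument $dom, string $xsdSource): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $valid = $dom->schemaValidateSource($xsdSource);

    $messages = array_map(
        fn (LibXMLError $error): string => trim($error->message),
        libxml_get_errors()
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$valid && $messages === []) {
        $messages[] = 'Schema validation failed';
    }

    return $valid ? [] : $messages;
}

// Illustrative schema: a <note> must contain exactly one <to> element.
$xsd = <<<'XSD'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

$dom = new DOMDocument();
$dom->loadXML('<note><from>Jani</from></note>');

$errors = validateAgainstXsd($dom, $xsd);
echo $errors === [] ? 'valid' : 'invalid: ' . $errors[0], PHP_EOL;
```

Here the document is well-formed but uses `<from>` where the schema expects `<to>`, so the script reports it as invalid together with the first libxml message.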
Parsed Element Structure
Each parsed element is an object with the following properties:
| Property | Type | Description |
|---|---|---|
| `node` | `string` | The raw XML/HTML of the element |
| `tagName` | `string` | The element's tag name |
| `attributes` | `array` | Key-value pairs of attributes |
| `text` | `string\|null` | Text content (for text nodes) |
| `self_closed` | `bool\|null` | Whether the element is self-closing |
| `has_children` | `bool\|null` | Whether the element has child nodes |
| `inner_html` | `ArrayObject` | Child elements |
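Because `inner_html` holds child elements with the same shape, the structure can be walked recursively. The sketch below gathers all text values from such a tree. To stay runnable without the library, it builds a stand-in tree from `stdClass` and `ArrayObject` using only the property names in the table above; the `makeNode` and `collectText` functions and the `#text` tag name for text nodes are assumptions, not part of the package.

```php
<?php
// Stand-in for a parsed element (hypothetical; real objects come from the parser).
function makeNode(string $tagName, ?string $text = null, array $children = []): stdClass
{
    $node = new stdClass();
    $node->tagName = $tagName;
    $node->text = $text;
    $node->has_children = $children !== [];
    $node->inner_html = new ArrayObject($children);

    return $node;
}

// Depth-first walk that collects every non-null text value.
function collectText(iterable $nodes): array
{
    $texts = [];
    foreach ($nodes as $node) {
        if ($node->text !== null) {
            $texts[] = $node->text;
        }
        // Recurse into the children stored in inner_html.
        foreach (collectText($node->inner_html) as $childText) {
            $texts[] = $childText;
        }
    }

    return $texts;
}

$parsed = new ArrayObject([
    makeNode('to', null, [makeNode('#text', 'Tove')]),
    makeNode('from', null, [makeNode('#text', 'Jani')]),
]);

echo implode(', ', collectText($parsed)), PHP_EOL;   // Tove, Jani
```

With real parser output, the same `collectText` walk should apply as long as every node exposes `text` and an iterable `inner_html`.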
Exception Handling
The library throws specific exceptions for different error conditions:
- XmlParsingException: XML syntax errors or validation failures
- InvalidXmlException: XML is well-formed but invalid against schema
- DomDependecyMissingException: Required PHP extensions are missing

    use Matecat\XmlParser\XmlParser;
    use Matecat\XmlParser\Exception\XmlParsingException;

    try {
        $parsed = XmlParser::parse('<invalid><xml>');
    } catch (XmlParsingException $e) {
        echo "Parse error: " . $e->getMessage();
    }

License
This project is licensed under the MIT License.