fizzka/extractor

HTML Parser

0.3.5 2019-03-13 22:38 UTC

Last update: 2024-05-14 09:49:04 UTC


README

HTML extraction library based on SimpleXML and nokogiri's XpathSubquery.php

Benefits

  • Simple
  • Minimal code
  • Fast
  • Query results are SimpleXMLElement instances
  • Supports nested css/xpath queries
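
To illustrate what "query results are SimpleXMLElement instances" and "nested queries" mean in practice, here is a minimal sketch that uses only PHP's built-in DOM and SimpleXML extensions. It does not call fizzka/extractor and is not a description of its internals; the sample markup and variable names are made up for the example.

```php
<?php
// Illustration only: plain PHP DOM + SimpleXML, not fizzka/extractor itself.
// The sample HTML below is invented for this sketch.
$html = '<html><body>'
      . '<div class="post"><a class="post_title" href="/one">One</a></div>'
      . '<div class="post"><a class="post_title" href="/two">Two</a></div>'
      . '</body></html>';

// Real-world HTML is rarely well-formed XML, so parse it with DOMDocument
// and keep libxml warnings out of the output.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Convert the DOM tree into a SimpleXMLElement so it can be queried with xpath().
$xml = simplexml_import_dom($dom);

// Outer query: every post block. Each result is a SimpleXMLElement.
$posts = $xml->xpath('//div[@class="post"]');

$hrefs = [];
foreach ($posts as $post) {
    // Nested query: the leading "." keeps the search inside the current post.
    $links = $post->xpath('.//a[@class="post_title"]');
    $hrefs[] = (string) $links[0]['href'];
}

print_r($hrefs);
```

Because every result is itself a SimpleXMLElement, the same query methods are available at each level, which is what makes nesting possible.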

Installation

# Using Packagist:
composer require 'fizzka/extractor'

Basic Usage

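<?php
require_once 'vendor/autoload.php';

// Fetch the page. gzdecode() assumes the server returns a gzip-compressed body.
$html = gzdecode(file_get_contents('http://habrahabr.ru/'));

// Build an extractor from the HTML string and query it with a CSS selector.
$ex = Extractor::fromHtml($html);
var_dump($ex->get('a.habracut'));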
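
The snippet above calls gzdecode() unconditionally, which fails when the response body is not compressed. A hedged sketch of a more defensive variant follows; `decodeIfGzipped` is a hypothetical helper written for this example, not part of fizzka/extractor, and it needs PHP's zlib extension.

```php
<?php
/**
 * Hypothetical helper (not part of fizzka/extractor): decompress a response
 * body only when it starts with the gzip magic bytes 0x1f 0x8b.
 */
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        // Not gzip data: return the body unchanged.
        return $body;
    }

    $decoded = gzdecode($body);
    if ($decoded === false) {
        throw new RuntimeException('Body looked like gzip but could not be decoded');
    }

    return $decoded;
}

// Usage with in-memory data, so the sketch runs without network access.
$plain = '<html><body><a class="habracut" href="/x">more</a></body></html>';
$fromCompressed = decodeIfGzipped(gzencode($plain));
$fromPlain      = decodeIfGzipped($plain);

var_dump($fromCompressed === $plain, $fromPlain === $plain);
```

The result can then be passed to Extractor::fromHtml() exactly as in the basic example.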

Advanced Usage

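// Chain queries: take the first div.post by CSS, then the first href attribute inside it by XPath.
echo $ex->cssPathFirst('div.post')->xpathFirst('.//@href');

// Each result can be queried again, so selectors can be nested.
foreach ($ex->cssPath('div.post') as $post) {
	var_dump($post->cssPathFirst('a.post_title'));
}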
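
CSS selectors such as 'div.post' have to be expressed as XPath before SimpleXML can evaluate them. The sketch below shows the general idea for the simplest selector forms only (`tag`, `.class`, `tag.class`). `simpleCssToXpath` is a hypothetical function written for this README; it is an assumption about the technique, not the code used by fizzka/extractor or XpathSubquery.php.

```php
<?php
/**
 * Hypothetical illustration: translate "tag", ".class" or "tag.class"
 * into a relative XPath expression. Anything else is rejected.
 */
function simpleCssToXpath(string $css): string
{
    $pattern = '/^([a-zA-Z][a-zA-Z0-9]*)?(?:\.([\w-]+))?$/';
    if ($css === '' || !preg_match($pattern, $css, $m)) {
        throw new InvalidArgumentException("Unsupported selector: $css");
    }

    // No tag name means "any element".
    $tag = ($m[1] ?? '') !== '' ? $m[1] : '*';
    $xpath = './/' . $tag;

    if (($m[2] ?? '') !== '') {
        // Match one class token inside a space-separated class attribute.
        $xpath .= "[contains(concat(' ', normalize-space(@class), ' '), ' {$m[2]} ')]";
    }

    return $xpath;
}

$postXpath  = simpleCssToXpath('div.post');
$titleXpath = simpleCssToXpath('a.post_title');

echo $postXpath, PHP_EOL;
echo $titleXpath, PHP_EOL;
```

The leading `.//` keeps each translated query relative to the element it is run on, which is what lets a query on `$post` stay inside that post.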

Testing

Run phpunit from the project root.

Contribute

Feel free to use & contribute ;)

License

MIT