fizzka/extractor

HTML Parser

pkg:composer/fizzka/extractor

0.3.5 2019-03-13 22:38 UTC

This package is auto-updated.

Last update: 2025-10-14 12:53:06 UTC


README

HTML extraction library based on SimpleXML and nokogiri's XpathSubquery.php.

Benefits

  • Simple
  • Minimal code
  • Fast
  • Query results are SimpleXMLElement instances
  • Supports nested css/xpath queries
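
To illustrate the last two points, here is a minimal sketch of the underlying mechanics using only PHP's bundled DOM and SimpleXML extensions. It does not use this library's classes and is not its actual implementation; the sample markup and variable names are assumptions made for the example. It shows HTML being loaded into a SimpleXMLElement and a nested XPath query being run relative to each result.

```php
<?php
// Sketch using only PHP's bundled DOM and SimpleXML extensions;
// this is not the library's own code.
$html = '<html><body>'
    . '<div class="post"><a class="post_title" href="/p/1">First</a></div>'
    . '<div class="post"><a class="post_title" href="/p/2">Second</a></div>'
    . '</body></html>';

// DOMDocument tolerates real-world HTML; collect parser warnings silently.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Convert the DOM tree into a SimpleXMLElement rooted at <html>.
$root = simplexml_import_dom($dom);

// Outer query: every div whose class attribute is exactly "post".
$titles = [];
foreach ($root->xpath('//div[@class="post"]') as $post) {
    // Nested query, relative to the current node (note the leading dot).
    $links = $post->xpath('.//a[@class="post_title"]');
    $titles[] = (string) $links[0] . ' => ' . (string) $links[0]['href'];
}

print_r($titles);
```

Because every result is itself a SimpleXMLElement, attribute access and string casts work on it directly, which is what makes nested queries cheap to express.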

Installation

# Using Packagist:
composer require 'fizzka/extractor'

Basic Usage

<?php
require_once 'vendor/autoload.php';

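// Fetch the page; gzdecode() assumes the server returns a gzip-compressed body.
$html = gzdecode(file_get_contents('http://habrahabr.ru/'));

// Build an extractor from the HTML and dump the result of a CSS query.
$ex = Extractor::fromHtml($html);
var_dump($ex->get('a.habracut'));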

Advanced Usage

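// Chain queries: take the first div.post, then the first href attribute inside it.
echo $ex->cssPathFirst('div.post')->xpathFirst('.//@href');

// Each result can be queried again, so selectors nest.
foreach ($ex->cssPath('div.post') as $post) {
	var_dump($post->cssPathFirst('a.post_title'));
}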
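
CSS queries such as 'div.post' ultimately have to be expressed as XPath before SimpleXML can run them. The sketch below shows one common way to do that translation for the simplest selectors only ("tag", ".class", "tag.class"). The function name simpleCssToXpath is hypothetical and exists only for this example; the library's real translator, derived from nokogiri's XpathSubquery.php, is assumed to cover much more syntax.

```php
<?php
// Hypothetical helper for illustration only; not part of the library's API.
function simpleCssToXpath(string $selector): string
{
    // Accept only "tag", ".class" or "tag.class".
    if ($selector === ''
        || !preg_match('/^([a-z][a-z0-9]*)?(?:\.([\w-]+))?$/i', $selector, $m)) {
        throw new InvalidArgumentException("Unsupported selector: $selector");
    }

    // A missing tag name means "any element".
    $tag = ($m[1] ?? '') !== '' ? $m[1] : '*';

    // The leading dot keeps the query relative to the current node.
    $xpath = './/' . $tag;

    if (isset($m[2])) {
        // Match one class token inside a space-separated class attribute.
        $xpath .= sprintf(
            "[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
            $m[2]
        );
    }

    return $xpath;
}

echo simpleCssToXpath('div.post'), PHP_EOL;
echo simpleCssToXpath('a'), PHP_EOL;
```

The concat/normalize-space idiom is used instead of a plain @class="post" comparison so that elements carrying several classes, for example class="post featured", still match.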

Testing

Run phpunit from the project root.

Contribute

Feel free to use & contribute ;)

License

MIT