fofx / utility
Utilities library
Requires
- php: ^8.3
- fofx/helper: ^1.2
- jeremykendall/php-domain-parser: ^6.4
- laravel/framework: ^12.22
- symfony/css-selector: ^7.3
- symfony/dom-crawler: ^7.3
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.85
- orchestra/testbench: ^10.4
- phpstan/phpstan: ^2.1
- phpstan/phpstan-mockery: ^2.0
- phpunit/phpunit: ^12.3
README
A PHP library with a few practical helpers. Uses jeremykendall/php-domain-parser for domain parsing.
- get_tables() - List database tables for the current connection
- download_public_suffix_list() - Ensure the Public Suffix List exists locally
- extract_registrable_domain() - Extract a registrable domain from a URL
- is_valid_domain() - Validate if a domain has a valid registrable domain and suffix
- extract_canonical_url() - Extract the canonical URL from HTML content
- list_embedded_json_selectors() - List CSS selectors for JSON script tags found in HTML
- extract_embedded_json_blocks() - Extract JSON blocks from HTML with additional metadata
- filter_json_blocks_by_selector() - Filter JSON blocks by selector ID, with optional 'json' key selection
- save_json_blocks_to_file() - Save extracted JSON blocks to a file with optional filtering by selector ID
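To make "embedded JSON blocks" concrete, here is a minimal sketch of the general idea using only PHP's bundled DOM extension. It is not the library's implementation: the function name sketch_extract_json_scripts() and the returned keys (id, type, json) are assumptions made for illustration, and the real helpers may return different metadata.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of fofx/utility: collect JSON found in
 * <script> tags typed application/json or application/ld+json.
 *
 * @return array<int, array{id: ?string, type: string, json: mixed}>
 */
function sketch_extract_json_scripts(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; buffer libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//script[@type="application/json" or @type="application/ld+json"]');

    $blocks = [];
    foreach ($nodes as $node) {
        if (!$node instanceof DOMElement) {
            continue;
        }
        $decoded = json_decode($node->textContent, true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            continue; // skip script tags whose body is not valid JSON
        }
        $blocks[] = [
            'id'   => $node->getAttribute('id') ?: null, // assumed metadata key
            'type' => $node->getAttribute('type'),
            'json' => $decoded,
        ];
    }

    return $blocks;
}
```

Script tags without a JSON type attribute, or whose body does not decode, are skipped in this sketch. Filtering by selector ID then amounts to keeping the entries whose id matches.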
Installation
composer require fofx/utility
Usage
See usage examples in:
Extracting and Filtering Embedded JSON from HTML
You can use filter_json_blocks_by_selector() with extract_embedded_json_blocks() to extract embedded JSON from an HTML file and filter it.
require_once __DIR__ . '/../vendor/autoload.php';

use FOfX\Utility;
use Illuminate\Support\Arr;

$filename = __DIR__ . '/../resources/2-httpswwwfiverrcomcategoriesgraphics-designcreative-logo-design-fiverrcom-browserhtml.html';
$html     = file_get_contents($filename);
$blocks   = Utility\extract_embedded_json_blocks($html);
$filtered = Utility\filter_json_blocks_by_selector($blocks, 'perseus-initial-props', true);

// Use Arr::dot() to get a dot notation list of keys, to see the JSON structure
$dot_keys_only = array_keys(Arr::dot($filtered[0] ?? []));
print_r($dot_keys_only);
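For readers unfamiliar with dot notation, the following plain-PHP sketch shows roughly what flattening a nested array into dotted keys looks like. sketch_dot() is a hypothetical stand-in written for illustration; it is not Laravel's Arr::dot() and may differ from it in edge cases.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for dot-notation flattening, for illustration only.
 *
 * @param array<mixed> $array
 * @return array<int|string, mixed>
 */
function sketch_dot(array $array, string $prefix = ''): array
{
    $flat = [];
    foreach ($array as $key => $value) {
        $path = $prefix . $key;
        if (is_array($value) && $value !== []) {
            // Recurse into nested arrays, extending the dotted path.
            $flat += sketch_dot($value, $path . '.');
        } else {
            // Scalars (and empty arrays) become leaf entries.
            $flat[$path] = $value;
        }
    }

    return $flat;
}

// Made-up sample data standing in for a decoded JSON block.
$props = ['user' => ['name' => 'x', 'tags' => ['a', 'b']], 'count' => 2];
print_r(array_keys(sketch_dot($props)));
```

For the sample data above, the keys are user.name, user.tags.0, user.tags.1, and count, which is the kind of overview the array_keys() call in the usage example is meant to give.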
Importing Fiverr Sitemap Data (Categories and Tags)
See docs/usage-FiverrSitemapImporter.md
use FOfX\Utility\FiverrSitemapImporter;

$importer = new FiverrSitemapImporter();
$importer->setBatchSize(500); // optional (default 100)

// Categories
$stats = $importer->importCategories();
print_r($stats);

// Tags
$stats = $importer->importTags();
print_r($stats);
JSON to Columns
Helpers for working with JSON data and converting it to database columns.
See docs/usage-json-to-columns.md
Testing and Development
To run the PHPUnit test suite through composer:
composer test
To use PHPStan for static analysis:
composer phpstan
To use PHP-CS-Fixer for code style:
composer cs-fix
Tests and the PSL file
Tests use temporary storage and avoid network calls, so public_suffix_list.dat (download here) must be saved to local/resources/ before running them.
Tests then copy this file into temporary storage. If it is missing, tests are skipped.
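The copy-or-skip behavior described above could look roughly like the sketch below. It is an assumption about the general pattern, not the project's actual test code: sketch_stage_psl() and its parameters are hypothetical names.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: copy a local PSL fixture into a temp directory.
 * Returns the staged path, or null when the fixture is missing so the
 * caller can skip the test instead of going to the network.
 */
function sketch_stage_psl(string $fixturePath, string $tempDir): ?string
{
    if (!is_file($fixturePath)) {
        return null;
    }

    // Create the temp directory if needed; re-check is_dir() to tolerate races.
    if (!is_dir($tempDir) && !mkdir($tempDir, 0777, true) && !is_dir($tempDir)) {
        throw new RuntimeException("Cannot create temp directory: {$tempDir}");
    }

    $target = $tempDir . DIRECTORY_SEPARATOR . basename($fixturePath);
    if (!copy($fixturePath, $target)) {
        throw new RuntimeException("Cannot copy fixture to: {$target}");
    }

    return $target;
}
```

In a PHPUnit test case, a null return would typically be followed by a call to markTestSkipped() with a message pointing at the missing file.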
License
MIT