sukohi / shellless
A PHP package to extract readable text from HTML.
pkg:composer/sukohi/shellless
This package is not auto-updated.
Last update: 2025-09-28 00:24:12 UTC
Installation
Run the following command:
composer require "sukohi/shellless:1.*"
Usage
use Sukohi\Shellless\Shellless;
$html = file_get_contents('http://example.com/');
$shellless = new Shellless();
$result = $shellless->extract($html);
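echo $result->title; // Page title
echo $result->best_text; // The longest extracted text
echo $result->full_text; // All texts longer than 100 characters, joined
print_r($result->all_texts); // Every extracted text (passing true as the second argument would return the string instead of printing it)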
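The usage example passes the result of file_get_contents() straight to extract(), but file_get_contents() returns false when the fetch fails. The following is a minimal sketch of a guard around that call; fetchHtml() is a hypothetical helper introduced here for illustration and is not part of sukohi/shellless.

```php
<?php
// Hypothetical helper, not part of sukohi/shellless.
// Returns the fetched HTML, or null when the source cannot be read or is empty.
function fetchHtml(string $source): ?string
{
    // file_get_contents() returns false on failure; the @ operator suppresses
    // the accompanying warning so the caller can handle the failure itself.
    $html = @file_get_contents($source);

    return ($html === false || $html === '') ? null : $html;
}
```

A caller could then skip extraction when fetchHtml() returns null and pass the string to $shellless->extract() otherwise.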
Options
$shellless->setOptions([
    'join_step' => 5,
    'min_text_length' => 100,
]);
Algorithm
- Join neighboring texts when fewer than 5 HTML tags separate them.
- Keep only texts longer than 100 characters.
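The two steps above can be sketched in plain PHP as follows. This is an illustration under stated assumptions, not the package's actual implementation: the function name extractTexts() and its parameters are hypothetical, tags are detected with a simple regular expression rather than a real HTML parser, and script or style contents are not filtered out.

```php
<?php
// Hypothetical sketch of the join-then-filter idea; not the package's code.
function extractTexts(string $html, int $joinStep = 5, int $minTextLength = 100): array
{
    // Split the markup into tag tokens and text tokens, keeping the tags.
    $tokens = preg_split(
        '/(<[^>]*>)/',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $groups = [];       // finished groups of joined text
    $current = '';      // group currently being built
    $tagsSinceText = 0; // number of tags seen since the last text token

    foreach ($tokens as $token) {
        if (preg_match('/^<[^>]*>$/', $token) === 1) {
            $tagsSinceText++;
            continue;
        }

        $text = trim(html_entity_decode($token));
        if ($text === '') {
            continue; // whitespace between tags carries no text
        }

        // Start a new group when $joinStep or more tags separate the texts.
        if ($current !== '' && $tagsSinceText >= $joinStep) {
            $groups[] = $current;
            $current = '';
        }

        $current = ($current === '') ? $text : $current . ' ' . $text;
        $tagsSinceText = 0;
    }

    if ($current !== '') {
        $groups[] = $current;
    }

    // Keep groups longer than the threshold. strlen() counts bytes, so
    // multibyte text would need mb_strlen() for a true character count.
    return array_values(array_filter(
        $groups,
        function (string $group) use ($minTextLength): bool {
            return strlen($group) > $minTextLength;
        }
    ));
}
```

In this sketch, raising $joinStep merges texts that sit further apart in the markup, and raising $minTextLength discards more of the short fragments such as menu items and captions.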
License
This package is licensed under the MIT License.
Copyright 2017 Sukohi Kuhoh