tenqz / lingua
A comprehensive PHP library for advanced text processing using Chain of Responsibility pattern
Requires
- php: ^8.1
Requires (Dev)
- phpunit/phpunit: ^10.0
- squizlabs/php_codesniffer: 4.x-dev
Lingua v2.1.1
Lingua is a comprehensive PHP library designed for advanced text processing. It implements the Chain of Responsibility pattern to provide flexible and extensible text processing capabilities.
Note: This library is currently under active development. The current version may not reflect the final quality and API stability. Breaking changes may occur in future releases.
Features
- Chain of Responsibility pattern for text processing
- Flexible scenario-based processing
- Easy to extend with custom handlers
- PHP 8.1+ support
Available Handlers
The library provides several built-in text processing handlers:
SpecialCharsHandler
Removes special characters from text while preserving words and spaces. This handler:
- Removes all special characters (punctuation marks, symbols, etc.)
- Replaces newlines with spaces
- Preserves spaces between words
Example:
    use Tenqz\Lingua\Handlers\Basic\SpecialCharsHandler;

    $handler = new SpecialCharsHandler();
    $result = $handler->handle("Hello! @#$%^&*() World...");
    // Returns: "Hello World"
NormalizeSpacesHandler
Normalizes whitespace in text. This handler:
- Replaces multiple spaces, tabs, and newlines with a single space
- Preserves leading and trailing whitespace
Example:
    use Tenqz\Lingua\Handlers\Basic\NormalizeSpacesHandler;

    $handler = new NormalizeSpacesHandler();
    $result = $handler->handle("Hello World\t\nTest");
    // Returns: "Hello World Test"
TrimHandler
Removes whitespace from the beginning and end of text. This handler:
- Removes spaces, tabs, and newlines from the beginning and end of text
- Preserves whitespace between words
Example:
    use Tenqz\Lingua\Handlers\Basic\TrimHandler;

    $handler = new TrimHandler();
    $result = $handler->handle(" Hello World ");
    // Returns: "Hello World"
HtmlTagsHandler
Removes HTML tags from text while preserving the content between them. This handler:
- Removes all HTML tags using PHP's strip_tags function
- Preserves HTML entities
- Normalizes multiple spaces into single spaces
- Trims spaces from the beginning and end of text
Example:
    use Tenqz\Lingua\Handlers\Basic\HtmlTagsHandler;

    $handler = new HtmlTagsHandler();

    // Tags are removed; the space between the two elements is kept
    $result = $handler->handle("<p>Hello</p> <div>World</div>");
    // Returns: "Hello World"

    // HTML entities pass through unchanged
    $result = $handler->handle("<p>Hello &amp; World</p>");
    // Returns: "Hello &amp; World"
Installation
composer require tenqz/lingua
Basic Usage
    use Tenqz\Lingua\Core\TextProcessor;
    use Tenqz\Lingua\Core\AbstractTextHandler;
    use Tenqz\Lingua\Core\Contracts\TextHandlerInterface;

    // Create a custom handler
    class CustomHandler extends AbstractTextHandler
    {
        protected function process(string $text): string
        {
            // Your text processing logic here
            return $text;
        }
    }

    // Initialize the processor
    $processor = new TextProcessor();

    // Add handlers to the chain
    $processor->addHandler(new CustomHandler());

    // Process text
    $result = $processor->process('Your text here');
Chain of Handlers Example
You can combine multiple handlers to create a processing pipeline:
    use Tenqz\Lingua\Core\TextProcessor;
    use Tenqz\Lingua\Handlers\Basic\SpecialCharsHandler;
    use Tenqz\Lingua\Handlers\Basic\NormalizeSpacesHandler;
    use Tenqz\Lingua\Handlers\Basic\TrimHandler;

    // Initialize the processor
    $processor = new TextProcessor();

    // Create a chain of handlers
    $processor
        ->addHandler(new SpecialCharsHandler())    // First, remove special characters
        ->addHandler(new NormalizeSpacesHandler()) // Then, normalize spaces
        ->addHandler(new TrimHandler());           // Finally, trim whitespace

    // Process text through the entire chain
    $result = $processor->process(' Hello! @#$%^&*() World... ');
    // Result: "Hello World"
Architecture
The library consists of the following main components:
TextHandlerInterface
Interface that defines the contract for all text processing handlers:
    interface TextHandlerInterface
    {
        public function handle(string $text): string;

        public function setNext(TextHandlerInterface $handler): TextHandlerInterface;

        public function getNext(): ?TextHandlerInterface;
    }
AbstractTextHandler
Abstract base class that implements the Chain of Responsibility pattern:
    abstract class AbstractTextHandler implements TextHandlerInterface
    {
        protected ?TextHandlerInterface $nextHandler = null;

        public function setNext(TextHandlerInterface $handler): TextHandlerInterface
        {
            $this->nextHandler = $handler;

            return $handler;
        }

        public function getNext(): ?TextHandlerInterface
        {
            return $this->nextHandler;
        }

        public function handle(string $text): string
        {
            $processedText = $this->process($text);

            if ($this->nextHandler !== null) {
                return $this->nextHandler->handle($processedText);
            }

            return $processedText;
        }

        abstract protected function process(string $text): string;
    }
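To make the chaining behaviour concrete, here is a minimal, self-contained sketch. It uses a simplified stand-in for the abstract class above (the interface is omitted so the snippet runs without the library), and TagHandler is a hypothetical handler invented purely for illustration. Because setNext returns the handler it was given rather than $this, calls to it can be chained to link handlers in order:

```php
<?php

declare(strict_types=1);

// Simplified stand-in for the library's AbstractTextHandler (assumption:
// the interface and getNext() are left out to keep the sketch short).
abstract class AbstractTextHandler
{
    protected ?AbstractTextHandler $nextHandler = null;

    public function setNext(AbstractTextHandler $handler): AbstractTextHandler
    {
        $this->nextHandler = $handler;

        return $handler; // returns the handler passed in, not $this
    }

    public function handle(string $text): string
    {
        $processedText = $this->process($text);

        return $this->nextHandler !== null
            ? $this->nextHandler->handle($processedText)
            : $processedText;
    }

    abstract protected function process(string $text): string;
}

// Hypothetical handler: appends a tag so the execution order is visible.
final class TagHandler extends AbstractTextHandler
{
    public function __construct(private string $tag)
    {
    }

    protected function process(string $text): string
    {
        return $text . $this->tag;
    }
}

// Links a -> b -> c, because each setNext() call returns the new tail.
$first = new TagHandler('[a]');
$first->setNext(new TagHandler('[b]'))->setNext(new TagHandler('[c]'));

$chained = $first->handle('x');
echo $chained, PHP_EOL; // x[a][b][c]
```

Each handler runs its own process step and then passes the result on, so the text always flows through the chain in the order the handlers were linked.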
TextProcessor
Facade class for managing text processing chains:
    class TextProcessor implements TextProcessorInterface
    {
        private ?TextHandlerInterface $handlersChain = null;

        public function addHandler(TextHandlerInterface $handler): self
        {
            // The first handler becomes the head of the chain
            if (!$this->handlersChain) {
                $this->handlersChain = $handler;

                return $this;
            }

            // Otherwise walk to the end of the chain and append
            $lastHandler = $this->handlersChain;
            while ($lastHandler->getNext()) {
                $lastHandler = $lastHandler->getNext();
            }
            $lastHandler->setNext($handler);

            return $this;
        }

        public function process(string $text): string
        {
            if (!$this->handlersChain) {
                throw new NotFoundHandlerException();
            }

            return $this->handlersChain->handle($text);
        }
    }
Creating Custom Handlers
To create a custom handler, extend the AbstractTextHandler class and implement the process method:
    use Tenqz\Lingua\Core\AbstractTextHandler;

    class MyCustomHandler extends AbstractTextHandler
    {
        protected function process(string $text): string
        {
            // Your text processing logic here
            return $text;
        }
    }
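As a slightly fuller illustration, the sketch below shows a custom handler that takes configuration through its constructor. StopWordsHandler is a hypothetical example and not part of the library. The abstract class is a simplified stand-in based on the Architecture section so the snippet runs on its own; in a real project you would import Tenqz\Lingua\Core\AbstractTextHandler instead.

```php
<?php

declare(strict_types=1);

// Simplified stand-in for the library's AbstractTextHandler (assumption:
// interface and getNext() omitted so the sketch is self-contained).
abstract class AbstractTextHandler
{
    protected ?AbstractTextHandler $nextHandler = null;

    public function setNext(AbstractTextHandler $handler): AbstractTextHandler
    {
        $this->nextHandler = $handler;

        return $handler;
    }

    public function handle(string $text): string
    {
        $processedText = $this->process($text);

        return $this->nextHandler !== null
            ? $this->nextHandler->handle($processedText)
            : $processedText;
    }

    abstract protected function process(string $text): string;
}

// Hypothetical custom handler: drops configured stop words (case-insensitive).
final class StopWordsHandler extends AbstractTextHandler
{
    /** @var array<string, true> lowercased stop words used as a lookup set */
    private array $stopWords;

    /** @param string[] $stopWords */
    public function __construct(array $stopWords)
    {
        $this->stopWords = array_fill_keys(array_map('strtolower', $stopWords), true);
    }

    protected function process(string $text): string
    {
        // Splits on single spaces, so it works best after whitespace
        // has already been normalized earlier in the chain.
        $kept = array_filter(
            explode(' ', $text),
            fn (string $word): bool => !isset($this->stopWords[strtolower($word)])
        );

        return implode(' ', $kept);
    }
}

$handler = new StopWordsHandler(['the', 'a']);
$filtered = $handler->handle('The quick fox jumps over a log');
echo $filtered, PHP_EOL; // quick fox jumps over log
```

Only process needs to be implemented; the inherited handle method takes care of forwarding the result to the next handler.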
Error Handling
The library throws the following exceptions:
- NotFoundHandlerException: Thrown when attempting to process text with no registered handlers
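One way to handle this case is to catch the exception and fall back to the unmodified input. The sketch below is self-contained and rests on assumptions: the exception's namespace and parent class are not shown in this README, so both classes here are local stand-ins, and processOrOriginal is a hypothetical helper rather than a library function.

```php
<?php

declare(strict_types=1);

// Stand-in exception (assumption: the real class's namespace and parent
// class are not documented here; RuntimeException is used for illustration).
class NotFoundHandlerException extends RuntimeException
{
}

// Minimal stand-in for TextProcessor, reduced to the process() check
// shown in the Architecture section.
final class TextProcessor
{
    private ?object $handlersChain = null;

    public function process(string $text): string
    {
        if (!$this->handlersChain) {
            throw new NotFoundHandlerException();
        }

        return $this->handlersChain->handle($text);
    }
}

// Hypothetical helper: returns the input unchanged when no handlers exist.
function processOrOriginal(TextProcessor $processor, string $text): string
{
    try {
        return $processor->process($text);
    } catch (NotFoundHandlerException $e) {
        return $text;
    }
}

$result = processOrOriginal(new TextProcessor(), 'Your text here');
echo $result, PHP_EOL; // Your text here
```

Whether a silent fallback is appropriate depends on the application; logging or rethrowing may be a better fit when an empty chain indicates a configuration mistake.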
Testing
composer test
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License.