tenqz/lingua

A comprehensive PHP library for advanced text processing using the Chain of Responsibility pattern

pkg:composer/tenqz/lingua

v2.1.1 2025-05-11 17:42 UTC

This package is auto-updated.

Last update: 2025-10-11 18:31:54 UTC


README

Lingua v2.1.1

Lingua is a comprehensive PHP library designed for advanced text processing. It implements the Chain of Responsibility pattern to provide flexible and extensible text processing capabilities.

Note: This library is under active development. The API is not yet stable, and breaking changes may occur in future releases.

Features

  • Chain of Responsibility pattern for text processing
  • Flexible scenario-based processing
  • Easy to extend with custom handlers
  • PHP 8.1+ support

Available Handlers

The library provides several built-in text processing handlers:

SpecialCharsHandler

Removes special characters from text while preserving words and spaces. This handler:

  • Removes all special characters (punctuation marks, symbols, etc.)
  • Replaces newlines with spaces
  • Preserves spaces between words

Example:

use Tenqz\Lingua\Handlers\Basic\SpecialCharsHandler;

$handler = new SpecialCharsHandler();
$result = $handler->handle('Hello! @#$%^&*() World...'); // Returns: "Hello  World" (both original spaces remain)
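
The README does not show how this handler is implemented. As a rough sketch of the behavior listed above, assuming "special characters" means anything other than Unicode letters, digits, and spaces, a standalone function could look like the following. The name removeSpecialChars is hypothetical and not part of the library.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of tenqz/lingua: approximates the behavior
// described for SpecialCharsHandler.
function removeSpecialChars(string $text): string
{
    // Replace every line break sequence (\r\n, \n, \r, ...) with one space.
    $text = preg_replace('/\R/u', ' ', $text) ?? $text;

    // Drop everything that is not a Unicode letter, a digit, or a space.
    return preg_replace('/[^\p{L}\p{N} ]/u', '', $text) ?? $text;
}

echo removeSpecialChars('Hello! @#$%^&*() World...'), PHP_EOL; // Hello  World
```

The input is single-quoted so that PHP never attempts variable interpolation on the `$` character. Both spaces around the removed symbols survive, which is why a whitespace-normalizing step usually follows.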

NormalizeSpacesHandler

Normalizes whitespace in text. This handler:

  • Replaces multiple spaces, tabs, and newlines with a single space
  • Preserves leading and trailing whitespace

Example:

use Tenqz\Lingua\Handlers\Basic\NormalizeSpacesHandler;

$handler = new NormalizeSpacesHandler();
$result = $handler->handle("Hello    World\t\nTest"); // Returns: "Hello World Test"
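
A minimal sketch of the collapsing step, assuming it can be expressed as a single regular expression. The function name normalizeSpaces is hypothetical, and the library's own implementation may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of tenqz/lingua.
function normalizeSpaces(string $text): string
{
    // Collapse each run of whitespace (spaces, tabs, newlines) into one space.
    return preg_replace('/\s+/u', ' ', $text) ?? $text;
}

echo normalizeSpaces("Hello    World\t\nTest"), PHP_EOL; // Hello World Test
```

In this sketch a leading or trailing run of whitespace is reduced to a single space rather than removed, so trimming remains a separate step.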

TrimHandler

Removes whitespace from the beginning and end of text. This handler:

  • Removes spaces, tabs, and newlines from the beginning and end of text
  • Preserves whitespace between words

Example:

use Tenqz\Lingua\Handlers\Basic\TrimHandler;

$handler = new TrimHandler();
$result = $handler->handle("  Hello World  "); // Returns: "Hello World"

HtmlTagsHandler

Removes HTML tags from text while preserving the content between them. This handler:

  • Removes all HTML tags using PHP's strip_tags function
  • Preserves HTML entities
  • Normalizes multiple spaces into single spaces
  • Trims spaces from the beginning and end of text

Example:

use Tenqz\Lingua\Handlers\Basic\HtmlTagsHandler;

$handler = new HtmlTagsHandler();
$result = $handler->handle("<p>Hello</p> <div>World</div>"); // Returns: "Hello World"
$result = $handler->handle("<p>Hello &amp; World</p>"); // Returns: "Hello &amp; World"
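
The four steps listed above can be sketched with PHP built-ins. This is an illustration under the assumption that the handler applies strip_tags, then whitespace normalization, then trimming, in that order. The function name stripHtmlTags is hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of tenqz/lingua.
function stripHtmlTags(string $text): string
{
    // strip_tags() removes tags but leaves entities such as &amp; untouched.
    $text = strip_tags($text);

    // Collapse whitespace runs left behind by removed tags.
    $text = preg_replace('/\s+/u', ' ', $text) ?? $text;

    return trim($text);
}

echo stripHtmlTags('<p>Hello</p> <div>World</div>'), PHP_EOL; // Hello World
echo stripHtmlTags('<p>Hello &amp; World</p>'), PHP_EOL;      // Hello &amp; World
```

Under this sketch the space between the two elements in the first input is kept, because strip_tags only deletes the tags themselves.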

Installation

composer require tenqz/lingua

Basic Usage

use Tenqz\Lingua\Core\TextProcessor;
use Tenqz\Lingua\Core\AbstractTextHandler;

// Create a custom handler
class CustomHandler extends AbstractTextHandler
{
    protected function process(string $text): string
    {
        // Your text processing logic here
        return $text;
    }
}

// Initialize the processor
$processor = new TextProcessor();

// Add handlers to the chain
$processor->addHandler(new CustomHandler());

// Process text
$result = $processor->process('Your text here');

Chain of Handlers Example

You can combine multiple handlers to create a processing pipeline:

use Tenqz\Lingua\Core\TextProcessor;
use Tenqz\Lingua\Handlers\Basic\SpecialCharsHandler;
use Tenqz\Lingua\Handlers\Basic\NormalizeSpacesHandler;
use Tenqz\Lingua\Handlers\Basic\TrimHandler;

// Initialize the processor
$processor = new TextProcessor();

// Create a chain of handlers
$processor
    ->addHandler(new SpecialCharsHandler())    // First, remove special characters
    ->addHandler(new NormalizeSpacesHandler()) // Then, normalize spaces
    ->addHandler(new TrimHandler());           // Finally, trim whitespace

// Process text through the entire chain
$result = $processor->process('  Hello! @#$%^&*() World...  ');
// Result: "Hello World"
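
To see why the order of the three steps matters, it can help to trace the intermediate strings. The sketch below uses plain closures as stand-ins that approximate the built-in handlers; they are assumptions for illustration, not the library's code.

```php
<?php

declare(strict_types=1);

// Stand-in steps written as closures; they approximate the three built-in
// handlers and are not taken from tenqz/lingua.
$steps = [
    'special chars' => fn (string $t): string => preg_replace('/[^\p{L}\p{N} ]/u', '', $t) ?? $t,
    'normalize'     => fn (string $t): string => preg_replace('/\s+/u', ' ', $t) ?? $t,
    'trim'          => fn (string $t): string => trim($t),
];

$text = '  Hello! @#$%^&*() World...  ';
$trace = [];

// Feed the output of each step into the next one, recording every result.
foreach ($steps as $name => $step) {
    $text = $step($text);
    $trace[$name] = $text;
    printf("%-13s => \"%s\"\n", $name, $text);
}
```

Removing special characters first leaves doubled spaces behind, normalization collapses them, and trimming last removes the single spaces that remain at both ends.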

Architecture

The library consists of the following main components:

TextHandlerInterface

Interface that defines the contract for all text processing handlers:

interface TextHandlerInterface
{
    public function handle(string $text): string;
    public function setNext(TextHandlerInterface $handler): TextHandlerInterface;
    public function getNext(): ?TextHandlerInterface;
}

AbstractTextHandler

Abstract base class that implements the Chain of Responsibility pattern:

abstract class AbstractTextHandler implements TextHandlerInterface
{
    protected ?TextHandlerInterface $nextHandler = null;

    public function setNext(TextHandlerInterface $handler): TextHandlerInterface
    {
        $this->nextHandler = $handler;
        return $handler;
    }
    
    public function getNext(): ?TextHandlerInterface
    {
        return $this->nextHandler;
    }

    public function handle(string $text): string
    {
        $processedText = $this->process($text);
        
        if ($this->nextHandler !== null) {
            return $this->nextHandler->handle($processedText);
        }

        return $processedText;
    }

    abstract protected function process(string $text): string;
}
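
Because setNext returns the handler it was given, handlers can also be linked by hand without TextProcessor. The sketch below repeats the interface and base class as local stand-ins so that it runs without Composer. UppercaseHandler and ExclaimHandler are hypothetical handlers used only for this illustration.

```php
<?php

declare(strict_types=1);

// Local stand-ins mirroring the interface and base class shown in this README.
interface TextHandlerInterface
{
    public function handle(string $text): string;
    public function setNext(TextHandlerInterface $handler): TextHandlerInterface;
    public function getNext(): ?TextHandlerInterface;
}

abstract class AbstractTextHandler implements TextHandlerInterface
{
    protected ?TextHandlerInterface $nextHandler = null;

    public function setNext(TextHandlerInterface $handler): TextHandlerInterface
    {
        $this->nextHandler = $handler;
        return $handler; // returning the new link allows fluent chaining
    }

    public function getNext(): ?TextHandlerInterface
    {
        return $this->nextHandler;
    }

    public function handle(string $text): string
    {
        $processed = $this->process($text);
        return $this->nextHandler !== null
            ? $this->nextHandler->handle($processed)
            : $processed;
    }

    abstract protected function process(string $text): string;
}

// Hypothetical handlers for illustration only.
final class UppercaseHandler extends AbstractTextHandler
{
    protected function process(string $text): string
    {
        return strtoupper($text);
    }
}

final class ExclaimHandler extends AbstractTextHandler
{
    protected function process(string $text): string
    {
        return $text . '!';
    }
}

$first = new UppercaseHandler();
$last = $first->setNext(new ExclaimHandler())->setNext(new ExclaimHandler());

echo $first->handle('hello'), PHP_EOL; // HELLO!!
echo $last->handle('hello'), PHP_EOL;  // hello!
```

Note that the fluent call returns the last link, so processing must start from the first handler. Calling handle on the returned value skips every earlier step.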

TextProcessor

Facade class for managing text processing chains:

class TextProcessor implements TextProcessorInterface
{
    private ?TextHandlerInterface $handlersChain = null;

    public function addHandler(TextHandlerInterface $handler): self
    {
        if (!$this->handlersChain) {
            $this->handlersChain = $handler;
            return $this;
        }

        $lastHandler = $this->handlersChain;
        while ($lastHandler->getNext()) {
            $lastHandler = $lastHandler->getNext();
        }
        
        $lastHandler->setNext($handler);
        return $this;
    }

    public function process(string $text): string
    {
        if (!$this->handlersChain) {
            throw new NotFoundHandlerException();
        }

        return $this->handlersChain->handle($text);
    }
}

Creating Custom Handlers

To create a custom handler, extend the AbstractTextHandler class and implement the process method:

use Tenqz\Lingua\Core\AbstractTextHandler;

class MyCustomHandler extends AbstractTextHandler
{
    protected function process(string $text): string
    {
        // Your text processing logic here
        return $text;
    }
}
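
As a more concrete illustration, a handler can take configuration through its constructor. The sketch below uses a trimmed local stand-in for the base class so that it runs on its own; in a project the class would extend Tenqz\Lingua\Core\AbstractTextHandler instead. ReplaceHandler is a hypothetical name and not a handler shipped with the library.

```php
<?php

declare(strict_types=1);

// Trimmed stand-in for the library's AbstractTextHandler (assumption: only
// the parts needed for this example are reproduced here).
abstract class AbstractTextHandler
{
    private ?AbstractTextHandler $next = null;

    public function setNext(AbstractTextHandler $handler): AbstractTextHandler
    {
        $this->next = $handler;
        return $handler;
    }

    public function handle(string $text): string
    {
        $processed = $this->process($text);
        return $this->next !== null ? $this->next->handle($processed) : $processed;
    }

    abstract protected function process(string $text): string;
}

// Hypothetical handler: replaces fixed substrings using a lookup table.
final class ReplaceHandler extends AbstractTextHandler
{
    /** @param array<string, string> $replacements search => replacement */
    public function __construct(private array $replacements)
    {
    }

    protected function process(string $text): string
    {
        // strtr() with an array replaces each key with its value in one pass.
        return strtr($text, $this->replacements);
    }
}

$handler = new ReplaceHandler(['&' => 'and', '@' => 'at']);
echo $handler->handle('Salt & Pepper @ home'), PHP_EOL; // Salt and Pepper at home
```

Only process needs to be written. Forwarding to the next handler stays in the base class, so a custom handler never has to know where it sits in the chain.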

Error Handling

The library throws the following exceptions:

  • NotFoundHandlerException: Thrown when attempting to process text with no registered handlers
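
One way to guard against an empty chain is to catch the exception and fall back to the unprocessed text. The sketch below is self-contained and therefore uses simplified stand-ins: the stand-in TextProcessor accepts plain callables instead of TextHandlerInterface objects, and the exception's parent class (RuntimeException here) is an assumption, since the README does not state it or its namespace. The helper processOrPassThrough is hypothetical.

```php
<?php

declare(strict_types=1);

// Stand-in exception; the real class's namespace and parent are not shown
// in this README.
final class NotFoundHandlerException extends RuntimeException
{
}

// Simplified stand-in processor that accepts callables for brevity.
final class TextProcessor
{
    /** @var array<int, callable> */
    private array $handlers = [];

    public function addHandler(callable $handler): self
    {
        $this->handlers[] = $handler;
        return $this;
    }

    public function process(string $text): string
    {
        if ($this->handlers === []) {
            throw new NotFoundHandlerException('No handlers registered.');
        }

        foreach ($this->handlers as $handler) {
            $text = $handler($text);
        }

        return $text;
    }
}

// Hypothetical helper: returns the input unchanged when the chain is empty.
function processOrPassThrough(TextProcessor $processor, string $text): string
{
    try {
        return $processor->process($text);
    } catch (NotFoundHandlerException $e) {
        return $text;
    }
}

echo processOrPassThrough(new TextProcessor(), ' raw '), '|', PHP_EOL;
echo processOrPassThrough((new TextProcessor())->addHandler('trim'), ' raw '), '|', PHP_EOL;
```

Whether passing the text through unchanged is appropriate depends on the application. Where an empty chain indicates a configuration mistake, letting the exception propagate may be the better choice.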

Testing

composer test

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License.