techecosystem / parsify-php
A PHP library for Persian text conversion, including number translation, diacritics removal, and normalization with a fluent API.
Requires
- php: >=8.2
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.67
- phpunit/phpunit: ^11.5
README
The Persian Converter Library handles Persian text conversion: removing diacritics, converting digits between English (0-9) and Persian (۰-۹), and applying other text normalizations. It exposes these operations through a fluent, easy-to-use interface.
Features
- Convert English numbers to Persian and vice versa.
- Remove diacritics from Persian text.
- Highly customizable through a fluent builder pattern.
- Lightweight and efficient.
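The digit conversion listed above amounts to a character-for-character mapping. Here is a minimal sketch of that idea, assuming plain `str_replace` lookups. `toPersianDigits` and `toEnglishDigits` are hypothetical helpers written for illustration, not functions the library exports, and the library may implement conversion differently.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration only; not part of the library's API.
const ENGLISH_DIGITS = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
const PERSIAN_DIGITS = ['۰', '۱', '۲', '۳', '۴', '۵', '۶', '۷', '۸', '۹'];

function toPersianDigits(string $input): string
{
    // Each ASCII digit is replaced by the Persian digit at the same index.
    return str_replace(ENGLISH_DIGITS, PERSIAN_DIGITS, $input);
}

function toEnglishDigits(string $input): string
{
    // The reverse mapping: Persian digits (U+06F0..U+06F9) back to ASCII.
    return str_replace(PERSIAN_DIGITS, ENGLISH_DIGITS, $input);
}

echo toPersianDigits('متن فارسی 123'), PHP_EOL; // متن فارسی ۱۲۳
echo toEnglishDigits('۱۲۳'), PHP_EOL;           // 123
```

Because the two digit sets do not overlap, the sequential replacement done by `str_replace` cannot cascade, and converting in one direction and back returns the original string.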
Installation
To install the library, you can use Composer:
composer require techecosystem/parsify-php
API Documentation
PersianConverterBuilder Class
The PersianConverterBuilder class provides a fluent interface to create a PersianConverter with customizable strategies for text normalization and number conversion.
Methods
- static create(): self
  Creates and returns a new instance of the builder.
  Example:
  $builder = PersianConverterBuilder::create();
- withNumberConversion(bool $keepEnglishNumbers = false): self
- withTextNormalization(bool $keepPersianDiacritic = true): self
- build(): PersianConverter
  Builds and returns a PersianConverter instance with the configured strategies.
  Throws: MissingStrategyException if no strategies are enabled, i.e., both text normalization and number conversion are disabled.
  Example:
  $converter = $builder->build();
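To make the builder behaviour described above concrete, here is a rough, self-contained sketch of the pattern: strategies are collected through chained calls, and `build()` refuses to produce a converter when none were enabled. `SketchConverter` and `SketchConverterBuilder` are hypothetical names, the strategies are simplified stand-ins, and `LogicException` stands in for `MissingStrategyException`. None of this is taken from the library's source.

```php
<?php

declare(strict_types=1);

// Illustrative sketch only: these classes are hypothetical, not the library's.
final class SketchConverter
{
    /** @var list<callable(string): string> */
    private array $strategies;

    /** @param list<callable(string): string> $strategies */
    public function __construct(array $strategies)
    {
        $this->strategies = $strategies;
    }

    public function convert(string $input): string
    {
        // Apply each strategy in the order it was registered.
        foreach ($this->strategies as $strategy) {
            $input = $strategy($input);
        }

        return $input;
    }
}

final class SketchConverterBuilder
{
    /** @var list<callable(string): string> */
    private array $strategies = [];

    public static function create(): self
    {
        return new self();
    }

    public function withNumberConversion(): self
    {
        // Simplified strategy: ASCII digits to Persian digits.
        $this->strategies[] = static fn (string $s): string => str_replace(
            ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
            ['۰', '۱', '۲', '۳', '۴', '۵', '۶', '۷', '۸', '۹'],
            $s
        );

        return $this; // Returning $this is what allows chaining.
    }

    public function build(): SketchConverter
    {
        if ($this->strategies === []) {
            // Stand-in for the MissingStrategyException described above.
            throw new LogicException('At least one strategy must be enabled.');
        }

        return new SketchConverter($this->strategies);
    }
}

$converter = SketchConverterBuilder::create()
    ->withNumberConversion()
    ->build();

echo $converter->convert('متن فارسی 123'), PHP_EOL; // متن فارسی ۱۲۳
```

Failing in `build()` rather than in `convert()` surfaces a misconfigured builder at construction time, before any text is processed.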
PersianConverter Class
The PersianConverter class applies various conversion strategies to normalize Persian text and convert numbers.
Methods
- static createDefault(): self
  Creates a default PersianConverter with text normalization and Persian number conversion enabled.
  Example:
  $converter = PersianConverter::createDefault();
- convert(string $input): string
  Applies all strategies to the input string and returns the converted text.
  Throws: TextConversionException on conversion failure.
  Example:
  $convertedText = $converter->convert("متن فارسی 123");
PersianTextService Class
The PersianTextService class provides utility methods for text normalization using PersianConverter with various configurations.
Methods
- static normalize(string $input): string
  Normalizes the text with default settings (text normalization and Persian number conversion).
  Example:
  $normalizedText = PersianTextService::normalize("متن فارسی 123");
- static normalizeTextWithEnglishNumbers(string $input): string
  Normalizes the text while keeping English numbers.
  Example:
  $normalizedText = PersianTextService::normalizeTextWithEnglishNumbers("متن فارسی 123");
- static normalizeTextWithoutNumbers(string $input): string
  Normalizes the text without converting numbers.
  Example:
  $normalizedText = PersianTextService::normalizeTextWithoutNumbers("متن فارسی 123");
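What "normalization" covers is not spelled out here. Judging by the sample outputs in the Examples section, it includes replacing Arabic letter variants with their Persian counterparts. The sketch below shows only that one step, assuming a plain character map. `arabicToPersianLetters` is a hypothetical helper, and the library's actual rules (for instance its handling of presentation forms) may be broader.

```php
<?php

declare(strict_types=1);

// Hypothetical helper for illustration; not the library's implementation.
function arabicToPersianLetters(string $input): string
{
    return strtr($input, [
        "\u{064A}" => "\u{06CC}", // ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
        "\u{0643}" => "\u{06A9}", // ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    ]);
}

// Arabic yeh + kaf become Persian ye + ke.
echo arabicToPersianLetters("\u{064A}\u{0643}"), PHP_EOL;
```

Text that contains neither code point passes through unchanged, so the step is safe to run on already-normalized input.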
Examples
- Simple Text Normalization
  echo PersianTextService::normalize("𞸮ﻼم 123 يوﺱ𞺰"); // Outputs: "سلام ۱۲۳ یوسف"
- Normalization with English Numbers
  echo PersianTextService::normalizeTextWithEnglishNumbers("𞸮ﻼم 123 يوﺱ𞺰"); // Outputs: "سلام 123 یوسف"
- Normalization without Number Conversion
  echo PersianTextService::normalizeTextWithoutNumbers("𞸮ﻼم 123 يوﺱ𞺰"); // Outputs: "سلام 123 یوسف"
- Custom Converter: Combining Multiple Conversions
  The library allows chaining multiple conversion rules in one go:
  $converter = PersianConverterBuilder::create()
      ->withTextNormalization(false)
      ->withNumberConversion()
      ->build();
  echo $converter->convert("𞸮ﻼم 123 يوﺱ𞺰"); // Outputs: "سلام ۱۲۳ یوسف"
- Removing Diacritics
  You can also remove diacritics from Persian text using the following approach:
  $converter = PersianConverterBuilder::create()
      ->withTextNormalization(keepPersianDiacritic: false)
      ->build();
  $input = "حَتماً";
  echo $converter->convert($input); // Outputs: حتما
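For readers curious what diacritic removal involves at the character level, here is a minimal sketch that assumes the marks to strip are the Arabic harakat in the range U+064B to U+0652. `stripArabicDiacritics` is a hypothetical helper, not a library function, and the library may cover a different set of marks.

```php
<?php

declare(strict_types=1);

// Hypothetical helper for illustration; not part of the library's API.
function stripArabicDiacritics(string $input): string
{
    // U+064B..U+0652: fathatan, dammatan, kasratan, fatha, damma, kasra,
    // shadda, sukun. The /u flag makes the pattern operate on UTF-8 code points.
    $result = preg_replace('/[\x{064B}-\x{0652}]/u', '', $input);

    // preg_replace returns null on failure (e.g. malformed UTF-8);
    // in that case return the input untouched.
    return $result ?? $input;
}

// "حَتماً" written with explicit code points: fatha after the first letter,
// fathatan at the end.
$input = "\u{062D}\u{064E}\u{062A}\u{0645}\u{0627}\u{064B}";

echo stripArabicDiacritics($input), PHP_EOL; // حتما
```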
Contributing
Please see CONTRIBUTING for details.
License
This library is open-source software licensed under the MIT license.