becklyn/search-text-transformer

A library that extracts plain text from HTML for usage in search engines (like Elasticsearch)

2.0.0 2022-12-12 10:40 UTC

Last update: 2024-12-12 14:49:22 UTC


README

Transforms HTML into searchable plain text for use with a search engine (like Elasticsearch).

Installation

Install via Composer:

composer require becklyn/search-text-transformer

Usage

<?php

use Becklyn\SearchText\SearchTextTransformer;

// Create the transformer and convert an HTML fragment to plain text.
$transformer = new SearchTextTransformer();
$plain = $transformer->transform("<p>Some HTML content</p>");

Testing

All test cases belong in tests/fixtures and must use the file extension .test.

The test format is:

--TEST--
Here is a plain text description of this test.
--HTML--
<p>Some html.</p>
--EXPECT--
The expected result.

The --TEST-- segment is optional.
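
To illustrate the fixture format described above, here is a minimal sketch of how such a file could be split into its segments. The function name parseFixture is hypothetical and not part of this library, and the library's own test suite may read fixtures differently. The sketch also assumes that leading and trailing whitespace in each segment is insignificant.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): splits the contents of a
 * .test fixture into its named segments.
 *
 * @return array<string, string> lowercase segment name => trimmed content
 */
function parseFixture(string $contents): array
{
    // Split on marker lines such as "--HTML--" and keep the marker names.
    $parts = preg_split(
        '/^--([A-Z]+)--[ \t]*\R?/m',
        $contents,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    ) ?: [];

    $segments = [];

    // $parts[0] is whatever precedes the first marker. After that, the
    // entries alternate between a marker name and the segment body.
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        // Assumption: surrounding whitespace is not significant.
        $segments[strtolower($parts[$i])] = trim($parts[$i + 1]);
    }

    // --HTML-- and --EXPECT-- are required, --TEST-- is optional.
    foreach (['html', 'expect'] as $required) {
        if (!array_key_exists($required, $segments)) {
            throw new InvalidArgumentException(
                'Missing --' . strtoupper($required) . '-- segment.'
            );
        }
    }

    return $segments;
}
```

A test runner could then pass the html segment to SearchTextTransformer::transform() and compare the result with the expect segment, using the optional test segment as the failure message.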