eZ Platform bundle for indexing the content of binary files
In eZ Publish, it was possible to use third-party binaries to index binary files. This functionality is missing in the latest eZ Platform versions, and this bundle restores it.
It also provides an example binary extractor for PDF files, which uses the third-party pdftotext binary.
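Because the PDF extractor depends on pdftotext, it can help to confirm that the binary is on the PATH before indexing. The following is a minimal sketch: the helper name check_binary and the sample file path are hypothetical and not part of this bundle.

```shell
# Hypothetical helper (not shipped with the bundle): reports whether a binary is on the PATH.
check_binary() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Warn early if the PDF extractor's dependency is not installed.
check_binary pdftotext || echo "Install pdftotext before reindexing PDF files"

# Optional manual check against a sample file (the path is an assumption);
# passing "-" as the output file makes pdftotext write the text to stdout:
# pdftotext /path/to/sample.pdf - | head
```

If the helper reports the binary as missing, installing it through the system package manager (it commonly ships with poppler-utils on Linux distributions) should be enough.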
composer require contextualcode/ezplatform-search-binary-extractor
First, double-check that the "Searchable" checkbox is checked for the binary file field types that need to be searchable.
Once the bundle is installed, the content of all PDF files will be indexed. You then need to rebuild the search index by running:
php bin/console ezplatform:reindex
It is also possible to build your own custom binary extractors by following a few simple steps:
Tag your service with