ottosmops / pdftotext
Extract text from PDF
Requires
- php: >= 7.2
- symfony/process: >=4.2
Requires (Dev)
- phpunit/phpunit: >=8.2
README
This package provides a class to extract text from a PDF.
For PHP 5.6, use version 1.0.3.
\Ottosmops\Pdftotext\Extract::getText('/path/to/file.pdf') //returns the text from the pdf
Requirements
The package uses the pdftotext command-line tool. Make sure that it is installed: which pdftotext
For installation instructions, see: poppler-utils
If the installed binary is not found (error message: The command "which pdftotext" failed.), you can either pass the full path to the constructor (see below) or extend PATH with the directory where pdftotext lives before you instantiate the Extract class, for example: putenv('PATH=' . getenv('PATH') . ':/usr/local/bin:/usr/bin')
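The PATH adjustment described above can be wrapped in a small helper so that directories are appended only once and the existing PATH is preserved. This is a minimal sketch using only PHP built-ins; the function name addToPath is hypothetical and not part of this package, and the directories are examples that should be replaced with the location of pdftotext on your system.

```php
<?php
// Hypothetical helper (not part of ottosmops/pdftotext): append directories
// to the PATH of the current PHP process without duplicating entries.
function addToPath(array $dirs): string
{
    // getenv() returns false when the variable is unset.
    $current = getenv('PATH') ?: '';
    $parts = $current === '' ? [] : explode(PATH_SEPARATOR, $current);

    foreach ($dirs as $dir) {
        if ($dir !== '' && !in_array($dir, $parts, true)) {
            $parts[] = $dir;
        }
    }

    $path = implode(PATH_SEPARATOR, $parts);
    putenv('PATH=' . $path);

    return $path;
}

// Example directories; adjust to where pdftotext is installed.
addToPath(['/usr/local/bin', '/usr/bin']);
```

Call the helper before creating the Extract object, so that the lookup of pdftotext sees the extended PATH.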
Installation
composer require ottosmops/pdftotext
Usage
Extracting text from a pdf:
$text = (new Extract())->pdf('file.pdf')->text();
You can also set the path to the binary and specify options:
$text = (new Extract('/path/to/pdftotext'))->pdf('path/to/file.pdf')->options('-layout')->text();
Default options are: -eol unix -enc UTF-8 -raw
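When several pdftotext flags are needed, it can help to assemble the option string from an array. The sketch below is an illustration only: buildPdftotextOptions is a hypothetical helper, not part of this package, and it assumes that options() accepts a single string containing multiple flags, as the default option string suggests. The flags used (-layout, -enc, -f, -l) are standard pdftotext options. Values are not shell-escaped here, so pass trusted constants only.

```php
<?php
// Hypothetical helper (not part of ottosmops/pdftotext): build one option
// string from an array of pdftotext flags.
function buildPdftotextOptions(array $flags): string
{
    $parts = [];
    foreach ($flags as $name => $value) {
        if (is_int($name)) {
            // Bare flag, e.g. 'layout' becomes '-layout'.
            $parts[] = '-' . ltrim((string) $value, '-');
        } else {
            // Flag with a value, e.g. 'f' => 1 becomes '-f 1'.
            $parts[] = '-' . ltrim($name, '-') . ' ' . $value;
        }
    }

    return implode(' ', $parts);
}

// Keep the layout, force UTF-8 output, and read pages 1 to 3 only.
$options = buildPdftotextOptions(['layout', 'enc' => 'UTF-8', 'f' => 1, 'l' => 3]);
```

The resulting string can then be passed along the same chain as above, for example (new Extract())->pdf('file.pdf')->options($options)->text(). Whether custom options replace or extend the defaults is not stated here, so include -enc and -eol explicitly if the output depends on them.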
License
The MIT License (MIT). Please see License File for more information.