tiefan / google-pdf-scraper
A PHP library to filter PDF documents in Google Drive for Daniel Fischl
Installs: 37
pkg:composer/tiefan/google-pdf-scraper
Requires
- php: ^7.2
- ext-json: *
- ext-mysqli: *
- google/apiclient: ^2.0
- smalot/pdfparser: ^0.13.2
README
This is a PHP library to filter PDF documents in Google Drive for Daniel Fischl.
To import it into your project, use Composer.
composer require tiefan/google-pdf-scraper
Extract text from PDF document
$text = PdfScraper::textFromDriveId(string $fileId);
$text = PdfScraper::textFromDriveUrl(string $url);
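The URL variant presumably has to resolve a Drive link to a file ID before the PDF can be fetched. The README does not show how the library does this, so the following is only a minimal sketch of one way to do it. The function name `extractDriveFileId` and the two URL shapes it handles are assumptions for illustration, not part of the package.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of tiefan/google-pdf-scraper):
 * pull the file ID out of a Google Drive URL.
 * Returns null when no ID-like segment is found.
 */
function extractDriveFileId(string $url): ?string
{
    // Assumed form 1: https://drive.google.com/file/d/<ID>/view
    if (preg_match('~/d/([A-Za-z0-9_-]+)~', $url, $m) === 1) {
        return $m[1];
    }

    // Assumed form 2: https://drive.google.com/open?id=<ID>
    $query = parse_url($url, PHP_URL_QUERY);
    if (is_string($query)) {
        parse_str($query, $params);
        if (isset($params['id']) && is_string($params['id']) && $params['id'] !== '') {
            return $params['id'];
        }
    }

    return null;
}
```

With an ID in hand, the call would be equivalent to passing that ID to `PdfScraper::textFromDriveId`.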
Check a Document with "Begin" and "End" Keywords
$isThatDocument = PdfScraper::checkKeywordsFromDriveId(string $fileId, string $begin, string $end = null);
$isThatDocument = PdfScraper::checkKeywordsFromDriveUrl(string $url, string $begin, string $end = null);
$scraper = new PdfScraper($doc, $isURL = true); // $isURL: true for url, false for id
$isThatDocument = $scraper->checkKeywords(string $begin, string $end = null);
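The README does not spell out what a "begin"/"end" match means. One plausible reading is that the extracted text must contain the begin keyword and, when an end keyword is given, the end keyword must appear somewhere after it. The sketch below implements that reading on plain text. The function name `textHasKeywords`, the case-insensitive comparison, and the ordering rule are all assumptions, so the library's actual behavior may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical keyword check (not the library's implementation).
 * Returns true when $begin occurs in $text and, if $end is given,
 * $end occurs after the first occurrence of $begin.
 * Matching is case-insensitive by assumption.
 */
function textHasKeywords(string $text, string $begin, ?string $end = null): bool
{
    // An empty begin keyword cannot identify a document.
    if ($begin === '') {
        return false;
    }

    $beginPos = stripos($text, $begin);
    if ($beginPos === false) {
        return false;
    }

    // No end keyword requested: the begin match is enough.
    if ($end === null || $end === '') {
        return true;
    }

    // Search for the end keyword only in the text after the begin keyword.
    return stripos($text, $end, $beginPos + strlen($begin)) !== false;
}
```

A function like this could be applied to the string returned by `PdfScraper::textFromDriveId` to reproduce the check outside the library, for example in unit tests that should not call Google Drive.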
Using MySQL or MariaDB to process documents in one batch
The following code uses the database schema in Sample\db_pdf_scraper.sql
$pdfDB = new PdfDB($host, $username, $password, $database);
$processed_count = $pdfDB->checkPdfs();