tiefan/google-pdf-scraper

A PHP library to filter PDF documents in Google Drive for Daniel Fischl

v0.17 2019-01-13 15:22 UTC

Last update: 2024-04-14 02:19:06 UTC


README

This is a PHP library to filter PDF documents in Google Drive for Daniel Fischl.

To add it to your project, use Composer.

composer require tiefan/google-pdf-scraper

Extract text from a PDF document

// $fileId: the Google Drive file ID (string)
$text = PdfScraper::textFromDriveId($fileId);
// $url: a Google Drive URL of the file (string)
$text = PdfScraper::textFromDriveUrl($url);
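
The two entry points above differ only in how the file is identified. As a rough illustration of how a Drive URL relates to a file ID, here is a minimal sketch. The function `extractDriveFileId` is hypothetical and is not part of this library; it assumes the two common URL shapes `https://drive.google.com/file/d/<ID>/view` and `https://drive.google.com/open?id=<ID>`, and the library's own URL handling may differ.

```php
<?php
// Hypothetical helper, not part of tiefan/google-pdf-scraper.
// Returns the file ID found in a Google Drive URL, or null if none is found.
function extractDriveFileId(string $url): ?string
{
    // Form 1: https://drive.google.com/file/d/<ID>/view
    if (preg_match('#/file/d/([A-Za-z0-9_-]+)#', $url, $m) === 1) {
        return $m[1];
    }

    // Form 2: https://drive.google.com/open?id=<ID>
    $query = parse_url($url, PHP_URL_QUERY);
    if (is_string($query)) {
        parse_str($query, $params);
        if (isset($params['id']) && is_string($params['id']) && $params['id'] !== '') {
            return $params['id'];
        }
    }

    return null;
}

// The example ID below is made up for illustration.
$id = extractDriveFileId('https://drive.google.com/file/d/1AbC_dEf-123/view?usp=sharing');
```

If the helper returns an ID, it could be passed to `PdfScraper::textFromDriveId($id)`; a `null` result means the URL matched neither assumed shape.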

Check a document with "begin" and "end" keywords

// $begin and $end are strings; $end is optional and defaults to null
$isThatDocument = PdfScraper::checkKeywordsFromDriveId($fileId, $begin, $end);
$isThatDocument = PdfScraper::checkKeywordsFromDriveUrl($url, $begin, $end);

// Instance form: the second constructor argument is true when $doc is a URL, false when it is a file ID
$scraper = new PdfScraper($doc, true);
$isThatDocument = $scraper->checkKeywords($begin, $end);
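
The README does not spell out how the keywords are matched against the extracted text. The sketch below shows one plausible reading, purely as an assumption: a document matches when the begin keyword occurs in the text and, if an end keyword is given, it occurs somewhere after the begin keyword. The function `containsKeywordsInOrder` is hypothetical and is not the library's implementation.

```php
<?php
// Hypothetical sketch; the library's real matching rule may differ.
// Assumption: a document matches when $begin occurs in the text and,
// if $end is given, $end occurs somewhere after $begin.
function containsKeywordsInOrder(string $text, string $begin, ?string $end = null): bool
{
    $beginPos = strpos($text, $begin);
    if ($beginPos === false) {
        return false;
    }
    if ($end === null) {
        return true;
    }
    // Search for $end only after the end of the first $begin match.
    return strpos($text, $end, $beginPos + strlen($begin)) !== false;
}

// Sample text invented for illustration.
$matches = containsKeywordsInOrder('Invoice 42 ... Total due', 'Invoice', 'Total');
```

Under this reading, text obtained from `PdfScraper::textFromDriveId` could be checked locally against several keyword pairs without fetching the file again.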

Use MySQL or MariaDB to process documents in bulk

The following code uses the database schema in Sample\db_pdf_scraper.sql.

$pdfDB = new PdfDB($host, $username, $password, $database);
$processed_count = $pdfDB->checkPdfs();