cviniciussdias / google-crawler
A simple Crawler for getting Google results
Requires
- php: >=7.2
- ext-dom: *
- ext-ds: *
- guzzlehttp/guzzle: ^6.3
- symfony/dom-crawler: ^4.2
Requires (Dev)
- infection/infection: ^0.13.0
- nunomaduro/phpinsights: ^1.0
- phan/phan: ^1.3
- phpstan/phpstan: ^0.11.6
- phpstan/phpstan-phpunit: ^0.11.1
- phpunit/phpunit: ^8.0
- sebastian/phpcpd: ^4.1
- squizlabs/php_codesniffer: ^3.0
Suggests
- php-ds/php-ds: Allow IDE autocomplete for editing the component
README
A simple Crawler for getting Google results.
This component can be used to retrieve the first 100 results for a search term.
Because Google detects crawlers and blocks the IP address when several requests are made, this component can route its requests through online proxy services, such as hide.me.
Installation
Install the latest version with
$ composer require cviniciussdias/google-crawler
Usage
Crawler class constructor prototype
CViniciusSDias\GoogleCrawler\Crawler::__construct(
    SearchTermInterface $searchTerm,
    GoogleProxyInterface $proxy = null,
    string $googleDomain = 'google.com',
    string $countryCode = ''
)
Parameters
- $searchTerm: Term that will be searched on Google
- $proxy: Online proxy service that will be used to access Google [optional]
- $googleDomain: Your country-specific Google domain, like google.de, google.com.br, etc. [optional]
- $countryCode: Country code that will be added to the gl parameter of Google's URL, indicating the location of the search. E.g. 'BR', 'US', 'DE' [optional]
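To make the effect of the two optional string arguments concrete, here is a minimal sketch of how a domain and a country code could end up in a search URL. The `buildSearchUrl` helper is hypothetical and is not part of this component; the `num=100` parameter and the exact URL layout are assumptions made for illustration, and the component's real request may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, NOT the component's API: shows how $googleDomain and
// $countryCode could map onto a Google search URL.
function buildSearchUrl(
    string $term,
    string $googleDomain = 'google.com',
    string $countryCode = ''
): string {
    // Assumption: "q" carries the term and "num" asks for 100 results.
    $query = ['q' => $term, 'num' => 100];

    // The gl parameter is only added when a country code was given.
    if ($countryCode !== '') {
        $query['gl'] = $countryCode;
    }

    return "https://{$googleDomain}/search?" . http_build_query($query);
}

echo buildSearchUrl('Test', 'google.com.br', 'BR'), PHP_EOL;
// prints https://google.com.br/search?q=Test&num=100&gl=BR
```

With the component itself, the same values would be passed as the third and fourth constructor arguments shown in the prototype.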
Examples
Without proxy
<?php
use CViniciusSDias\GoogleCrawler\{ Crawler, SearchTerm };

$searchTerm = new SearchTerm('Test');
$crawler = new Crawler($searchTerm); // or new Crawler($searchTerm, new NoProxy());
$resultList = $crawler->getResults();
With some proxy
<?php
use CViniciusSDias\GoogleCrawler\{ Crawler, SearchTerm, Proxy\CommonProxy };

$searchTerm = new SearchTerm('Test');
$commonProxy = new CommonProxy('https://us.hideproxy.me/includes/process.php?action=update');
$crawler = new Crawler($searchTerm, $commonProxy);
$resultList = $crawler->getResults();
More details on proxies
For details on which proxies are currently supported, see the files inside the tests/Functional folder.
All the available proxies appear there.
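Since a single proxy may stop working or get blocked, one possible pattern is to try several options in order and keep the first one that succeeds. The sketch below is generic and self-contained: `firstSuccessful` is a hypothetical helper, the closures are stand-ins, and it assumes that a failed attempt surfaces as an exception, which this README does not state about the component.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs each attempt in order and returns the result of
 * the first one that does not throw.
 *
 * @param callable[] $attempts
 * @return mixed
 */
function firstSuccessful(array $attempts)
{
    $lastError = null;

    foreach ($attempts as $attempt) {
        try {
            return $attempt();
        } catch (Throwable $e) {
            // Remember the failure and move on to the next option.
            $lastError = $e;
        }
    }

    throw new RuntimeException('All attempts failed', 0, $lastError);
}

// Stand-in closures for demonstration. In real use, each one would build a
// Crawler with a different proxy and return $crawler->getResults().
$results = firstSuccessful([
    function () { throw new RuntimeException('blocked'); },
    function () { return ['first result']; },
]);

echo count($results), PHP_EOL;
```

Keeping the attempts as closures means the retry logic does not need to know which proxy classes exist.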
Iterating over results
foreach ($resultList as $result) {
    $title = $result->getTitle();
    $url = $result->getUrl();
    $description = $result->getDescription();
}
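If the results need to be exported or logged, the loop above can be wrapped in a small function that flattens each result into a plain array. This is only a sketch: `resultsToRows` is a hypothetical helper, and the anonymous class is a stand-in so the snippet runs without the component installed. Only the three getters shown above are taken from this README.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: turns any iterable of result objects into plain arrays.
function resultsToRows(iterable $resultList): array
{
    $rows = [];

    foreach ($resultList as $result) {
        $rows[] = [
            'title' => $result->getTitle(),
            'url' => $result->getUrl(),
            'description' => $result->getDescription(),
        ];
    }

    return $rows;
}

// Stand-in object for demonstration only; real results come from
// $crawler->getResults().
$fakeResult = new class {
    public function getTitle(): string { return 'Example Domain'; }
    public function getUrl(): string { return 'https://example.com/'; }
    public function getDescription(): string { return 'An illustrative entry.'; }
};

$rows = resultsToRows([$fakeResult]);
echo json_encode($rows, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

In real use, `$resultList` from `$crawler->getResults()` would be passed in place of the stand-in array.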
About
Requirements
- This component works with PHP 7.2 or above
- This component requires the ds extension (ext-ds, from the php-ds project) to be installed
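A quick way to confirm these requirements on a given machine is a small pre-flight script. The following is a sketch, not something shipped with the component: `missingRequirements` is a hypothetical name, and the extension list (dom and ds) is taken from the package requirements listed at the top of this page.

```php
<?php
declare(strict_types=1);

// Hypothetical pre-flight check, not part of the component.
function missingRequirements(): array
{
    $missing = [];

    if (version_compare(PHP_VERSION, '7.2.0', '<')) {
        $missing[] = 'PHP >= 7.2 (running ' . PHP_VERSION . ')';
    }

    // Extensions listed under "Requires" for this package.
    foreach (['dom', 'ds'] as $extension) {
        if (!extension_loaded($extension)) {
            $missing[] = "ext-{$extension}";
        }
    }

    return $missing;
}

$missing = missingRequirements();
echo ($missing === [])
    ? 'All requirements met'
    : ('Missing: ' . implode(', ', $missing)), PHP_EOL;
```

Composer performs equivalent platform checks during installation, so this is mainly useful for diagnosing a deployment target.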
Author
Vinicius Dias (ZCE) - carlosv775@gmail.com - https://github.com/CViniciusSDias/ - http://www.zend.com/en/yellow-pages/ZEND030134
License
This component is licensed under the GPL v3.0 License - see the LICENSE file for details.