denis-kisel / casper-curl
A PhantomJS-based cURL for fetching content from difficult sites
Requires
- php: >=7.2
- ext-json: *
- rap2hpoutre/similar-text-finder: ^1.1
README
Based on the casperjs / phantomjs libraries, for fetching content from difficult sites.
Installation
1. Install casperjs and phantomjs globally
npm install -g casperjs
npm install -g phantomjs
# If phantomjs fails with errors, reinstall it without install scripts
npm install -g phantomjs --ignore-scripts
2. Install the CasperCURL package
composer require denis-kisel/casper-curl
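Once both steps are done, it can help to confirm that the casperjs and phantomjs binaries are reachable from PHP. The sketch below is not part of the package: `findBinary` is a hypothetical helper that scans the directories in `PATH` for an executable file. It assumes a Unix-like system; the `.cmd` shims that npm creates on Windows are not handled.

```php
<?php

/**
 * Hypothetical helper (not part of casper-curl): look for an executable
 * named $name in the directories of $path (defaults to the PATH variable).
 * Returns the full path, or null when nothing is found.
 */
function findBinary(string $name, ?string $path = null): ?string
{
    $path = $path ?? (getenv('PATH') ?: '');
    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Report where each binary was found, if at all.
foreach (['casperjs', 'phantomjs'] as $binary) {
    $found = findBinary($binary);
    echo $binary . ': ' . ($found ?? 'not found in PATH') . PHP_EOL;
}
```

If either binary is reported as missing, repeat step 1 or check that the global npm bin directory is on `PATH`.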
Publish Configuration File (If Using Laravel)
If you use another framework or plain PHP, skip this step.
php artisan vendor:publish --provider="DenisKisel\CasperCURL\ServiceProvider" --tag="config"
Usage
Simple example
// Returns the page content
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')->request();
Set Method
method($method)
Methods available: GET|POST|PUT|DELETE
GET is used by default.
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->method('POST')
    ->request();
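Whether `method()` validates or normalizes its argument is not stated here, so a caller may want to check the value first. The following is a minimal sketch under that assumption; `normalizeMethod` is a hypothetical helper, not part of the package, and it only knows the four methods listed above.

```php
<?php

/**
 * Hypothetical helper (not part of casper-curl): upper-case an HTTP method
 * and check it against the methods listed in this README.
 */
function normalizeMethod(string $method): string
{
    $allowed = ['GET', 'POST', 'PUT', 'DELETE'];
    $normalized = strtoupper(trim($method));
    if (!in_array($normalized, $allowed, true)) {
        throw new InvalidArgumentException(sprintf(
            'Unsupported method "%s"; expected one of: %s',
            $method,
            implode('|', $allowed)
        ));
    }
    return $normalized;
}

echo normalizeMethod('post') . PHP_EOL; // prints "POST"
```

The result can then be passed on, for example `->method(normalizeMethod($input))`.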
Set Data
withData($arrayData)
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->withData([
        'login' => '***',
        'pass' => '***',
    ])
    ->method('POST')
    ->request();
Set Headers
withHeaders($arrayHeaders)
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->withHeaders([
        'User-Agent' => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0',
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    ])
    ->request();
Set UserAgent
userAgent($userAgent)
Default: Mozilla/5.0 (Windows NT 10.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->userAgent('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0')
    ->request();
Use Proxy
withProxy($ip, $port [, $method = 'http'] [, $login = null] [, $pass = null])
Methods available: http|socks5|none
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->withProxy($ip, $port)
    ->request();
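Proxies are often stored as a single URL-style string. As an illustration only, the sketch below splits such a string into values in the same order as the `withProxy` signature above. `proxyArgs` is a hypothetical helper, not part of the package, and the address and credentials in the usage line are made up.

```php
<?php

/**
 * Hypothetical helper (not part of casper-curl): split a proxy URL into
 * [$ip, $port, $method, $login, $pass], matching the withProxy() signature.
 */
function proxyArgs(string $proxyUrl): array
{
    $parts = parse_url($proxyUrl);
    if ($parts === false || !isset($parts['host'], $parts['port'])) {
        throw new InvalidArgumentException('Proxy URL must include a host and a port: ' . $proxyUrl);
    }

    // The URL scheme doubles as the proxy method; fall back to http.
    $method = strtolower($parts['scheme'] ?? 'http');
    if (!in_array($method, ['http', 'socks5', 'none'], true)) {
        throw new InvalidArgumentException('Unsupported proxy method: ' . $method);
    }

    return [
        $parts['host'],
        $parts['port'],           // parse_url() returns the port as an int
        $method,
        $parts['user'] ?? null,
        $parts['pass'] ?? null,
    ];
}

// Made-up address and credentials, for illustration only.
print_r(proxyArgs('socks5://user:secret@10.0.0.1:1080'));
```

The returned array can be unpacked into the call, for example `->withProxy(...proxyArgs($url))`.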
Use Cookies
withCookie($fileName [, $dir])
Cookies are disabled by default.
When enabled, the cookie file is stored in the storage dir by default.
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->withCookie('cookie.txt')
    ->request();
Use WindowSize (ViewPort)
windowSize($width, $height)
Default width/height: 1920/1080 px
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->windowSize(320, 600)
    ->request();
Phantom Cli Options
Set custom PhantomJS CLI options.
List of available options: Phantom Options Doc
withPhantomOptions($arrayOptions)
Option keys must not include the -- prefix.
$options = [
    'debug' => 'true',
    'ignore-ssl-errors' => 'true',
];
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('https://google.com')
    ->withPhantomOptions($options)
    ->request();
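Options copied from a command line usually carry the `--` prefix, which the rule above forbids. A minimal sketch of cleaning them up before the call follows; `normalizePhantomOptions` is a hypothetical helper, not part of the package, and converting booleans to the strings 'true' / 'false' is an assumption based on the example above.

```php
<?php

/**
 * Hypothetical helper (not part of casper-curl): strip leading dashes from
 * option keys and turn booleans into the strings 'true' / 'false'.
 */
function normalizePhantomOptions(array $options): array
{
    $normalized = [];
    foreach ($options as $key => $value) {
        $key = ltrim((string) $key, '-'); // '--debug' becomes 'debug'
        if (is_bool($value)) {
            $value = $value ? 'true' : 'false';
        }
        $normalized[$key] = (string) $value;
    }
    return $normalized;
}

$options = normalizePhantomOptions([
    '--debug' => true,
    'ignore-ssl-errors' => 'true',
]);
// $options is now ['debug' => 'true', 'ignore-ssl-errors' => 'true']
print_r($options);
```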
CasperJS
Use CasperJS steps to handle dynamic content.
Casper Doc
Use Casper Then
casperThen($jsScript)
DOC
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
// A nowdoc keeps the single quotes inside the JS from ending the PHP string
$script = <<<'JS'
this.fill('form[action="/search"]', { q: 'casperjs' }, true);
this.wait(2000, function () {
    this.capture('step_1.png');
});
JS;
$response = $casperCURL->to('http://google.fr')
    ->casperThen($script)
    ->request();
Use Custom Casper JS
Provide a custom CasperJS script body with customCasper().
DOC
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
// A nowdoc keeps the single quotes inside the JS from ending the PHP string
$script = <<<'JS'
casper.then(function() {
    this.fill('form[action="/search"]', { q: 'casperjs' }, true);
    this.wait(2000, function () {
        this.capture('step_1.png');
    });
});
JS;
$response = $casperCURL->to('http://google.fr')
    ->customCasper($script)
    ->request();
Debug
enableDebug()
Stores the response data and a screen capture in the storage dir.
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('http://google.com')
    ->enableDebug()
    ->request();
Response
The response is an object with the fields:
- status (e.g. 200|404|500)
- content (string: html|dom|txt)
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
$response = $casperCURL->to('http://google.com')
    ->request();
$response->status;
$response->content;
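A caller will usually want to check `status` before using `content`. The sketch below shows one way to do that; `contentOrFail` is a hypothetical helper, not part of the package, and it reads only the two fields listed above. A plain `stdClass` stands in for a real response so that the example runs without casperjs installed.

```php
<?php

/**
 * Hypothetical helper (not part of casper-curl): return the content of a
 * response object, or throw when the status is outside the 2xx range.
 */
function contentOrFail(object $response): string
{
    $status = (int) $response->status;
    if ($status < 200 || $status >= 300) {
        throw new RuntimeException('Request failed with status ' . $status);
    }
    return (string) $response->content;
}

// Stand-in object with the same two fields as the documented response.
$fake = new stdClass();
$fake->status = 200;
$fake->content = '<html><body>ok</body></html>';

echo contentOrFail($fake) . PHP_EOL;
```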
Response Content
By default, the request returns the full page content.
DOC
The response can be overridden by assigning to the output variable.
$casperCURL = new \DenisKisel\CasperCURL\CasperCURL($storageDir);
// A nowdoc keeps the single quotes inside the JS from ending the PHP string
$script = <<<'JS'
this.fill('form[action="/search"]', { q: 'casperjs' }, true);
this.wait(2000, function () {
    this.capture('step_1.png');
});
output = console.log('Override default output!');
JS;
$response = $casperCURL->to('http://google.fr')
    ->casperThen($script)
    ->request();
Use In Laravel
$response = \DenisKisel\CasperCURL\LCasperCURL::to('https://google.com')->request();
License
This package is open-sourced software licensed under the MIT license
Contact
Developer: Denis Kisel
- Email: denis.kisel92@gmail.com
- Skype: live:denis.kisel92