zfr / zfr-prerender
Integration with prerender.io service
Requires
- php: >=5.4
- zendframework/zend-http: ~2.2
- zendframework/zend-servicemanager: ~2.2
Requires (Dev)
- phpunit/phpunit: ~3.7
- satooshi/php-coveralls: ~0.6
- squizlabs/php_codesniffer: 1.5.*
- zendframework/zendframework: ~2.2
README
Are you using Backbone, Angular, EmberJS, etc., but you're unsure about the SEO implications?
This Zend Framework 2 module uses Prerender.io to dynamically render your JavaScript pages on your server using PhantomJS.
Installation
Install the module by typing the following command (or add the package to your composer.json file):
$ php composer.phar require zfr/zfr-prerender:3.*
Documentation
How it works
- Check to make sure we should show a prerendered page
  - Check if the request is from a crawler (either by User-Agent string or by detecting the _escaped_fragment_ query param)
  - Check to make sure we aren't requesting a resource (js, css, etc.)
  - (optional) Check to make sure the URL is in the whitelist
  - (optional) Check to make sure the URL isn't in the blacklist
- Make a GET request to the prerender service (PhantomJS server) for the page's prerendered HTML
- Return that HTML to the crawler
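The checks above can be sketched in plain PHP as follows. This is only an illustration of the documented flow, not the module's actual code: the shouldPrerender() and matchesAny() functions, their array-based arguments, and the backtick regex delimiter are all assumptions made for this example. The configuration keys are the ones described in the Customization section.

```php
<?php
// Illustrative sketch only: shouldPrerender() and matchesAny() are
// hypothetical helpers and do not exist in ZfrPrerender.
function shouldPrerender($userAgent, $path, array $query, array $config)
{
    // 1. Crawler check: either the _escaped_fragment_ query param is present,
    //    or the User-Agent contains one of the registered crawler names.
    $isCrawler = array_key_exists('_escaped_fragment_', $query);
    foreach ($config['crawler_user_agents'] as $agent) {
        if (stripos($userAgent, $agent) !== false) {
            $isCrawler = true;
            break;
        }
    }
    if (!$isCrawler) {
        return false;
    }

    // 2. Resource check: never pre-render static assets (.js, .css, ...).
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if ($extension !== '' && in_array('.' . $extension, $config['ignored_extensions'], true)) {
        return false;
    }

    // 3. Optional whitelist: when supplied, at least one pattern must match.
    if (!empty($config['whitelist_urls']) && !matchesAny($path, $config['whitelist_urls'])) {
        return false;
    }

    // 4. Optional blacklist: any matching pattern vetoes pre-rendering.
    if (!empty($config['blacklist_urls']) && matchesAny($path, $config['blacklist_urls'])) {
        return false;
    }

    return true;
}

function matchesAny($path, array $patterns)
{
    foreach ($patterns as $pattern) {
        // Assumption: the pattern is wrapped in backtick delimiters.
        if (preg_match('`' . $pattern . '`', $path) === 1) {
            return true;
        }
    }
    return false;
}
```

Only when all of these checks pass would the request be forwarded to the prerender service and the returned HTML sent back to the crawler.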
Customization
ZfrPrerender comes with sane defaults, but you can customize the module by copying the config/zfr_prerender.global.php.dist file to your autoload folder (remove the .dist extension) and modifying it to suit your needs.
Prerender URL
By default, ZfrPrerender uses the Prerender.io service deployed at http://service.prerender.io. However, you may want to deploy it on your own server. To that end, you can point ZfrPrerender at your server using the following configuration:
    return array(
        'zfr_prerender' => array(
            'prerender_url' => 'http://myprerenderservice.com',
        ),
    );
With this config, here is how ZfrPrerender will proxy the "https://google.com" request:
    GET http://myprerenderservice.com/https://google.com
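As a rough sketch of how that proxied URL is formed, the requested URL is appended to the configured prerender_url. The buildPrerenderUrl() helper below is hypothetical and only illustrates the concatenation; stripping a trailing slash is an assumption of this example, not documented behavior.

```php
<?php
// Hypothetical helper: buildPrerenderUrl() is not part of ZfrPrerender.
function buildPrerenderUrl($prerenderUrl, $requestedUrl)
{
    // Strip a trailing slash so the result never contains "//" at the join.
    return rtrim($prerenderUrl, '/') . '/' . $requestedUrl;
}

$target = buildPrerenderUrl('http://myprerenderservice.com', 'https://google.com');
// $target is "http://myprerenderservice.com/https://google.com"
```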
Crawler user-agents
ZfrPrerender decides whether to pre-render by checking the User-Agent string to determine if a request comes from a bot. By default, the following user agents are registered: baidu, facebookexternalhit and twitterbot.
GoogleBot, Yahoo and BingBot are no longer in the list as of ZfrPrerender 2.0, because those search engines support the _escaped_fragment_ approach and we want to ensure people are not penalized for cloaking.
You can add other User-Agent strings to evaluate using this sample configuration:
    return array(
        'zfr_prerender' => array(
            'crawler_user_agents' => array('yandex', 'msnbot'),
        ),
    );
Note: ZfrPrerender also supports the detection of a crawler through the use of the _escaped_fragment_ query param. You can learn more about this on Google's website.
Ignored extensions
ZfrPrerender is configured by default to ignore all requests for resources with the following extensions: .css, .gif, .jpeg, .jpg, .js, .png, .less, .pdf, .doc, .txt, .zip, .mp3, .rar, .exe, .wmv, .avi, .ppt, .mpg, .mpeg, .tif, .wav, .mov, .psd, .ai, .xls, .mp4, .m4a, .swf, .dat, .dmg, .iso, .flv, .m4v, .torrent. Those are never pre-rendered.
You can add your own extensions using this sample configuration:
    return array(
        'zfr_prerender' => array(
            'ignored_extensions' => array('.less', '.pdf'),
        ),
    );
Whitelist
Whitelist a single URL path or multiple URL paths. Paths are compared using regex, so be specific when possible. If a whitelist is supplied, only URLs containing a whitelist path will be pre-rendered.
Here is a sample configuration that only pre-renders URLs that contain "/users/":
    return array(
        'zfr_prerender' => array(
            'whitelist_urls' => array('/users/*'),
        ),
    );
Note: remember to specify URLs here and not ZF2 route names. This is because ZfrPrerender registers a listener that runs very early in the MVC process, before routing is actually done.
Blacklist
Blacklist a single URL path or multiple URL paths. Paths are compared using regex, so be specific when possible. If a blacklist is supplied, all URLs will be pre-rendered except the ones containing a blacklist path. Please note that if the referer matches the blacklist, the request won't be pre-rendered either.
Here is a sample configuration that pre-renders all URLs except the ones that contain "/users/":
    return array(
        'zfr_prerender' => array(
            'blacklist_urls' => array('/users/*'),
        ),
    );
Note: remember to specify URLs here and not ZF2 route names. This is because ZfrPrerender registers a listener that runs very early in the MVC process, before routing is actually done.
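Because whitelist and blacklist entries are compared as regular expressions rather than shell-style globs, a pattern such as '/users/*' can match more than you might expect. The sketch below illustrates this with plain preg_match. The urlMatches() and isBlacklisted() helpers are hypothetical, and the backtick delimiter is an assumption of this example rather than something taken from the module's source.

```php
<?php
// Hypothetical helper: wraps the pattern in backtick delimiters (an
// assumption of this sketch) and tests it against a URL.
function urlMatches($pattern, $url)
{
    return preg_match('`' . $pattern . '`', $url) === 1;
}

// Hypothetical blacklist check: a request is skipped when either its URL
// or its referer matches one of the blacklist patterns.
function isBlacklisted($url, $referer, array $patterns)
{
    foreach ($patterns as $pattern) {
        if (urlMatches($pattern, $url)) {
            return true;
        }
        if ($referer !== null && urlMatches($pattern, $referer)) {
            return true;
        }
    }
    return false;
}

// In a regex, "/*" means "zero or more slashes", and the match is unanchored.
$a = urlMatches('/users/*', '/users/42');      // true
$b = urlMatches('/users/*', '/users');         // true: the trailing slash is optional
$c = urlMatches('/users/*', '/api/users');     // true: matches anywhere in the URL
$d = urlMatches('^/users/', '/api/users/42');  // false: anchored to the start
$e = urlMatches('^/users/', '/users/42');      // true
```

Anchoring a pattern with ^ is one way to follow the "be specific when possible" advice.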
Events
ZfrPrerender\Mvc\PrerenderListener triggers two events:
- ZfrPrerender\Mvc\PrerenderEvent::EVENT_PRERENDER_PRE: this event is triggered before the request to the Prerender service is made. If a listener attached to this event returns a Zend\Http\Response object, that response is returned immediately, which avoids a new request to the Prerender service.
- ZfrPrerender\Mvc\PrerenderEvent::EVENT_PRERENDER_POST: this event is triggered once the response from the Prerender service has been received. This allows you to cache it (for instance in Memcached).
Listeners attached to those two events receive an instance of ZfrPrerender\Mvc\PrerenderEvent. Here is an example that shows you how to register listeners using the shared event manager. In your Module.php class:
    use Zend\Http\Response;
    use Zend\Mvc\MvcEvent;
    use ZfrPrerender\Mvc\PrerenderEvent;

    public function onBootstrap(MvcEvent $event)
    {
        $eventManager  = $event->getTarget()->getEventManager();
        $sharedManager = $eventManager->getSharedManager();

        $sharedManager->attach(
            'ZfrPrerender\Mvc\PrerenderListener',
            PrerenderEvent::EVENT_PRERENDER_PRE,
            array($this, 'prerenderPre')
        );

        $sharedManager->attach(
            'ZfrPrerender\Mvc\PrerenderListener',
            PrerenderEvent::EVENT_PRERENDER_POST,
            array($this, 'prerenderPost')
        );
    }

    public function prerenderPre(PrerenderEvent $event)
    {
        $request = $event->getRequest();

        // Check your cache to see if you already have the content.
        // Only return a response when you do; otherwise return nothing
        // so that the Prerender service is called.
        // $content = ...

        $response = new Response();
        $response->setStatusCode(200);
        $response->setContent($content);

        return $response;
    }

    public function prerenderPost(PrerenderEvent $event)
    {
        // This is the response we get from the Prerender service
        $response = $event->getResponse();

        // You could get the body and put it in cache
        // ...
    }
Testing
If you want to make sure your pages are rendering correctly:
- Open the Developer Tools in Chrome (Cmd + Alt + J)
- Click the Settings gear in the bottom right corner.
- Click "Overrides" on the left side of the settings panel.
- Check the "User Agent" checkbox.
- Choose "Other..." from the User Agent dropdown.
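- Type the name of a registered crawler, such as twitterbot, into the input box (googlebot only triggers pre-rendering if you have added it to crawler_user_agents).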
- Refresh the page (make sure to keep the developer tools open).