Visual regression testing made easy
Automating the boring stuff.
Download and install Vagrant.
- PHP 7.0
- A (possibly large) list of URLs to crawl; robots.txt and sitemaps are detected automatically
- Download and unzip the package:
curl -L -# -C - -O "https://github.com/alex-moreno/glitcherbot/archive/main.zip"
unzip main.zip
cd glitcherbot-main
Make a copy of your config.php
cp config.sample.php config.php
Create a .csv file containing the list of URLs to iterate over (see example.csv).
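As a sketch, a minimal list can be generated from the shell. One URL per line is an assumption here; check example.csv in the repo for the authoritative column layout:

```shell
# Write a minimal list of sites to crawl, one URL per line
# (the exact column layout may differ; see example.csv)
printf '%s\n' \
  "https://www.example.com" \
  "https://www.example.org" > my-sites.csv
```

The resulting file can then be passed to the crawl commands described later in this document.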
If using Acquia Site Factory, a command is supplied to generate a list of sites from a sites.json file. You'll need to:
- Download the sites.json from your Acquia Cloud subscription
scp [subscription][ENV].[ENV]@[subscription][ENV].ssh.enterprise-g1.acquia-sites.com:/mnt/files/[subscription][ENV]/files-private/sites.json ./sites-dev.json
Run vagrant up if you want to use the crawler inside the virtual machine (recommended).
Run the crawl against that JSON file:
php bin/visual_regression_bot.php acquia:acsf-crawl-sites sites.json
You can see all available commands by running:
php bin/visual_regression_bot.php list
For help with a specific command, use:
php bin/visual_regression_bot.php help <command>
While debugging, increase verbosity by appending one or more -v flags:
- -v : verbose
- -vv : very verbose
- -vvv : debug
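For instance, the Acquia Site Factory crawl from earlier can be re-run at the "very verbose" level (the command itself is taken from above; only the -vv flag is added):

```shell
php bin/visual_regression_bot.php acquia:acsf-crawl-sites sites.json -vv
```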
There are some settings that you can configure, like the headers that you'll send to the site or the concurrency that you want to use.
Copy config.sample.php to config.php and adapt it to your needs. For example:
<?php

return [
  'headers' => [
    'User-Agent' => 'GlitcherBotScrapper/0.1',
    'Accept' => 'application/json',
  ],
  'http_errors' => false,
  'connect_timeout' => 0, // wait forever
  'read_timeout' => 0,
  'timeout' => 0, // wait forever
  'concurrency' => 60,
];
Note: the higher the concurrency, the more sites are crawled on each step. Be careful, though: PHP is fast (contrary to popular belief), and a high setting can put heavy load on a site and cause it trouble. With great power comes great responsibility.
To run the regression tool as a standalone interface, point your webserver at the html/ directory in the repo.
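If you just want to try it locally, PHP's built-in development server can serve that directory. This is a sketch, not a production setup; the port is an arbitrary choice:

```shell
php -S localhost:8080 -t html/
```

Then browse to http://localhost:8080.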
A Composer script has been included to aid with testing the tool. To run it, use the following command.
Then navigate to the following address in your browser.
A Docker setup has been included to aid with running the tool.
Download and install Docker
This command will start the containers:
make up
This command will check if the config file exists, create one if needed, and then install all Composer dependencies:
make build
This command will use sample-sites.csv as the source of URLs to crawl by default:
make crawl
To run the command with a different file, use the following syntax:
make crawl SITES_CSV=path_to_sites_csv
or, for Acquia JSON files:
make crawl-acquia SITES_JSON=sitesd8-prod.json
Keep in mind that the crawl runs within the container, so
path_to_sites_csv needs to be relative to the container.
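For example, assuming the repository root is mounted into the container (an assumption about this Docker setup), a hypothetical my-sites.csv stored next to the Makefile can be referenced with a repo-relative path:

```shell
make crawl SITES_CSV=./my-sites.csv
```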
This command opens the tool in the browser:
make open
You can run several targets at once; for example, the following command will start the containers, build, crawl, and open the browser.
make up build crawl open
This will include all sitemaps of the website, provided they are referenced from the robots.txt file:
Using makefile and Docker:
make crawl SITES_CSV=sample-sites.csv INCLUDE_SITEMAPS=yes
make crawl SITES_CSV=sample-sites.csv FORCE_SITEMAPS=yes