nicksagona/pop-spider

A simple web spider for SEO analysis.

3.0.0 2017-02-27 19:50 UTC

README

pop-spider is a simple CLI-driven web spider for SEO analysis that uses components from the Pop PHP Framework. It parses SEO-pertinent data from a website and produces an HTML-based report of what was parsed, as well as a sitemap.xml file.

RELEASE INFORMATION

pop-spider 3.0.0 Release
February 27, 2017

INSTALLATION

$ composer create-project nicksagona/pop-spider pop-spider

QUICK USE

$ cd pop-spider/script
$ ./spider crawl http://www.mydomain.com/

OVERVIEW

By default, the spider parses the following elements and their SEO-pertinent attributes:

  • title
  • meta
    • name
    • content
  • a
    • href
    • title
    • rel
    • name
    • value
  • img
    • src
    • title
    • alt
  • h1
  • h2
  • h3
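
The default set above can be pictured as a mapping from each tag name to the attributes worth recording. The following is a minimal sketch in Python (not the project's PHP, and not pop-spider's actual implementation) built on the standard-library `html.parser` module. The names `DEFAULT_TAGS` and `SeoCollector` are hypothetical, and the sketch assumes that the text content is what gets recorded for `title`, `a`, `h1`, `h2`, and `h3`.

```python
from html.parser import HTMLParser

# Hypothetical mapping that mirrors the default list above:
# tag name -> attributes to record (an empty tuple means text content only).
DEFAULT_TAGS = {
    "title": (),
    "meta": ("name", "content"),
    "a": ("href", "title", "rel", "name", "value"),
    "img": ("src", "title", "alt"),
    "h1": (),
    "h2": (),
    "h3": (),
}


class SeoCollector(HTMLParser):
    """Collect the listed tags and attributes from one HTML page."""

    def __init__(self, tags=None):
        super().__init__()
        self.tags = DEFAULT_TAGS if tags is None else tags
        self.found = []      # one dict per matched element, in document order
        self._open = None    # element whose text is currently being gathered

    def handle_starttag(self, tag, attrs):
        if tag not in self.tags:
            return
        given = dict(attrs)
        # Keep only the attributes listed for this tag.
        wanted = {k: given[k] for k in self.tags[tag] if k in given}
        entry = {"tag": tag, "attrs": wanted, "text": ""}
        self.found.append(entry)
        # Void elements such as meta and img never receive text.
        if tag not in ("meta", "img"):
            self._open = entry

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None and self._open["tag"] == tag:
            self._open["text"] = self._open["text"].strip()
            self._open = None


page = """<html><head><title>Home</title>
<meta name="description" content="A sample page"></head>
<body><h1>Welcome</h1>
<a href="/about" title="About us">About</a>
<img src="/logo.png" alt="Logo"></body></html>"""

collector = SeoCollector()
collector.feed(page)
for entry in collector.found:
    print(entry["tag"], entry["attrs"], entry["text"])
```

The sketch tracks only one open element at a time, so a matched tag nested inside another matched tag (for example a link inside a heading) would cut the outer element's text short. A real crawler would also need to fetch pages and follow links, which is omitted here.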

You can parse additional tags via the --tags= option.

$ ./spider help                              Display this help screen.
$ ./spider crawl [--dir=] [--tags=] <url>    Crawl the URL.

The optional [--dir=] parameter sets the output directory for the results report.
The optional [--tags=] parameter takes a comma-separated list of additional tags to scan for.

Example:

$ ./spider crawl http://www.mydomain.com/ --dir=seo-report --tags=b,u
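
To make the effect of a value such as `--tags=b,u` concrete, here is a minimal sketch in Python (not pop-spider's PHP code) of splitting a comma-separated value and adding the result to the default tag set. The function name `merge_tags` and the decisions to trim whitespace, lowercase the names, and skip duplicates are assumptions made for illustration; the README does not say how the tool normalizes the list.

```python
# Default tag names, taken from the overview list in this README.
DEFAULT_TAG_NAMES = ["title", "meta", "a", "img", "h1", "h2", "h3"]


def merge_tags(option_value, defaults=DEFAULT_TAG_NAMES):
    """Return the defaults plus any extra tags named in a comma-separated string."""
    merged = list(defaults)  # copy so the defaults are never modified
    for name in option_value.split(","):
        name = name.strip().lower()  # assumption: normalize case and whitespace
        if name and name not in merged:
            merged.append(name)
    return merged


print(merge_tags("b,u"))
# → ['title', 'meta', 'a', 'img', 'h1', 'h2', 'h3', 'b', 'u']
```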