nicksagona/pop-spider

A simple web spider for SEO analysis.

4.0.2 2023-09-03 04:53 UTC

Last update: 2024-10-28 17:31:20 UTC


README

pop-spider is a simple CLI-driven web spider for SEO analysis that uses components from the Pop PHP Framework. It parses SEO-pertinent data from a website and produces an HTML-based report of what was parsed, as well as a sitemap.xml file.
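To give a sense of what a sitemap.xml file contains, here is a minimal sketch in Python. It is not pop-spider's own code (pop-spider is written in PHP, and its exact output is not shown here). It assumes the standard sitemaps.org format with one `loc` entry per crawled URL, and `build_sitemap` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(["http://www.mydomain.com/"]))
```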

RELEASE INFORMATION

pop-spider 4.0.0 Release
August 12, 2023

INSTALLATION

$ composer create-project nicksagona/pop-spider pop-spider

QUICK USE

$ cd pop-spider/script
$ ./spider crawl http://www.mydomain.com/

OVERVIEW

By default, the spider parses the following elements and their SEO-pertinent attributes:

  • title
  • meta
    • name
    • content
  • a
    • href
    • title
    • rel
    • name
    • value
  • img
    • src
    • title
    • alt
  • h1
  • h2
  • h3

You can parse additional tags via the --tags= option.
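The element list above can be illustrated with a short Python sketch built on the standard library's `html.parser`. This is an assumption-laden illustration, not pop-spider's implementation: the names `SeoParser`, `parse_seo`, and `extra_tags` are hypothetical, and extra tags are assumed to be collected with their text only.

```python
from html.parser import HTMLParser

# Default elements and attributes, taken from the list above.
DEFAULT_TAGS = {
    "title": [],
    "meta": ["name", "content"],
    "a": ["href", "title", "rel", "name", "value"],
    "img": ["src", "title", "alt"],
    "h1": [],
    "h2": [],
    "h3": [],
}

# Void elements have no closing tag and no text content.
VOID_TAGS = {"meta", "img"}


class SeoParser(HTMLParser):
    """Collects the listed tags, their attributes, and their text."""

    def __init__(self, extra_tags=()):
        super().__init__()
        self.tags = dict(DEFAULT_TAGS)
        for tag in extra_tags:
            self.tags.setdefault(tag, [])
        self.results = []  # one dict per element found
        self._open = []    # elements still collecting text

    def handle_starttag(self, tag, attrs):
        if tag not in self.tags:
            return
        found = dict(attrs)
        entry = {
            "tag": tag,
            "attrs": {k: found[k] for k in self.tags[tag] if k in found},
            "text": "",
        }
        self.results.append(entry)
        if tag not in VOID_TAGS:
            self._open.append(entry)

    def handle_data(self, data):
        # Text belongs to every tracked element that is currently open.
        for entry in self._open:
            entry["text"] += data

    def handle_endtag(self, tag):
        # Close the most recently opened element with this tag name.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i]["tag"] == tag:
                del self._open[i]
                break


def parse_seo(html, extra_tags=()):
    parser = SeoParser(extra_tags)
    parser.feed(html)
    parser.close()
    return parser.results
```

A comma-separated value such as `b,u` could be passed in as `parse_seo(html, extra_tags="b,u".split(","))`.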

$ ./spider help                                                 Display this help screen.
$ ./spider crawl [--dir=] [--tags=] [--speed=] [--save] <url>   Crawl the URL.

The optional [--dir=] parameter sets the output directory for the results report.
The optional [--tags=] parameter sets additional tags to scan for, as a comma-separated list.
The optional [--speed=] parameter throttles the crawl by setting the delay, in seconds, between requests.
The optional [--save] parameter saves the site files into a directory.
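The throttling behavior described for [--speed=] can be sketched as a crawl loop that pauses between requests. This Python sketch is an illustration under assumptions, not pop-spider's code: `crawl`, `fetch`, `delay`, and `sleep` are hypothetical names, and `fetch` stands in for whatever downloads a page and returns the URLs it links to.

```python
import time
from collections import deque


def crawl(start_url, fetch, delay=0.0, sleep=time.sleep):
    """Breadth-first crawl that waits `delay` seconds between requests.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    list of URLs linked from that page.
    """
    queue = deque([start_url])
    seen = {start_url}
    order = []
    while queue:
        url = queue.popleft()
        if order and delay > 0:
            sleep(delay)  # throttle: wait before every request but the first
        order.append(url)
        for link in fetch(url):
            if link not in seen:  # never request the same URL twice
                seen.add(link)
                queue.append(link)
    return order
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting; with `delay=5`, a three-page site would incur two five-second pauses.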

Example:

$ ./spider crawl --dir=seo-report --tags=b,u --speed=5 --save http://www.mydomain.com/