webignition/website-sitemap-finder

Finds the sitemap(.xml) for a given website

1.2 2017-10-04 11:41 UTC

Overview

Finds the sitemap URLs for a given site. URLs are extracted from the site's robots.txt. If robots.txt lists none, sitemap.xml and sitemap.txt are assumed.
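The lookup described above can be sketched in a few lines of PHP. This is only an illustration of the idea, not the library's actual implementation: the function name sketchFindSitemapUrls is hypothetical, it works on robots.txt content passed in as a string rather than fetching anything, and it assumes the fallback files live directly under the root URL.

```php
<?php
// Hypothetical helper for illustration; not part of webignition/website-sitemap-finder.
function sketchFindSitemapUrls(string $robotsTxtContent, string $rootUrl): array
{
    $urls = [];

    // Walk the robots.txt content line by line (\R matches any newline style).
    foreach (preg_split('/\R/', $robotsTxtContent) as $line) {
        // Collect the URL from each "Sitemap: <url>" directive, case-insensitively.
        if (preg_match('/^sitemap\s*:\s*(\S+)/i', trim($line), $matches)) {
            $urls[] = $matches[1];
        }
    }

    // No directives found: assume the two conventional locations under the root
    // (an assumption of this sketch).
    if (empty($urls)) {
        $base = rtrim($rootUrl, '/');

        return [$base . '/sitemap.xml', $base . '/sitemap.txt'];
    }

    // Drop duplicates while keeping the original order.
    return array_values(array_unique($urls));
}

$robotsTxt = "User-agent: *\nDisallow: /private/\nSitemap: http://example.com/sitemap-index.xml\n";

print_r(sketchFindSitemapUrls($robotsTxt, 'http://example.com/'));
print_r(sketchFindSitemapUrls("User-agent: *\n", 'http://example.com/'));
```

The first call should return only the URL listed in robots.txt; the second has no Sitemap directive, so it falls back to the two assumed locations.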

Usage

The "Hello World" example

<?php
// Load Composer's autoloader (path assumes this script sits in the project root)
require __DIR__ . '/vendor/autoload.php';

use webignition\WebsiteSitemapFinder\Configuration;
use webignition\WebsiteSitemapFinder\WebsiteSitemapFinder;

// The root URL of the site whose sitemaps should be found
$configuration = new Configuration([
    Configuration::KEY_ROOT_URL => 'http://google.com/',
]);

$finder = new WebsiteSitemapFinder($configuration);
$sitemapUrls = $finder->findSitemapUrls();

// Example result for http://google.com/:
// [
//     'http://www.gstatic.com/culturalinstitute/sitemaps/www_google_com_culturalinstitute/sitemap-index.xml',
//     'http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml',
//     'https://www.google.com/sitemap.xml',
// ]
print_r($sitemapUrls);

Building

Using as a library in a project

To use this as a dependency in another project, add it to that project's composer.json and update your dependencies.

"require": {
    "webignition/website-sitemap-finder": "*"
}

Developing

This project has external dependencies managed with Composer. Get and install Composer first.

# Make a suitable project directory
mkdir ~/website-sitemap-finder && cd ~/website-sitemap-finder

# Clone repository into the current directory
git clone git@github.com:webignition/website-sitemap-finder.git .

# Retrieve/update dependencies
composer.phar install

Testing

Have a look at the project on Travis for the latest build status, or give the tests a go yourself.

cd ~/website-sitemap-finder
composer.phar test