spatie/mixed-content-scanner-cli

A tool to scan sites for mixed content

1.2.0 2018-05-22 08:26 UTC

This repo contains a tool called mixed-content-scanner that can help you find pieces of mixed content on your site. This is how you can use it:

mixed-content-scanner scan https://spatie.be

And of course our company site reports no mixed content.

[Screenshot: scan output for spatie.be]

Here's an example of a local test server that does contain some mixed content:

[Screenshot: scan output for a local test server with mixed content]

Installation

You can install the package via composer:

composer global require spatie/mixed-content-scanner-cli

How it works under the hood

When scanning a site, the tool crawls every page. In all HTML retrieved, these elements and attributes are checked:

  • audio: src
  • embed: src
  • form: action
  • link: href
  • iframe: src
  • img: src, srcset
  • object: data
  • param: value
  • script: src
  • source: src, srcset
  • video: src

If any of those attributes starts with http://, the element is regarded as mixed content.

The tool does not scan linked .css or .js files. Inline <script> or <style> are not taken into consideration.
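The check described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code (the tool itself is a PHP package). The names `CHECKED_ATTRIBUTES`, `candidate_urls` and `find_mixed_content` are made up for this example, and checking each comma-separated `srcset` candidate separately is an assumption.

```python
from html.parser import HTMLParser

# Element -> attributes to inspect, copied from the list above.
CHECKED_ATTRIBUTES = {
    "audio": ["src"],
    "embed": ["src"],
    "form": ["action"],
    "link": ["href"],
    "iframe": ["src"],
    "img": ["src", "srcset"],
    "object": ["data"],
    "param": ["value"],
    "script": ["src"],
    "source": ["src", "srcset"],
    "video": ["src"],
}


def candidate_urls(attribute, value):
    """Return the URLs held by a single attribute value."""
    if attribute == "srcset":
        # Assumption: every comma-separated srcset candidate is checked on its own.
        return [part.strip().split()[0] for part in value.split(",") if part.strip()]
    return [value.strip()]


class MixedContentFinder(HTMLParser):
    """Collects (element, attribute, url) tuples for http:// references."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        wanted = CHECKED_ATTRIBUTES.get(tag, [])
        for name, value in attrs:
            if name not in wanted or not value:
                continue
            for url in candidate_urls(name, value):
                if url.startswith("http://"):
                    self.found.append((tag, name, url))


def find_mixed_content(html):
    finder = MixedContentFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Because only start tags are inspected, the contents of inline `<script>` and `<style>` blocks are never looked at, which matches the limitation noted above.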

Usage

You can scan a site by using the scan command followed by the URL:

mixed-content-scanner scan https://example.com

Options

SSL verification

You might want to check your site for mixed content before actually launching it. At that point it is quite common for the site not to have an SSL certificate installed yet. That's why, by default, the tool does not verify SSL certificates.

If you want to turn on SSL verification, use the --verify-ssl option:

mixed-content-scanner scan https://self-signed.badssl.com/ --verify-ssl

That example will result in non-responding URLs because the host does not have a valid SSL certificate.
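To make the difference between the two modes concrete, here is a minimal Python sketch of a TLS configuration with and without certificate verification. It is not how the tool is implemented; `build_ssl_context` and its `verify_ssl` parameter are hypothetical names that merely mirror the option described above.

```python
import ssl


def build_ssl_context(verify_ssl=False):
    """Return a TLS context; verification is off unless verify_ssl is True."""
    context = ssl.create_default_context()
    if not verify_ssl:
        # Hostname checking must be disabled before verification is turned off.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

With verification on, a self-signed or otherwise invalid certificate makes the connection fail, which is why such URLs show up as non-responding.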

Filtering and ignoring urls

You can control which URLs are crawled by passing regular expressions to the --filter and --ignore options.

In this example we are only going to crawl pages starting with /en.

mixed-content-scanner scan https://spatie.be --filter="^\/en"

You can use multiple filters:

mixed-content-scanner scan https://spatie.be --filter="^\/en" --filter="^\/nl"

You can also ignore certain URLs. Here we are going to ignore all URLs that contain the word opensource.

mixed-content-scanner scan https://spatie.be --ignore="opensource"

Of course you can also combine filters and ignores:

mixed-content-scanner scan https://spatie.be --filter="^\/en" --ignore="opensource"
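One plausible reading of how filters and ignores combine is sketched below in Python. It rests on several assumptions that this README does not spell out: patterns are matched against the URL path, an ignore match always wins, and a URL needs to match only one of several filters. The function name `should_crawl` is hypothetical.

```python
import re
from urllib.parse import urlparse


def should_crawl(url, filters=(), ignores=()):
    """Decide whether a URL would be crawled under the assumed rules."""
    path = urlparse(url).path or "/"
    # Assumption: any matching ignore pattern excludes the URL.
    if any(re.search(pattern, path) for pattern in ignores):
        return False
    # Without filters, everything that is not ignored is crawled.
    if not filters:
        return True
    # Assumption: matching a single filter is enough.
    return any(re.search(pattern, path) for pattern in filters)
```

For example, `should_crawl("https://spatie.be/en/opensource", [r"^\/en"], ["opensource"])` is false under these rules, because the ignore pattern matches even though the filter does too.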

Ignoring robots

By default, the crawler respects robots data. You can tell it to ignore that data with the --ignore-robots option:

mixed-content-scanner scan https://example.com --ignore-robots
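As a rough illustration of what respecting or ignoring robots rules means, the Python sketch below checks a URL against the contents of a robots.txt file using the standard library. It covers robots.txt only, which is a narrower notion than whatever "robots data" includes in the tool, and `is_allowed` with its `ignore_robots` parameter is a hypothetical helper, not part of the tool.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(url, robots_txt, ignore_robots=False, user_agent="*"):
    """Return True if the URL may be crawled given the robots.txt text."""
    if ignore_robots:
        # Mirrors the idea of the option: robots rules are skipped entirely.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```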

Changelog

Please see CHANGELOG for more information on what has changed recently.

Testing

composer test

Contributing

Please see CONTRIBUTING for details.

Security

If you discover any security-related issues, please email freek@spatie.be instead of using the issue tracker.

Postcardware

You're free to use this package, but if it makes it to your production environment we highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using.

Our address is: Spatie, Samberstraat 69D, 2060 Antwerp, Belgium.

We publish all received postcards on our company website.

Credits

The scanner is inspired by mixed-content-scan by Bram Van Damme. Parts of his readme and code were used.

Support us

Spatie is a web design agency based in Antwerp, Belgium. You'll find an overview of all our open source projects on our website.

Does your business depend on our contributions? Reach out and support us on Patreon. All pledges will be dedicated to allocating workforce on maintenance and new awesome stuff.

License

The MIT License (MIT). Please see License File for more information.