A set of useful classes and utilities to convert html to AMP html (See https://www.ampproject.org/)
An open source PHP library and console utility to convert HTML to AMP HTML and report HTML compliance with the AMP HTML specification.
The AMP PHP Library is an open source and pure PHP Library that:
- Works with whole or partial HTML documents (or strings). Specifically, the AMP PHP Library:
- Specifically, the PHP validator supports tag specification validation, attribute specification validation, CDATA validation, CSS validation, layout validation, template validation and attribute property-value pair validation. It will report tags and attributes that are missing, illegal, mandatory according to spec but not present, unique according to spec but multiply present, having wrong parents or ancestors or children and so forth.
- Note: while the AMP PHP library (already) supports many of the features and capabilities of the canonical validator, it is not intended to achieve parity in every respect with the canonical validator. Even within the features we support (e.g. CSS validation) there may be certain validation issues that we don't flag but the canonical validator does.
- Using the feedback given by the in-house PHP validator, the AMP PHP library tries to "correct" some issues found in the HTML to make it more AMP HTML compliant. This would, for example, involve:
- Removing illegal attributes e.g.
- Removing all kinds of illegal tags e.g.
<body> tag, a tag with a disallowed ancestor, a duplicate unique tag, etc.
- Removing illegal property value pairs e.g. removing
<meta name="viewport" content="width=device-width,minimum-scale=hello">
- Adding or correcting the tags necessary for a minimally valid AMP document:
link rel=canonical tag if you let the library know the canonical path of the document
- Boilerplate CSS
- If there are mutually exclusive attributes for a tag, removing all but one of them
- Fixing issues with
amp-img tags that have problems like inconsistent units, invalid attributes, missing mandatory attributes, invalid implied or specified layouts.
- The library does a decent job of removing bad things and in a few cases makes some corrections/additions to the HTML. As the library cannot understand the true intention of the user, a lot of the validation problems in the HTML may eventually need to be fixed manually by a human.
- In general, the library will try to fix validation errors in <head> and, if it's not successful in doing so, remove those tags from <head>. In <body> the AMP PHP library is less aggressive and in most cases will not remove the tag from the document if the tag does not validate after it attempts any fixes on it.
- The library needs to be provided with well-formed HTML / HTML5. Please don't give it faulty, incorrect HTML (e.g. unclosed <div> tags). The corrections it makes relate to AMP HTML standard issues only. Use an HTML tidying library if you expect your HTML to be malformed.
- Converts some non-amp elements to their AMP equivalents automatically
- <img> tag is converted to an
- <iframe> tag is converted to an
- <audio> tag is converted to an
- <video> tag is converted to an
- Twitter embed code for tweets is converted to an
- Instagram embed code for instagrams is converted to an
- Youtube embed code for videos is converted to an
- Dailymotion embed code for videos is converted to an
- Pinterest embed code for pins is converted to an
- Soundcloud embed code for audio music is converted to an
- Vimeo embed code for videos is converted to an
- Vine embed code for videos is converted to an
- Some of these embed code conversions may not have the advanced features you may require. File an issue if you need enhancements to the functionality already provided or new embed code conversions
- Some of the embed codes have an associated <script> tag. These conversions will work even if no <script> tag was added to your HTML document. The AMP library will add the appropriate AMP component <script> tag to the <head> if it is provided a full HTML document.
- You may experiment with the command line utility amp-console on the above HTML fragments to see how the converted HTML looks.
- Provides both a console and programmatic interface with which to call the library. It works like this: the developer first provides some HTML. After processing it, the library returns:
- The AMPized HTML
- A list of validation errors in the HTML provided
- A description of fixes and embed code conversions made to the HTML
- Currently the AMP PHP Library is used by the Drupal AMP Module to report issues with user entered, arbitrary HTML (originating from Rich Text Editors) and converting the HTML to AMPized HTML (as much as possible)
- The AMP PHP Library command line validator can be used for experimentation and to do HTML to AMP HTML conversion of HTML files. While the canonical validator only validates, our library tries to make corrections too. As noted above, our validator is a subset of the canonical validator but already covers a lot of cases
- The AMP PHP Library can be used in any other PHP project to "convert" HTML to AMP HTML and report validation issues. It does not have any non-PHP dependencies and will work in PHP 5.5 and higher. It will also work in recent versions of HHVM.
The project uses a composer workflow. If you're not familiar with composer then please read up on it before trying to set this up.
Using this in Drupal requires some specific steps. Please refer to the Drupal AMP Module documentation.
For all other scenarios, continue reading.
git clone this repo, cd into it and type in $ composer install at the command prompt to get all the dependencies of the library. Now you'll be able to use the command line AMP HTML converter
amp-console (or equivalently
After doing a
$ composer install for setting up the command line console, you can run some phpunit tests
$ vendor/bin/phpunit tests
To see test coverage data, first ensure you have the xdebug extension installed in your PHP installation.
$ php -m | grep xdebug # should output xdebug
$ vendor/bin/phpunit tests --coverage-html=coverage-data
$ cd coverage-data
$ firefox index.html
To use this in your composer based PHP project, refer to composer docs here to make changes to your
Or you can simply do
$ composer require lullabot/amp:"^1.0.0" to fetch the library from here and automatically update your
Should you wish to follow the bleeding edge you can do
$ composer require lullabot/amp:"dev-master". Note that this will create a
.git folder in
vendor/lullabot/amp. If you want to avoid that, do
$ composer require lullabot/amp:"dev-master" --prefer-dist
$ cd <amp-php-library-repo-cloned-location> # Do this if you haven't already
$ composer install
$ ./amp-console amp:convert --help
$ ./amp-console amp:convert <name-of-html-document> <options>
Please note that the
--help command line option is your friend. Use that when confused!
A few example HTML files are available in the sample-html folder for you to test drive so that you can get a flavor of the AMP PHP library.
$ ./amp-console amp:convert sample-html/sample-html-fragment.html
$ ./amp-console amp:convert sample-html/several_errors.html --full-document
Note that you need to provide
--full-document if you're providing a full html document file for conversion.
Let's see the output of the first example command above. The first few lines are the AMPized HTML provided by our library. The rest of the headings are self-explanatory.
First, follow the setup steps above if you're using this in a composer based project.
Sample code to get started:
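The following is a minimal sketch of a typical call sequence, not a definitive reference. Only $amp->convertToAmpHtml() is named in this README; the class name Lullabot\AMP\AMP and the methods loadHtml() and warningsHumanText() are assumptions that should be checked against the library source. The helper ampize_fragment() is hypothetical and exists only for illustration.

```php
<?php
/**
 * Hypothetical helper: converts an HTML fragment and returns the AMPized
 * HTML together with a human-readable report.
 *
 * $amp is expected to be an instance of the library's main class
 * (assumed here to be Lullabot\AMP\AMP).
 */
function ampize_fragment($amp, $html)
{
    // Hand the UTF-8 HTML fragment to the library (assumed method name).
    $amp->loadHtml($html);

    // Convert to AMP HTML; this is the method named in this README.
    $amp_html = $amp->convertToAmpHtml();

    // Validation errors and fixes as plain text (assumed method name).
    $report = $amp->warningsHumanText();

    return array('html' => $amp_html, 'report' => $report);
}

// Usage once the library is installed through composer (assumed paths/names):
// require 'vendor/autoload.php';
// $result = ampize_fragment(new \Lullabot\AMP\AMP(), '<p style="color: red">Hi</p>');
// print $result['html'];
// print $result['report'];
```

Passing the object in, rather than constructing it inside the helper, keeps the sketch independent of the exact class name.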
- It's probably not a good idea to run the library on your HTML dynamically on every page view. You should try caching the results of $amp->convertToAmpHtml() once the library has run. If you're using the library from a CMS, then you should consider using the caching facilities provided by the CMS.
- We only support UTF-8 string input and output from the library. If you're using ASCII, then you don't need to worry, as UTF-8 is a superset of ASCII. If you're using another encoding such as Latin-1, you'll need to convert to UTF-8 strings before you use this library.
- If you have https URLs and they don't have height/width attributes, you are using PHP 5.6 or higher, and you have not listed any certificate authorities (cafile) in your php.ini file, then the library may have problems converting these to <amp-img>. This is because of http://php.net/manual/en/migration56.openssl.php . That link also has a workaround.
- If your <amp-pinterest> pins are appearing "chopped off" (after pinterest embed code conversion) try the workaround here
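The caching, encoding and certificate caveats above can be sketched in plain PHP. The function names to_utf8(), cached_convert() and has_configured_cafile() are hypothetical helpers, not part of the library, and the array cache is only a stand-in for whatever cache your CMS or framework provides.

```php
<?php
/**
 * Hypothetical helper: re-encode a string to UTF-8 before handing it to the
 * library. $from_encoding is the encoding the input is known to be in,
 * e.g. 'ISO-8859-1' for Latin-1. Uses the iconv extension.
 */
function to_utf8($text, $from_encoding)
{
    return iconv($from_encoding, 'UTF-8', $text);
}

/**
 * Hypothetical helper: run $convert (any callable that takes HTML and
 * returns AMP HTML) only when no result for this exact input is cached.
 * $cache is a plain array here; a real project would use its CMS cache.
 */
function cached_convert($html, $convert, array &$cache)
{
    $key = sha1($html);
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = call_user_func($convert, $html);
    }
    return $cache[$key];
}

/**
 * Reports whether php.ini names a CA bundle through openssl.cafile.
 * ini_get() returns false when the setting does not exist at all.
 */
function has_configured_cafile()
{
    $cafile = ini_get('openssl.cafile');
    return is_string($cafile) && $cafile !== '';
}
```

In practice the callable passed to cached_convert() would wrap the library calls that end in $amp->convertToAmpHtml(), so the conversion runs once per distinct input instead of on every page view.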
- Composer homepage for the AMP PHP Library on Packagist, the PHP package repository
- AMP Project Homepage
- AMP Project code repository on Github
- Technical Specification of AMP HTML in Protocol Buffers ASCII message format. See here for the Schema definition of the technical specification
You can ignore these links if you simply plan to use this library and not develop for it
- Google for creating the AMP Project and sponsoring development
- Lullabot for development of the module, theme, and library to work with the specifications