jolicode/jolitypo

Microtypography fixer for the web.

1.0 2015-12-13 18:50 UTC

Finally a tool for typography nerds.

JoliTypo is a tool that fixes microtypography glitches inside your HTML content.

use JoliTypo\Fixer;

$fixer = new Fixer(array('Ellipsis', 'Dash', 'SmartQuotes', 'CurlyQuote', 'Hyphen'));
$fixed_content = $fixer->fix('<p>"Tell me Mr. Anderson... what good is a phone call... if you\'re unable to speak?" -- Agent Smith, <em>Matrix</em>.</p>');
<p>&ldquo;Tell me Mr. Ander&shy;son&hellip; what good is a phone call&hellip; if you&rsquo;re unable to speak?&rdquo;&mdash;Agent Smith, <em>Matrix</em>.</p>

“Tell me Mr. Anderson… what good is a phone call… if you’re unable to speak?”—Agent Smith, Matrix.

It's designed to be:

  • language agnostic (you can fix fr_FR, fr_CA, en_US... You tell JoliTypo what to fix);
  • easy to integrate into a modern PHP project (Composer and autoload);
  • robust (uses \DOMDocument instead of parsing HTML with naive regular expressions);
  • smart enough to avoid processing JavaScript, code, CSS... (configurable protected tags list);
  • fully tested;
  • fully open and usable in any project (MIT License).

Quick usage

Tell the Fixer class which fixers you want to run on your content, then call fix():

use JoliTypo\Fixer;

$fixer = new Fixer(array("SmartQuotes", "FrenchNoBreakSpace"));
$fixer->setLocale('fr_FR');
$fixed_content = $fixer->fix('<p>Je suis "très content" de t\'avoir invité sur <a href="http://jolicode.com/">Jolicode.com</a> !</p>');

For convenience, ready-to-use lists of fixers for several locales are given in the "Fixer recommendations by locale" section below. Microtypography is not a standard or a law; what really matters is consistency, so feel free to use your own lists.

Please be advised that JoliTypo works best on HTML content; it also works on plain text, but is less smart about smart quotes. When fixing a complete HTML document, any <head>, <html> and <body> tags may be removed.

To fix non-HTML content, use the fixString() method:

use JoliTypo\Fixer;

$fixer = new Fixer(array("Trademark", "SmartQuotes"));
$fixed_content = $fixer->fixString('Here is a "protip(c)"!'); // Here is a “protip©”!

Installation

Requirements are handled by Composer (libxml and mbstring are required).

composer require jolicode/jolitypo

Usage without Composer is also possible: add the src/ directory to any PSR-0 compatible autoloader.

Integrations

Available Fixers

Dash

Replaces a single - between numbers (date ranges...) with an en dash (–) and a double -- with an em dash (—).
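As a rough illustration of the two rules, here is a minimal sketch in plain PHP (7.0 or later). The function `sketch_fix_dashes` is a hypothetical helper, not JoliTypo's implementation, and swallowing the spaces around `--` is an assumption taken from the Agent Smith example at the top of this README.

```php
<?php
// Hypothetical helper, not part of JoliTypo: applies the two dash rules
// described above to a plain UTF-8 string.
function sketch_fix_dashes(string $text): string
{
    // A double hyphen, together with any surrounding whitespace, becomes an em dash.
    $text = preg_replace('/\s*--\s*/u', "\u{2014}", $text);

    // A single hyphen directly between two digits becomes an en dash.
    return preg_replace('/(?<=\d)-(?=\d)/u', "\u{2013}", $text);
}

echo sketch_fix_dashes('Open 9-17 -- closed on Sundays'), PHP_EOL;
```

Hyphens inside ordinary words such as "well-known" are left alone, because the second pattern only matches between digits.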

Dimension

Replaces the letter x between numbers (12 x 123) with a times entity (×, the real math symbol).

Ellipsis

Replaces three dots ... with an ellipsis (…).

SmartQuotes

Converts dumb quotes " " to all kinds of smart quotation marks (“ ”, « », „ “...). Handles a good variety of locales, such as English, Arabic, French, Italian, Spanish, Irish, German...

See the code for more details, and do not forget to specify a locale on the Fixer instance.

This fixer replaces the legacy EnglishQuotes, FrenchQuotes and GermanQuotes fixers.

FrenchNoBreakSpace

Replaces some regular spaces with non-breaking spaces, following the French typographic code. A no-break space is placed before :, and a thin no-break space before ;, ! and ?.

NoSpaceBeforeComma

Removes any space before , and makes sure there is only one space after it.
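The rule can be sketched in two regular-expression passes. `sketch_fix_commas` is a hypothetical helper written for this README, not the library's code. The README does not say whether the real fixer inserts a space that is missing entirely, so this sketch deliberately does not, which also keeps numbers like 1,000 intact.

```php
<?php
// Hypothetical helper, not JoliTypo's code.
function sketch_fix_commas(string $text): string
{
    // 1. Remove horizontal whitespace sitting right before a comma.
    $text = preg_replace('/\h+,/u', ',', $text);

    // 2. Collapse any run of horizontal whitespace after a comma into one space.
    return preg_replace('/,\h+/u', ', ', $text);
}

echo sketch_fix_commas('Red , green ,  blue'), PHP_EOL;
```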

Hyphen (automatic hyphenation)

Uses org_heigl/hyphenator, a tool enabling word hyphenation in PHP. This hyphenator uses the pattern files from OpenOffice, which are based on the pattern files created for TeX.

Only some locales are available for this fixer: af_ZA, ca, da_DK, de_AT, de_CH, de_DE, en_GB, en_UK, et_EE, fr, hr_HR, hu_HU, it_IT, lt_LT, nb_NO, nn_NO, nl_NL, pl_PL, pt_BR, ro_RO, ru_RU, sk_SK, sl_SI, sr, zu_ZA.

You can read more about this fixer on the official GitHub repository.

This fixer requires a locale to be set on the Fixer with $fixer->setLocale('fr_FR');. It defaults to en_GB.

Proper hyphenation is mandatory in justified text. To avoid word breaking in titles, use this line of CSS: hyphens: none;.

CurlyQuote (Smart Quote)

Replaces straight quotes ' with curly ones (’). There is one exception to consider: foot and inch marks (minute and second marks). Purists use the prime (′); this fixer uses the straight quote for compatibility. Read more about curly quotes.

Trademark

Handles the trademark symbol (™), the registered trademark symbol (®), and the copyright symbol (©). This fixer replaces commonly used approximations: (r), (c) and (TM). A non-breaking space is also put between numbers and the copyright symbol.

Numeric

Adds a non-breaking space between a number and its unit, like this (with _ standing for the non-breaking space): 12_h, 42_฿ or 88_%.

It is easy to write your own fixers; feel free to extend the provided ones if they do not fit your typographic rules.
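To make the rule concrete, here is a minimal sketch in plain PHP (7.0 or later). `sketch_fix_numeric` and its default unit list are hypothetical and chosen only for illustration; the set of units the real fixer recognizes is not listed in this README.

```php
<?php
// Hypothetical helper, not JoliTypo's code: glue a number to the unit that follows it.
// The default unit list is a made-up sample.
function sketch_fix_numeric(string $text, array $units = ['h', 'km', 'kg', '%', '฿', '€']): string
{
    $nbsp = "\u{00A0}"; // NO-BREAK SPACE

    // Escape each unit so characters like % are safe inside the pattern.
    $alternatives = implode('|', array_map(function ($unit) {
        return preg_quote($unit, '/');
    }, $units));

    // A digit, one regular space, then a known unit that is not the start of a longer word.
    $pattern = '/(\d) (?=(?:' . $alternatives . ')(?![\p{L}\p{N}]))/u';

    return preg_replace($pattern, '$1' . $nbsp, $text);
}

echo sketch_fix_numeric('12 h, 88 % and 3 hats'), PHP_EOL;
```

The negative lookahead keeps "3 hats" untouched even though "h" is in the unit list.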

Fixer recommendations by locale

en_GB

$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Numeric', 'Dash', 'SmartQuotes', 'NoSpaceBeforeComma', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('en_GB');

fr_FR

These rules apply most of the recommendations of "Abrégé du code typographique à l'usage de la presse", ISBN: 9782351130667.

$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Numeric', 'Dash', 'SmartQuotes', 'FrenchNoBreakSpace', 'NoSpaceBeforeComma', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('fr_FR');

fr_CA

Mostly the same as fr_FR, but the space before punctuation marks is not mandatory.

$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Numeric', 'Dash', 'SmartQuotes', 'NoSpaceBeforeComma', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('fr_CA');

de_DE

Mostly the same as en_GB, according to Typefacts and Wikipedia.

$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Numeric', 'Dash', 'SmartQuotes', 'NoSpaceBeforeComma', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('de_DE');

More to come (contributions welcome!).

Documentation

Default usage

$fixer          = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'SmartQuotes', 'CurlyQuote', 'Hyphen'));
$fixed_content  = $fixer->fix("<p>Some user contributed HTML which does not use proper glyphs.</p>");

$fixer->setRules(array('CurlyQuote'));
$fixed_content = $fixer->fix("<p>I'm only replacing single quotes.</p>");

$fixer->setRules(array('Hyphen'));
$fixer->setLocale('en_GB'); // Set the locale used by Hyphen and SmartQuotes
$fixed_content = $fixer->fix("<p>Very long words like Antidisestablishmentarianism.</p>");

Define your own Fixer

To add your own fixer to the list, implement JoliTypo\FixerInterface, then give JoliTypo its fully qualified class name, or even an instance:

// by FQN
$fixer          = new Fixer(array('Ellipsis', 'Acme\\YourOwn\\TypoFixer'));
$fixed_content  = $fixer->fix("<p>Content fixed by the 2 fixers.</p>");

// or instances, or both
$fixer          = new Fixer(array('Ellipsis', 'Acme\\YourOwn\\TypoFixer', new \Acme\YourOwn\PonyFixer("Some parameter")));
$fixed_content  = $fixer->fix("<p>Content fixed by the 3 fixers.</p>");
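What such a fixer class might look like is sketched below. To keep the snippet runnable without Composer, it declares a local stand-in interface, `SketchFixerInterface`, with a single fix() method. That shape is an assumption: the real JoliTypo\FixerInterface may declare additional parameters, so check the library source and match its signature. `SketchArrowFixer` is likewise a made-up example.

```php
<?php
// Local stand-in so the snippet runs on its own. With JoliTypo installed you would
// implement \JoliTypo\FixerInterface instead and match its fix() signature.
interface SketchFixerInterface
{
    public function fix($content);
}

// Hypothetical fixer: turns the ASCII arrow "->" into a real arrow character.
class SketchArrowFixer implements SketchFixerInterface
{
    private $arrow;

    // Constructor arguments are why passing an instance (rather than a class name) is useful.
    public function __construct($arrow = "\u{2192}")
    {
        $this->arrow = $arrow;
    }

    public function fix($content)
    {
        // Fixers receive plain text, so there is no markup to worry about here.
        return str_replace('->', $this->arrow, $content);
    }
}

$arrowFixer = new SketchArrowFixer();
echo $arrowFixer->fix('Settings -> Privacy'), PHP_EOL;
```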

Configure the protected tags

Protected tags are a list of HTML tag names that the DOM parser must skip. Nothing inside those tags will be fixed.

$fixer          = new Fixer(array('Ellipsis'));
$fixer->setProtectedTags(array('pre', 'a'));
$fixed_content  = $fixer->fix("<p>Fixed...</p> <pre>Not fixed...</pre> <p>Fixed... <a>Not Fixed...</a>.</p>");
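The idea behind protected tags can be sketched with PHP's DOM extension: walk the text nodes and skip any that have a protected ancestor. This is only an illustration of the technique, not JoliTypo's actual traversal. `sketch_fix_html`, the wrapper div and its id are all hypothetical, and the meta tag is just one common way to hint UTF-8 to libxml.

```php
<?php
// Hypothetical helper, not JoliTypo's code: apply $rule to every text node of an
// HTML fragment, except those nested inside one of the $protectedTags.
function sketch_fix_html(string $html, callable $rule, array $protectedTags): string
{
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom = new DOMDocument();
    $dom->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="sketch-root">' . $html . '</div>'
    );

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//div[@id="sketch-root"]//text()') as $node) {
        // Climb the ancestors; bail out if any of them is protected.
        for ($parent = $node->parentNode; $parent !== null; $parent = $parent->parentNode) {
            if (in_array($parent->nodeName, $protectedTags, true)) {
                continue 2; // inside a protected tag: leave this text node alone
            }
        }
        $node->nodeValue = $rule($node->nodeValue);
    }

    // Serialize only the children of the wrapper, dropping html/head/body.
    $out = '';
    foreach ($xpath->query('//div[@id="sketch-root"]')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

$ellipsis = function ($text) {
    return str_replace('...', "\u{2026}", $text);
};
echo sketch_fix_html('<p>Fixed...</p><pre>Not fixed...</pre>', $ellipsis, ['pre', 'a']), PHP_EOL;
```

Working on text nodes rather than on the raw markup is what lets a rule ignore tags and attributes entirely.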

Add your own Fixer / Contribute a Fixer

  • Write tests;
  • A Fixer is run on a piece of text, no HTML to deal with;
  • Implement JoliTypo\FixerInterface;
  • Pull request;
  • PROFIT!!!

Contribution guidelines

  • You MUST write code in English;
  • you MUST follow PSR2 and the Symfony coding standards (run ./vendor/bin/php-cs-fixer -vvv fix on your branch);
  • you MUST run the tests (run phpunit);
  • you MUST comply with the MIT license;
  • you SHOULD write documentation.

If you add a new Fixer, please provide sources and references about the typographic rule you want to fix.

Compatibility & OS support restrictions

  • Windows XP: the thin no-break space can't be used; all other special spaces are ignored, but they do not look bad (rendered as normal spaces).
  • Mac OS Snow Leopard: no fixed spaces, half-fixed spaces, em spaces or en spaces, but they do not look bad (rendered as normal spaces).

However, if you use a font (via @font-face, for example) that contains all those glyphs, there will be no issues.

There is a known issue preventing JoliTypo from working correctly with APC versions older than 3.1.11.

What can you do to help?

We need to be able to use this tool everywhere; you can help by providing:

  • WordPress plugin (to replace or complement wptexturize)
  • Dotclear plugin ...

Also, there is a Todo list.

License

This piece of code is under the MIT License. See the LICENSE file.

Alternatives and other implementations

There are already quite a few tools like this one (including good ones). Sadly, some support only one language, some run regular expressions on the whole HTML code (which is bad), some are not tested, some are bundled inside a CMS or a library, some do not use proper autoloading, some do not have an open bug tracker... Have a look for yourself:

Glossary & References

Thanks to these online resources for helping a developer understand typography: