infureal/php-ass

A library for reading Advanced Sub Station Alpha subtitle files

v1.1 2020-10-15 22:52 UTC

Last update: 2024-04-16 06:07:51 UTC


README

A library for reading Advanced Sub Station Alpha (ASS) subtitle files.

Specification

The ASS file specification is spread across several sources:

  1. Wikipedia has a good overview
  2. The original format specification, as a Microsoft Word .doc file
  3. How the files are incorporated into Matroska (MKV) files

In short: ASS files are an advanced version of the original SSA (Sub Station Alpha) subtitle files and include several improvements in terms of styling and effects.

A valid script file starts with [Script Info] and is divided into several INI-style sections.
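To make the INI-style layout concrete, here is a minimal sketch that splits a small script into its sections using plain PHP. It does not use php-ass and is not how the library parses files; splitSections() is a hypothetical helper, and the sample script is trimmed for brevity (a real [V4+ Styles] Format line has many more fields).

```php
<?php

/**
 * Illustrative only: split ASS text into INI-style sections.
 * splitSections() is a hypothetical helper, not part of php-ass.
 *
 * @return array<string, string[]> section name => raw lines
 */
function splitSections(string $content): array
{
    $sections = [];
    $current = null;

    foreach (preg_split('/\R/', $content) as $line) {
        $line = trim($line);

        // Skip blank lines and ";" comment lines.
        if ($line === '' || $line[0] === ';') {
            continue;
        }

        // A "[Name]" line opens a new section.
        if (preg_match('/^\[(.+)\]$/', $line, $matches)) {
            $current = $matches[1];
            $sections[$current] = [];
            continue;
        }

        // Lines before the first section header are ignored.
        if ($current !== null) {
            $sections[$current][] = $line;
        }
    }

    return $sections;
}

$sample = <<<'ASS'
[Script Info]
; a comment line
Title: Example
ScriptType: v4.00+

[V4+ Styles]
Format: Name, Fontname, Fontsize

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:02.00,Default,,0,0,0,,Hello
ASS;

$sections = splitSections($sample);
print_r(array_keys($sections));
```

Each section is kept as a list of raw "Key: value" lines here; the library goes further and turns them into block and line objects, as described under Parts.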

Quick start

Install using composer:

composer require chaostangent/php-ass

Then start using:

require __DIR__.'/vendor/autoload.php';

use ChaosTangent\ASS\Reader;

$reader = new Reader();
$script = $reader->fromFile(__DIR__.'/examples/example.ass');

foreach ($script as $block) {
    echo $block->getId().PHP_EOL;

    foreach ($block as $line) {
        echo $line->getKey().': '.$line->getValue().PHP_EOL;
    }
}

Parts

Script

The ChaosTangent\ASS\Script class represents the root object of an ASS script.

You instantiate a Script with the content of a script as well as an optional filename:

use ChaosTangent\ASS\Script;
$script = new Script('[Script Info]', 'mytestscript.ass');

Once instantiated you can check whether what's been passed looks like a valid ASS script:

if ($script->isASSScript()) {
    // do more processing
}

This only checks the first few bytes for the "[Script Info]" string; it doesn't guarantee that the script is valid or readable.

To parse the passed script:

$script->parse();

This will go through the content passed when creating the script and parse it into blocks and lines.

To get the current collection of blocks, you can call getBlocks() or treat the script as an iterator:

foreach ($script as $block) {
    // block processing
}

To check if a script has a block:

if ($script->hasBlock('Script Info')) {
    $scriptInfoBlock = $script->getBlock('Script Info');
}

Block

Every ASS script is made up of a few different blocks. The ChaosTangent\ASS\Block\Block abstract class represents one of these blocks.

At the moment php-ass supports the following blocks:

Any other kind of block (e.g. "Aegisub Project Garbage", "Fonts") is silently ignored when parsing.

ScriptInfo blocks provide functions for common fields:

$scriptInfoBlock->getTitle();
$scriptInfoBlock->getWrapStyle();
$scriptInfoBlock->getScriptType();

Otherwise you can just treat blocks as containers for lines. You can use array access to get a specific line:

$scriptInfoBlock[123]; // line 124 of this block

Or treat the block as an iterator:

foreach ($scriptInfoBlock as $line) {
    // line processing
}

Line

Lines are the core of a script file. Any line that isn't a comment (not to be confused with a comment event line) uses the base class ChaosTangent\ASS\Line\Line.

Lines in some blocks are mapped according to a special "Format" line. These are represented by ChaosTangent\ASS\Line\Format. Format lines have a special getMapping() method that returns an array that can be used to parse other lines.
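As a rough illustration of what such a mapping is for, the sketch below turns a Format value into a list of field names and uses it to split a Dialogue value. It is plain PHP and does not call the library; formatMapping() and mapLine() are hypothetical helpers, and the real getMapping() may return a differently shaped array.

```php
<?php

/**
 * Illustrative only: turn a "Format" value into a list of field names.
 * formatMapping() and mapLine() are hypothetical helpers, not php-ass methods.
 */
function formatMapping(string $formatValue): array
{
    return array_map('trim', explode(',', $formatValue));
}

/**
 * Split a line's value into as many parts as the mapping has fields.
 * The limit keeps commas inside the final field (usually Text) intact.
 */
function mapLine(array $mapping, string $value): array
{
    $parts = explode(',', $value, count($mapping));

    // Pad short lines so array_combine() always gets equal-length arrays.
    $parts = array_pad($parts, count($mapping), '');

    return array_combine($mapping, $parts);
}

$mapping = formatMapping('Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text');
$fields = mapLine($mapping, '0,0:00:00.98,0:00:05.43,ED_English,,0,0,0,,{\fad(100,200)\blur5}My destiny,');

echo $fields['Style'].PHP_EOL; // ED_English
echo $fields['Text'].PHP_EOL;
```

Capping the split at the number of fields matters because the last field is free text and may itself contain commas.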

If all of this sounds a bit complicated, don't worry: when parsing files it's all taken care of for you. All it means is that for Dialogue and Style lines, you can use methods to get the different parts:

$styleLine->getName();
$styleLine->getPrimaryColour();
$dialogueLine->getLayer();
$dialogueLine->getText();

Dialogue lines also have an extra method for getting the text of a line without any style override codes in it:

$dialogueLine->getTextWithoutStyleOverrides();
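
For a sense of what removing override codes involves, here is a minimal sketch that drops every "{...}" block from a piece of dialogue text. stripOverrides() is a hypothetical helper written for this example; the library's own getTextWithoutStyleOverrides() may handle more cases or behave differently.

```php
<?php

/**
 * Illustrative only: remove "{...}" override blocks from dialogue text.
 * stripOverrides() is a hypothetical helper, not a php-ass method.
 */
function stripOverrides(string $text): string
{
    // Override codes sit between "{" and the next "}".
    // Fall back to the input if the regex engine reports an error.
    return preg_replace('/\{[^}]*\}/', '', $text) ?? $text;
}

echo stripOverrides('{\fad(100,200)\blur5\c&H000010&\3c&H80A0C0&}My destiny,').PHP_EOL;
// prints "My destiny,"
```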

For all lines you can use the generic getKey() and getValue() methods which will return the key of the line (e.g. "Dialogue", "Format", "Style") and its unparsed value:

$dialogueLine->getKey(); // == 'Dialogue'
$dialogueLine->getValue(); // e.g. 0,0:00:00.98,0:00:05.43,ED_English,,0,0,0,,{\fad(100,200)\blur5\c&H000010&\3c&H80A0C0&}My destiny,

If you only want lines of a specific type, just do an instanceof check when iterating through:

foreach ($block as $line) {
    if ($line instanceof Dialogue) {
        echo $line->getTextWithoutStyleOverrides().PHP_EOL;
    }
}

Tests

This library has a growing test suite that you can run with PHPUnit. Any esoteric or known broken scripts would be a welcome addition.

TODO

  • Allow reading of embedded information (images, fonts etc.)
  • Allow construction and writing of ASS files
  • More line type support
  • More block type support
  • Test completion