xp-forge/parse

This package is abandoned and no longer maintained. No replacement package was suggested.

Parses code based on rules

v4.0.0 2020-10-04 14:57 UTC

Last update: 2021-11-04 19:59:33 UTC


XP Framework module. BSD licence. Requires PHP 7.0+.

Example

The following example parses key/value pairs using the tokenizer built on top of PHP's tokenizer extension.

use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\{
  Repeated,
  Sequence,
  Token,
  Apply,
  Matches,
  Collect
};

$syntax= new class() extends Syntax {
  public function rules() { return new Rules([
    new Repeated(
      new Sequence([new Token(T_STRING), new Token(':'), new Apply('val')], function($values) {
        return [$values[0] => $values[2]];
      }),
      new Token(','),
      Collect::$AS_MAP
    ),
    'val' => new Matches([
      T_CONSTANT_ENCAPSED_STRING => function($values) { return substr($values[0], 1, -1); },
      T_STRING                   => function($values) { return constant($values[0]); },
      T_DNUMBER                  => function($values) { return (double)$values[0]; },
      T_LNUMBER                  => function($values) { return (int)$values[0]; }
    ])
  ]); }
};

$tokens= new Tokenized('a: 1, b: 2.0, c: true, d: "D"');
$pairs= $syntax->parse($tokens);  // ["a" => 1, "b" => 2.0, "c" => true, "d" => "D"]

Rules

The following rules are available for matching:

Token

The rule Token(T) matches a single token T.

Tokens

The rule Tokens(T1[, T2[, ...]]) matches any combination of the given tokens. For example, new Tokens(T_STRING, '.') can be used to match dotted type notation as used in XP's type names.

Be aware that this also matches three dots, three strings, or a string followed by a dot, and therefore does not guarantee syntactic correctness. It is, however, a high-performance alternative to more complex rules.
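As a rough sketch of the dotted type name case, the snippet below uses Tokens as the only rule of a syntax. It assumes the package was installed via Composer and can be loaded through vendor/autoload.php (an assumption; adjust to your setup), and the input string is made up for illustration. The shape of the value Tokens returns is not described here, so the result is only dumped.

```php
<?php
// Assumption: xp-forge/parse is installed via Composer in ./vendor
require __DIR__.'/vendor/autoload.php';

use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\Tokens;

$syntax= new class() extends Syntax {
  public function rules() { return new Rules([
    // Consumes any run of T_STRING and '.' tokens, e.g. "util.Date"
    new Tokens(T_STRING, '.')
  ]); }
};

// Hypothetical input: a dotted type name
$type= $syntax->parse(new Tokenized('util.Date'));
var_dump($type);
```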

Apply

The rule Apply(RuleName) will defer handling to a given named rule passed to the Rules constructor.

Matches

The rule Matches([T1 => Rule1[, T2 => Rule2[, ...]]]) selects a rule based on the initial token, using the tokens as keys of a lookup map. It is high-performance due to isset()-based lookups, though less flexible than OneOf.

OneOf

The rule OneOf([Rule1[, Rule2[, ...]]]) matches rules in the order specified and returns the values of the first rule to match.
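To contrast OneOf with Matches, the following sketch expresses a choice between integer and floating point literals as ordered alternatives. It only combines rules described in this README; the Composer autoload path and the input strings are assumptions made for illustration.

```php
<?php
// Assumption: xp-forge/parse is installed via Composer in ./vendor
require __DIR__.'/vendor/autoload.php';

use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\{OneOf, Sequence, Token};

$syntax= new class() extends Syntax {
  public function rules() { return new Rules([
    new OneOf([
      // Tried first: integer literals
      new Sequence([new Token(T_LNUMBER)], function($values) { return (int)$values[0]; }),
      // Tried second: floating point literals
      new Sequence([new Token(T_DNUMBER)], function($values) { return (float)$values[0]; })
    ])
  ]); }
};

$int= $syntax->parse(new Tokenized('42'));
$float= $syntax->parse(new Tokenized('4.2'));
var_dump($int, $float);
```

With Matches, the same choice would be keyed by T_LNUMBER and T_DNUMBER, as in the example at the top. OneOf is likely the better fit when alternatives cannot be told apart by their first token alone.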

Sequence

The rule Sequence([Rule1[, Rule2[, ...]]], function) matches a sequence of rules in the order specified, and passes the matched values to the handler function.

Optional

The rule Optional(Rule, default= NULL) matches the rule and returns its value, or the default if the rule does not match.
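A minimal sketch of Optional inside a Sequence: an integer with an optional leading minus sign, where '+' is supplied as the default when no sign is present. The Composer autoload path, the input strings and the array keys used in the handler are assumptions made for illustration, not part of the library.

```php
<?php
// Assumption: xp-forge/parse is installed via Composer in ./vendor
require __DIR__.'/vendor/autoload.php';

use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\{Optional, Sequence, Token};

$syntax= new class() extends Syntax {
  public function rules() { return new Rules([
    new Sequence(
      // The sign may be absent; Optional then yields the default '+'
      [new Optional(new Token('-'), '+'), new Token(T_LNUMBER)],
      function($values) { return ['sign' => $values[0], 'digits' => $values[1]]; }
    )
  ]); }
};

$negative= $syntax->parse(new Tokenized('-5'));
$positive= $syntax->parse(new Tokenized('5'));
var_dump($negative, $positive);
```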

Repeated

The rule Repeated(Rule, Delim= NULL, collect= IN_ARRAY) matches the rule (and optionally, a given delimiter rule) as many times as possible. It uses a collector function from the text.parse.rules.Collect enum to process the results.

A typical use is processing argument lists: new Repeated(new Apply('val'), new Token(',')) parses the arguments to a function. Dangling delimiters (for example, a trailing comma) are allowed.
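The argument list case could be put together roughly as follows, reusing a 'val' rule like the one from the example at the top. The third constructor argument is omitted, so the default collector is used. The Composer autoload path and the input string are assumptions made for illustration.

```php
<?php
// Assumption: xp-forge/parse is installed via Composer in ./vendor
require __DIR__.'/vendor/autoload.php';

use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\{Apply, Matches, Repeated, Token};

$syntax= new class() extends Syntax {
  public function rules() { return new Rules([
    // Values separated by commas, gathered by the default collector
    new Repeated(new Apply('val'), new Token(',')),
    'val' => new Matches([
      T_CONSTANT_ENCAPSED_STRING => function($values) { return substr($values[0], 1, -1); },
      T_DNUMBER                  => function($values) { return (float)$values[0]; },
      T_LNUMBER                  => function($values) { return (int)$values[0]; }
    ])
  ]); }
};

// Hypothetical argument list
$args= $syntax->parse(new Tokenized('1, 2.5, "three"'));
var_dump($args);
```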