xp-forge / parse
Parses code based on rules
Requires
- php: >=7.0.0
- xp-framework/core: ^10.0 | ^9.0 | ^8.0 | ^7.0 | ^6.5
Requires (Dev)
- xp-framework/unittest: ^11.0 | ^10.0 | ^9.0 | ^8.0 | ^7.0 | ^6.5
README
Example
The following example parses key/value pairs using the tokenizer built on top of PHP's tokenizer extension.
use text\parse\{Rules, Syntax, Tokenized};
use text\parse\rules\{Repeated, Sequence, Token, Apply, Matches, Collect};

$syntax= new class() extends Syntax {
  public function rules() {
    return new Rules([
      new Repeated(
        new Sequence([new Token(T_STRING), new Token(':'), new Apply('val')], function($values) {
          return [$values[0] => $values[2]];
        }),
        new Token(','),
        Collect::$AS_MAP
      ),
      'val' => new Matches([
        T_CONSTANT_ENCAPSED_STRING => function($values) { return substr($values[0], 1, -1); },
        T_STRING                   => function($values) { return constant($values[0]); },
        T_DNUMBER                  => function($values) { return (double)$values[0]; },
        T_LNUMBER                  => function($values) { return (int)$values[0]; }
      ])
    ]);
  }
};

$tokens= new Tokenized('a: 1, b: 2.0, c: true, d: "D"');
$pairs= $syntax->parse($tokens); // ["a" => 1, "b" => 2.0, "c" => true, "d" => "D"]
Rules
The following rules are available for matching:
Token
The rule Token(T) matches a single token T.
Tokens
The rule Tokens(T1[, T2[, ...]]) matches any combination of the given tokens. For example, new Tokens(T_STRING, '.') can be used to match dotted type notation as used in XP's type names.
Be aware that this also matches three dots, three strings, or a string followed by a dot, and therefore does not guarantee syntactic correctness. It is, however, a high-performance alternative to more complex rules.
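The caveat above can be illustrated without the library. The following is a minimal sketch, not the library's implementation: consumeAny() is a hypothetical helper, tokens are modelled as plain [type, text] pairs, and returning the concatenated text is an assumption of this sketch rather than documented behaviour of Tokens.

```php
<?php
// Hypothetical helper, not part of xp-forge/parse: consumes the longest
// leading run of tokens whose type is in $allowed, in any order and any
// number, and returns their concatenated text.
function consumeAny(array $tokens, array $allowed): string {
  $lookup= array_flip($allowed);
  $consumed= '';
  foreach ($tokens as $token) {
    list($type, $text)= $token;
    if (!isset($lookup[$type])) break;   // first token outside the set ends the run
    $consumed.= $text;
  }
  return $consumed;
}

// Tokens modelled as [type, text] pairs; the type names are illustrative
$typeName= [['string', 'lang'], ['.', '.'], ['string', 'Type'], ['(', '(']];
$dotsOnly= [['.', '.'], ['.', '.'], ['.', '.']];

echo consumeAny($typeName, ['string', '.']), "\n";   // lang.Type
echo consumeAny($dotsOnly, ['string', '.']), "\n";   // ...
```

Both inputs are accepted, which is why a rule of this kind is fast but cannot by itself reject malformed type names.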
Apply
The rule Apply(RuleName) defers handling to the named rule passed to the Rules constructor.
Matches
The rule Matches([T1 => Rule1[, T2 => Rule2[, ...]]]) matches rules based on the initial tokens used in the lookup map. High-performance due to isset()-based lookups, though less flexible than OneOf.
OneOf
The rule OneOf([Rule1[, Rule2[, ...]]]) matches rules in the order specified and returns the values of the first rule to match.
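The trade-off between Matches and OneOf comes down to how an alternative is selected. The sketch below contrasts the two strategies in plain PHP. It assumes nothing about the library's internals: dispatchByLookup() and dispatchInOrder() are hypothetical names, "rules" are modelled as closures, and tokens as [type, text] pairs.

```php
<?php
// Matches-style selection: a single isset() lookup keyed by the token type.
// Only the initial token decides which handler runs.
function dispatchByLookup(array $map, array $token) {
  $type= $token[0];
  return isset($map[$type]) ? $map[$type]($token[1]) : null;
}

// OneOf-style selection: try each alternative in order; the first one that
// returns a non-null result wins. Each alternative may inspect anything.
function dispatchInOrder(array $alternatives, array $token) {
  foreach ($alternatives as $alternative) {
    $result= $alternative($token);
    if (null !== $result) return $result;
  }
  return null;
}

$map= [
  'int'    => function($text) { return (int)$text; },
  'string' => function($text) { return substr($text, 1, -1); },
];
$alternatives= [
  function($token) { return 'int' === $token[0] ? (int)$token[1] : null; },
  function($token) { return 'string' === $token[0] ? substr($token[1], 1, -1) : null; },
];

var_dump(dispatchByLookup($map, ['int', '42']));              // int(42)
var_dump(dispatchInOrder($alternatives, ['string', '"D"']));  // string(1) "D"
```

The lookup costs the same regardless of how many entries the map has, while the ordered variant does work proportional to the number of alternatives tried before one matches.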
Sequence
The rule Sequence([Rule1[, Rule2[, ...]]], function) matches a sequence of rules in the order specified, and passes the matched values to the handler function.
Optional
The rule Optional(Rule, default= NULL) matches the rule and returns its value, or the default if the rule does not match.
Repeated
The rule Repeated(Rule, Delim= NULL, collect= IN_ARRAY) matches the rule (and optionally, a given delimiter rule) as many times as possible. It uses a collector function from the text.parse.rules.Collect enum to process the results.
An example is processing argument lists: new Repeated(new Apply('val'), new Token(',')) will parse the arguments to a function. Dangling delimiters are allowed.
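To make "dangling delimiters are allowed" concrete, here is a minimal sketch of delimiter-separated repetition in plain PHP. It is an illustration only, not the library's implementation: repeated() is a hypothetical helper operating on a flat list of token strings, and collecting into a plain array stands in for the default collector.

```php
<?php
// Hypothetical helper: collects values separated by $delim and tolerates
// one trailing (dangling) delimiter after the last value.
function repeated(array $tokens, string $delim): array {
  $values= [];
  $i= 0;
  $n= sizeof($tokens);
  while ($i < $n && $tokens[$i] !== $delim) {
    $values[]= $tokens[$i++];            // the repeated rule matched
    if ($i < $n && $tokens[$i] === $delim) {
      $i++;                              // consume the delimiter, try again
    } else {
      break;                             // no delimiter: repetition ends
    }
  }
  return $values;
}

var_dump(repeated(['1', ',', '2'], ','));       // two values: "1" and "2"
var_dump(repeated(['1', ',', '2', ','], ','));  // same result, trailing "," accepted
```

Both calls yield the same two values, so callers cannot tell from the result whether a trailing delimiter was present.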