Compiler for converting pbj schemas into jsonschema, php, js, etc.

v0.4.0 2019-04-06 18:25 UTC




Language Guide

This guide describes how to use XML to define your schema files and how to generate data class files from them.


Let's start by defining each of the elements and key options used across the compiler.

  • Schema: A Schema defines a pbj message: its fields and any related mixins (other schemas used to extend the schema's capabilities).

  • Enum: An Enum is a collection of key-value pairs used in schema fields (see Enumerations below).

  • SchemaId: A schema's fully qualified name (id).

    • Schema Id: pbj:vendor:package:category:message:version
    • Schema Curie Major: vendor:package:category:message:v#
    • Schema Curie: vendor:package:category:message
    • Schema QName: vendor:message
  • SchemaVersion: Similar to semantic versioning but with dashes and no "alpha, beta, etc." qualifiers.

    • Schema Version Format: major-minor-patch
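
The relationship between these identifiers can be sketched in a few lines of Python. This is only an illustration of the formats listed above: the SchemaId class and its property names are hypothetical and are not part of pbjc or its generated code.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaId:
    """Hypothetical holder for the parts of a pbj schema id."""
    vendor: str
    package: str
    category: str
    message: str
    version: str  # major-minor-patch, e.g. "1-0-0"

    @classmethod
    def parse(cls, schema_id: str) -> "SchemaId":
        # Expected shape: pbj:vendor:package:category:message:version
        parts = schema_id.split(":")
        if len(parts) != 6 or parts[0] != "pbj":
            raise ValueError(f"not a schema id: {schema_id!r}")
        if not re.fullmatch(r"[0-9]+-[0-9]+-[0-9]+", parts[5]):
            raise ValueError(f"version must be major-minor-patch: {parts[5]!r}")
        return cls(*parts[1:])

    @property
    def major(self) -> int:
        # First dash-separated number of the version.
        return int(self.version.split("-")[0])

    @property
    def curie(self) -> str:
        return ":".join([self.vendor, self.package, self.category, self.message])

    @property
    def curie_major(self) -> str:
        return f"{self.curie}:v{self.major}"

    @property
    def qname(self) -> str:
        return f"{self.vendor}:{self.message}"
```

For the id pbj:acme:blog:entity:article:1-0-0, this sketch yields the curie acme:blog:entity:article, the curie major acme:blog:entity:article:v1 and the qname acme:article.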

Defining A Schema

First let's look at a very simple example. Let's say you want to define a mixin schema with slug and title fields. Here's the .xml file you use to define the schema.

<?xml version="1.0" encoding="UTF-8" ?>
<pbj-schema xmlns="">
  <schema id="pbj:acme:blog:entity:article:1-0-0" mixin="true">
    <field name="slug" type="string" pattern="/^[A-Za-z0-9_\-]+$/" required="true" />
    <field name="title" type="text" required="true" />
  </schema>
</pbj-schema>

Each schema requires a few basic elements: id and fields. The id is a unique identifier that follows the schema-id format pbj:vendor:package:category:message:version (version = major-minor-patch). The fields element lists the fields used by the schema. In the above example, the article schema contains a slug and a title.
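
The pattern attribute on the slug field is a regular expression. As a rough illustration of which values it accepts, here is a small Python sketch. The surrounding "/" delimiters are dropped because Python's re module does not use them, and is_valid_slug is a hypothetical helper; how pbjc or the generated classes apply the pattern is not shown here.

```python
import re

# Pattern copied from the slug field, without the surrounding "/" delimiters.
SLUG_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+$")


def is_valid_slug(value: str) -> bool:
    """Return True when the value matches the slug pattern (hypothetical helper)."""
    return SLUG_PATTERN.match(value) is not None
```

With this sketch, a value such as my-first_post passes, while an empty string or a value containing spaces does not.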

Since we are creating a mixin schema, we set mixin="true" on the schema element.

In addition, you can add language-specific options, which are used when generating the output files for that language.

Schema Field Types

The following list contains all available field types:

- big-int
- binary
- blob
- boolean
- date
- date-time
- decimal
- dynamic-field
- float
- geo-point
- identifier
- int
- medium-blob
- medium-int
- medium-text
- microtime
- signed-big-int
- signed-int
- signed-medium-int
- signed-small-int
- signed-tiny-int
- small-int
- string
- text
- time-uuid
- timestamp
- tiny-int
- uuid

Default Values

When a schema is parsed, if the encoded schema does not contain a particular singular element, the corresponding field in the parsed object is set to the default value for that field. These defaults are type-specific:

- For strings, the default value is the empty string.
- For bytes, the default value is empty bytes.
- For bools, the default value is false.
- For numeric types, the default value is zero.
- For each of the other field types, the default value is null.
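
The rules above amount to a lookup from field type to default value. The Python sketch below shows that idea only. Which field types count as strings, bytes or numeric types is an assumption made for this example and is not taken from pbjc, and default_for is a hypothetical name.

```python
# Assumed grouping of pbj field types into the categories named above.
# These sets are illustrative guesses, not taken from pbjc itself.
STRING_TYPES = {"string", "text", "medium-text"}
BYTES_TYPES = {"binary", "blob", "medium-blob"}
BOOL_TYPES = {"boolean"}
NUMERIC_TYPES = {
    "big-int", "decimal", "float", "int", "medium-int", "small-int", "tiny-int",
    "signed-big-int", "signed-int", "signed-medium-int", "signed-small-int",
    "signed-tiny-int",
}


def default_for(field_type: str):
    """Return the default value the rules above imply for a field type."""
    if field_type in STRING_TYPES:
        return ""
    if field_type in BYTES_TYPES:
        return b""
    if field_type in BOOL_TYPES:
        return False
    if field_type in NUMERIC_TYPES:
        return 0
    # Every other type in this sketch (date, uuid, geo-point, ...) defaults to null.
    return None
```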


Enumerations

When you're defining a schema, you might want one of its fields to only have one of a pre-defined list of values. For example, let's say you want to add a string enum field whose values come from the acme:blog:publish-status enum.

  <field name="failure_reason" type="string-enum">
    <enum id="acme:blog:publish-status" />
  </field>

Then define the enum in enums.xml:

<enums namespace="acme:blog">
  <enum name="publish-status" type="string">
    <option key="PUBLISHED" value="published" />
    <option key="DRAFT" value="draft" />
    <option key="PENDING" value="pending" />
    <option key="EXPIRED" value="expired" />
    <option key="DELETED" value="deleted" />
  </enum>
</enums>

As the example above shows, the enum's keys and values are defined once under a namespace, and the field references the enum directly by its id.
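
To make the key/value distinction concrete, here is a hypothetical Python equivalent of the publish-status enum. It is not output from pbjc (this guide does not show the generated classes); it only mirrors the option keys and values from enums.xml.

```python
from enum import Enum


class PublishStatus(Enum):
    """Hypothetical mirror of the acme:blog:publish-status enum."""
    PUBLISHED = "published"
    DRAFT = "draft"
    PENDING = "pending"
    EXPIRED = "expired"
    DELETED = "deleted"


def status_from_value(value: str) -> PublishStatus:
    # Enum lookup by value raises ValueError for anything outside the list.
    return PublishStatus(value)
```

The option key becomes the member name, the option value is what a field would store, and looking up a value that is not in the list raises an error.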

Note: You can also define the PHP namespace in which the enum class will be generated.

There are two kinds of enum types, StringEnum and IntEnum. They are kept separate to simplify the field type and its values.

Note: Major databases, for example MySQL and DynamoDB, define enums based on type, either string or int.

Using Message Types

You can use Message and MessageRef as field types. For example, let's say you wanted to include related messages in each Story schema:

<field name="failed_request" type="message">

The any-of attribute defines the message id that will be used to pull the message details.

Full Schema Options

<?xml version="1.0" encoding="UTF-8" ?>
<pbj-schema xmlns=""


        <enum id="{vendor:package:enum}" />

          <!-- ... -->



      <!-- ... -->

<?xml version="1.0" encoding="UTF-8" ?>
<pbj-enums xmlns="">
  <enums namespace="{vendor:package}">
    <enum name="{string}" type="int|string">
      <option key="{string}" value="{string}" />
      <!-- ... -->
    </enum>
  </enums>
</pbj-enums>

Note: For each php-options you can also add dynamic tags. For example:

use Gdbots\Pbj\MessageRef;
/**
 * @param string $tag
 * @return MessageRef
 */
public function generateMessageRef($tag = null)
{
    return new MessageRef(static::schema()->getCurie(), $this->get('command_id'), $tag);
}

Basic Usage

pbjc --language[=LANGUAGE] --config[=CONFIG]

Option                         Notes
-l or --language[=LANGUAGE]    The generated language [default: "php"]
-c or --config[=CONFIG]        The pbjc config yaml file

Define the compile settings in a pbjc.yml file:

  - <vendor1>:<package1>
  - <vendor2>:<package2>

    output: <dir>
    manifest: <dir>/<filename>

Note: By default, the compiler looks for pbjc.yml in the root folder.