dirtsimple/clean-yaml

Dump diff-friendly, readable YAML

v0.1.3 2019-09-08 20:30 UTC


README

While Symfony's Yaml component has a terrific parser and a mostly-spec-compliant dumper, there are times when you really need your YAML output to be readable by a human being and to produce clean diffs when it changes. (For example, both Postmark and Imposer generate YAML output from arbitrary WordPress data, and their diffs are an important part of both revision control and configuration management.)

While JSON (and Symfony's rather JSON-like YAML output) can be diffed to some extent, the diffs tend to be "noisy": filled with extraneous punctuation changes and overly long lines, especially when strings contain multiline text or HTML content.

So this library provides a wrapper for Symfony's YAML dumper, using a different outlining algorithm that prioritizes diffability and readability, while still producing spec-compliant output that's fully round-trippable. Specifically it:

  • Only inlines data structures that can fit on the current line within a specified line length
  • Doesn't inline more than one level of nested data structure, unless all the child structures are empty (e.g. [] and {})
  • Splits strings containing line feeds into multi-line output, with the correct chomping operator (Symfony's DUMP_MULTI_LINE_LITERAL_BLOCK flag only works correctly for strings that end in exactly one \n: no more, no less)
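
To make the chomping rule in the last bullet concrete, here is a minimal sketch of how a block-scalar header could be chosen from a string's trailing line feeds. The function name chompIndicator is hypothetical and this is not clean-yaml's actual code; it only illustrates the YAML rule that `|-` strips the final line break, `|` keeps exactly one, and `|+` keeps them all.

```php
<?php
// Illustrative only: chompIndicator() is a hypothetical helper,
// not a function provided by clean-yaml or Symfony.
function chompIndicator(string $text): string
{
    // Count the line feeds at the very end of the string.
    $trailing = strlen($text) - strlen(rtrim($text, "\n"));

    if ($trailing === 0) {
        return '|-';  // strip: the value has no final line break
    }
    if ($trailing === 1) {
        return '|';   // clip: the value ends in exactly one line break
    }
    return '|+';      // keep: all trailing line breaks are preserved
}

echo chompIndicator("line one\nline two"), "\n";      // |-
echo chompIndicator("line one\nline two\n"), "\n";    // |
echo chompIndicator("line one\nline two\n\n"), "\n";  // |+
```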

Following these rules produces output that is still spec-compliant, but which avoids long lines of inlined data, except where the long lines are themselves part of a string. (The trade-off is that the outputs produced are invariably larger in both number of lines and total file size than those created by Symfony, since fewer things are inlined, and thus more linefeeds and indentation are included.)
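
The two inlining rules can be sketched as a single check. This is only an approximation under stated assumptions, not the library's implementation: canInline is a hypothetical name, the already-rendered inline form is passed in rather than computed, only arrays are treated as nested structures, and "fits" is assumed to mean that the used column plus the inline text does not exceed the width.

```php
<?php
// Hypothetical helper, not part of clean-yaml's API.
// $inlineForm: the structure as it would appear when inlined.
// $column: characters already used on the line (indent and key, if any).
function canInline(array $data, string $inlineForm, int $column, int $width = 120): bool
{
    foreach ($data as $child) {
        // A non-empty nested array would require a second level of inlining.
        if (is_array($child) && $child !== []) {
            return false;
        }
    }
    // Assumed meaning of "fits": the whole line stays within the width.
    return $column + strlen($inlineForm) <= $width;
}

$tags = ['these', 'are', 'random', 'tags'];
$form = '[ these, are, random, tags ]';  // 28 characters

var_dump(canInline($tags, $form, 10));                           // bool(true)
var_dump(canInline($tags, $form, 10, 30));                       // bool(false)
var_dump(canInline(['a' => ['b' => 1]], '{ a: { b: 1 } }', 0));  // bool(false)
var_dump(canInline(['a' => []], '{ a: [] }', 0));                // bool(true)
```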

To use this library, require the dirtsimple/clean-yaml package with Composer and add use dirtsimple\CleanYaml; to your code, then call CleanYaml::dump($data), optionally passing extra arguments for the width (120 by default) and indent size (2 by default). The return value is a string containing a complete YAML document that always includes a trailing newline, even if the top-level value is a scalar.
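
A minimal usage sketch, assuming a Composer-managed project whose autoloader lives at vendor/autoload.php (that path and the sample data are assumptions for illustration):

```php
<?php
// Assumes dirtsimple/clean-yaml has been installed with Composer.
require 'vendor/autoload.php';

use dirtsimple\CleanYaml;

// Arbitrary sample data for illustration.
$data = [
    'id'       => '5509c1f',
    'elType'   => 'widget',
    'elements' => [],
];

// Default width (120) and indent (2); the result is a complete YAML document.
$yaml = CleanYaml::dump($data);
echo $yaml;
```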

There are no flags to control the output: the library's behavior is roughly equivalent to using Symfony's DUMP_EXCEPTION_ON_INVALID_TYPE | DUMP_OBJECT_AS_MAP | DUMP_MULTI_LINE_LITERAL_BLOCK | DUMP_EMPTY_ARRAY_AS_SEQUENCE, except that various quirks in the last two flags' behavior are fixed. (Symfony's output for literal blocks lacks proper chomp settings, and it sometimes dumps nested empty arrays as objects even though you've asked it not to; CleanYaml, on the other hand, treats a top-level empty array as a map, and all other empty arrays as sequences.)
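
For comparison, the roughly equivalent call to Symfony's dumper would look something like the sketch below. The inline depth of 4, the indent of 2, and the sample data are arbitrary values chosen for illustration; Symfony's second argument is a nesting depth at which it switches to inline output, not a line width.

```php
<?php
// Assumes symfony/yaml has been installed with Composer.
require 'vendor/autoload.php';

use Symfony\Component\Yaml\Yaml;

// Arbitrary sample data for illustration.
$data = ['settings' => ['text' => "two\nlines\n", 'elements' => []]];

// The flag combination named above.
$flags = Yaml::DUMP_EXCEPTION_ON_INVALID_TYPE
       | Yaml::DUMP_OBJECT_AS_MAP
       | Yaml::DUMP_MULTI_LINE_LITERAL_BLOCK
       | Yaml::DUMP_EMPTY_ARRAY_AS_SEQUENCE;

// Yaml::dump($input, $inline, $indent, $flags)
$yaml = Yaml::dump($data, 4, 2, $flags);
echo $yaml;
```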

The second argument to dump() is the number of characters wide you'd like the output to be. Data structures will be inlined if they can fit within this space, including the current indent and key, if any. Lines will still exceed this size for strings that are too long or wide to fit the space, or if the indentation gets too deep. The third argument is the number of spaces by which each nesting level will indent.
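
As a sketch of the optional arguments, the call below asks for a width of 40 characters and a 4-space indent. The autoloader path and the sample data are assumptions for illustration.

```php
<?php
// Assumes dirtsimple/clean-yaml has been installed with Composer.
require 'vendor/autoload.php';

use dirtsimple\CleanYaml;

// Arbitrary sample data for illustration.
$data = [
    'settings' => [
        'tags'  => ['these', 'are', 'random', 'tags'],
        'align' => 'center',
    ],
];

// Second argument: target line width. Third argument: spaces per nesting level.
$yaml = CleanYaml::dump($data, 40, 4);
echo $yaml;
```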

Here's some sample output with the default settings:

-
  id: 5509c1f
  elType: widget
  settings:
    title: 'Free download: No credit card required'
    tags: [ these, are, random, tags, that, fit, 'on', this, line ]
    text: |-
      Here is some indented text that is on multiple lines.  It doesn't rewrap
      because the line feeds are as originally given, and clean-yaml doesn't
      do folding, because that would introduce additional noise into diffs that
      didn't directly come from changes to the underlying data.  The last line
      doesn't end with a `\n`, so the `|-` chomp operator is used.
    align: center
    typography_typography: custom
    typography_font_weight: bold
    background_color: '#23a455'
    border_radius: { unit: px, top: '18', right: '18', bottom: '18', left: '18', isLinked: true }
  elements: []
  timestamp: 2016-05-27T00:00:00+00:00
  not_a_timestamp: '2016-05-27T00:00:00+00:00'

And now the same data, but with a narrower width (40), and wider indent (4):

-
    id: 5509c1f
    elType: widget
    settings:
        title: 'Free download: No credit card required'
        tags:
            - these
            - are
            - random
            - tags
            - that
            - fit
            - 'on'
            - this
            - line
        text: |-
            Here is some indented text that is on multiple lines.  It doesn't rewrap
            because the line feeds are as originally given, and clean-yaml doesn't
            do folding, because that would introduce additional noise into diffs that
            didn't directly come from changes to the underlying data.  The last line
            doesn't end with a `\n`, so the `|-` chomp operator is used.
        align: center
        typography_typography: custom
        typography_font_weight: bold
        background_color: '#23a455'
        border_radius:
            unit: px
            top: '18'
            right: '18'
            bottom: '18'
            left: '18'
            isLinked: true
    elements: []
    timestamp: 2016-05-27T00:00:00+00:00
    not_a_timestamp: '2016-05-27T00:00:00+00:00'