A config file format for humans.
DMS — Data Meta Syntax — aims to be a data syntax with YAML's clean look, TOML's small strict spec, and one extra superpower: comments survive parse → modify → re‑emit, in every reference parser, in every language.
+++
title: "DMS feature tour"
version: "1.0.0"
updated: 2026-04-24T09:30:00-04:00
+++
# Line comments take # or //, your pick.
// Bare keys allow full Unicode. Heredocs are first‑class.
database:
  host: "db.internal"
  port: 5432 # raised after the LB change
  pool: { size: 10, idle_timeout_s: 30 }
servers:
  + name: "web1"
    disks:
      + mount: "/"
        size_gb: 100
      + mount: "/var"
        size_gb: 500
  + name: "web2"
sql: """SQL _trim("\n", ">")
SELECT id, email
FROM users
WHERE active = true
SQL
regions: ["us-east-1", "eu-west-1", "ap-south-1"]
Indent-based, one rule
Siblings under a parent must share one indent width. Tabs are banned in structural indent. That is the entire indent specification.
Distinct types, never inferred
Quoted is a string. Bare digits are a number. true is a boolean. There is no NO-becomes-false, no octal-from-leading-zero, no Norway problem.
Comments are first-class
Leading, trailing, and floating comments are AST nodes. They survive parse, mutation, and emit — by spec, in every reference parser.
Polymorphic root
A document can be a table, a list, a scalar, or empty. Want to serialize a list? Just write a list — no wrapper key, no ceremony.
Heredocs with modifiers
Labelled heredocs in basic and literal flavors, with chainable modifiers like _trim and _fold_paragraphs. Every YAML block-scalar mode is one combo away.
Thirteen first-party parsers
Rust, C, Go, Zig, Python, Perl, JavaScript, C#, Ruby, Java, PHP, Lua, Crystal. All hold at 4695 / 4695 on the shared conformance corpus.
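The one-rule indent specification above can be pictured with a small checker. This is a toy sketch in Python, not the DMS grammar: the function name `check_indent` is hypothetical, it knows nothing about heredocs or block comments, and it only enforces the two statements from the text (no tabs in structural indent, and a line that dedents must land on an indent width that is already open).

```python
def check_indent(text: str) -> list[str]:
    """Return a list of indent errors; an empty list means the text passes."""
    errors = []
    stack = [0]  # indent widths of the currently open levels
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no structure
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        if "\t" in indent:
            errors.append(f"line {lineno}: tab in indentation")
            continue
        width = len(indent)
        if width > stack[-1]:
            stack.append(width)  # deeper line opens a new level
        else:
            while stack[-1] > width:
                stack.pop()  # close levels deeper than this line
            if stack[-1] != width:
                errors.append(f"line {lineno}: indent {width} matches no open level")
                stack.append(width)
    return errors


if __name__ == "__main__":
    print(check_indent("a:\n  b: 1\n  c: 2\nd: 3\n"))
    print(check_indent("a:\n    b: 1\n  c: 2\n"))
```

The second call reports line 3, because an indent of 2 is neither the parent's width (0) nor the sibling's (4).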
Why a new config format?
Same data, three formats. The differences aren't cosmetic.
servers:
  + name: "web1"
    disks:
      + mount: "/"
      + mount: "/var"
  + name: "web2"
countries: ["US", "UK", "NO"]
version: "1.0" # quoted = string
port: 11 # bare digits = decimal
octal_perms: 0o644 # explicit prefix
YAML's shape. TOML's strictness. + for list items so paths never repeat. One indent rule, distinct types, a small spec you can read over lunch.
servers:
  - name: web1
    disks:
      - mount: /
      - mount: /var
  - name: web2
countries:
  - US
  - UK
  - NO # silently false
version: 1.0 # float, not "1.0"
port: 011 # octal 9 in YAML 1.1
Clean shape. Eighty-five-page spec, plain scalars that type-coerce on what they look like, anchors and merge keys baked into the data model.
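To make the coercions in the YAML sample concrete, here is a minimal Python sketch of how a YAML 1.1 style resolver treats plain scalars. It is a simplified subset written for illustration (the name `resolve_yaml11_plain` is hypothetical, and real YAML libraries cover more types), but the three cases from the sample behave the same way.

```python
import re

# YAML 1.1 boolean words; each is accepted in lower, Capitalized and UPPER form.
_BOOL = {
    "y": True, "yes": True, "on": True, "true": True,
    "n": False, "no": False, "off": False, "false": False,
}
_OCTAL = re.compile(r"[-+]?0[0-7_]+")
_INT = re.compile(r"[-+]?(0|[1-9][0-9_]*)")
_FLOAT = re.compile(r"[-+]?[0-9][0-9_]*\.[0-9_]*")


def _is_bool_spelling(s: str) -> bool:
    low = s.lower()
    return low in _BOOL and s in (low, low.capitalize(), low.upper())


def resolve_yaml11_plain(s: str):
    """Guess a type from what an unquoted scalar looks like."""
    if _is_bool_spelling(s):
        return _BOOL[s.lower()]
    if _OCTAL.fullmatch(s):
        return int(s.replace("_", ""), 8)  # leading zero means octal
    if _INT.fullmatch(s):
        return int(s.replace("_", ""))
    if _FLOAT.fullmatch(s):
        return float(s.replace("_", ""))
    return s  # anything else stays a string


if __name__ == "__main__":
    for scalar in ["US", "UK", "NO", "1.0", "011"]:
        print(scalar, "->", repr(resolve_yaml11_plain(scalar)))
```

The country code NO comes back as a boolean, 1.0 as a float, and 011 as the integer 9, which is the behaviour the comments in the sample point at.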
[[servers]]
name = "web1"
[[servers.disks]]
mount = "/"
[[servers.disks]]
mount = "/var"
[[servers]]
name = "web2"
Tight grammar, unambiguous types. Then you nest something and the path repeats five times. No block comments, no list-or-scalar root, no heredocs.
A quick tour
Every feature DMS has, in one short scroll.
Comments — five forms, mix freely
# line comment, hash style
// line comment, slash style — both equivalent
port: 8080 # trailing on the same line
/* C-style block — inline or multi-line.
/* Nests, so commenting out commented code just works. */
Closes on the matching */
host: /* inline, between key and value */ "db.internal"
###
Hash-block, unlabeled. Closes on a bare ### line.
###
###NOTE
Hash-block, labeled. Closes on a line equal to NOTE.
Pick any label when the body contains stray */ or ###.
NOTE
Strings — basic, literal, heredoc
basic: "escapes are processed: \n \t é"
literal: 'C:\Users\ada — backslashes stay backslashes'
# Strip trailing newlines (YAML's |-):
sql: """SQL _trim("\n", ">")
SELECT id, email
FROM users
WHERE active = true
SQL
# Fold paragraphs (YAML's >+):
prose: """DOC _fold_paragraphs()
The quick brown fox
jumps over the lazy dog.
Sphinx of black quartz,
judge my vow.
DOC
# Literal heredoc — no escape processing:
regex: '''RE
^[A-Za-z_][A-Za-z0-9_]*$
RE
Numbers — base prefixes, no inference
port: 8080 # always decimal integer
mask: 0xFF_00_00 # hex with digit underscores
perms: 0o644 # octal
flags: 0b1010_0101 # binary
ratio: 0.42 # decimal float
hex_float: 0x1.8p3 # hex float (= 12.0)
sentinel: nan # also: inf, -inf
ready: true
shutdown: false # not "no", not "off"
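The values in the number samples above can be checked with Python, whose own literal rules happen to use the same prefixes. This is a sketch that leans on Python's rules, not on the DMS grammar; `parse_number` is a hypothetical helper and the two grammars may differ in details not shown here.

```python
def parse_number(text: str):
    """Parse a numeric literal using Python's prefix rules (base 0)."""
    low = text.lower()
    unsigned = low.lstrip("+-")
    if unsigned.startswith("0x") and "p" in unsigned:
        return float.fromhex(text)  # hex float such as 0x1.8p3
    if "." in low or unsigned in ("nan", "inf"):
        return float(text)
    # Base 0 honours 0x / 0o / 0b prefixes and digit underscores,
    # and rejects ambiguous leading zeros such as "011".
    return int(text, 0)


if __name__ == "__main__":
    for literal in ["8080", "0xFF_00_00", "0o644", "0b1010_0101", "0.42", "0x1.8p3"]:
        print(literal, "->", parse_number(literal))
    try:
        parse_number("011")
    except ValueError as exc:
        print("011 rejected:", exc)
```

Rejecting a bare leading zero instead of guessing a base is the same design choice the "no octal-from-leading-zero" rule describes.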
Dates & times — first-class
deployed: 2026-04-24T09:30:00-04:00 # offset datetime
window: 2026-04-24T09:30:00 # local datetime
release: 2026-04-24 # local date
cutover: 09:30:00 # local time
Lists & tables — block and flow
# block list with "+" — one character, never doubled brackets
servers:
  + name: "web1"
    port: 443
  + name: "web2"
    port: 443
# flow list (inline)
regions: ["us-east-1", "eu-west-1", "ap-south-1"]
# block table (indent)
database:
  host: "db.internal"
  port: 5432
# flow table (inline)
cache: { host: "redis", db: 0 }
Front matter — for document metadata
+++
app_name: "myservice"
doc_version: "1.2.3"
updated: 2026-04-23
+++
# the actual document body starts here
database:
  host: "db.internal"
  port: 5432
Polymorphic root — list, table, scalar, or empty
# all three are valid DMS documents:
# 1. a table
title: "production"
# 2. a bare list
+ "apples"
+ "oranges"
# 3. a single scalar
42
Unicode keys — bare, no quoting required
résumé: "ada.pdf"
こんにちは: "hello"
"path with space": "/etc/dms.conf" # quote only the unusual
Fast, too
DMS was designed for readability, but the parsers turn out to be
quick. On a real production Helm chart's values.yaml
(kube-prometheus-stack 84.3.0,
4–5k kvpairs, ~25 KB) DMS beats YAML in most cells, decode and
encode; in the tables below, YAML is ahead only on Zig (decode and
encode) and on PHP decode. Measured in lite mode.
Decode (parse only)
Each driver reads stdin, parses to an in-memory tree, prints
ok\n. Cells are median wall time across 15 timed
iterations after 2 warmup — fresh process per iter, includes
startup. Lower is better.
| Language | DMS | YAML | TOML | JSON |
|---|---|---|---|---|
| C | 5.62 ms | 6.61 ms | 5.64 ms | 5.66 ms |
| Zig | 5.55 ms | 5.05 ms | 4.46 ms | 4.58 ms |
| Rust | 7.27 ms | 7.52 ms | 6.89 ms | 6.27 ms |
| Go | 8.34 ms | 9.62 ms | 9.27 ms | 8.19 ms |
| Crystal | 12 ms | 13 ms | 370 ms ✱ | 12 ms |
| Lua | 16 ms | 21 ms | 26 ms | 8.6 ms |
| Perl (DMS-XS) | 24 ms | 27 ms | 177 ms | 27 ms |
| Perl (pure) | 33 ms | — | — | — |
| C# | 44 ms | 79 ms | 61 ms | 45 ms |
| Python (dms_c) | 45 ms | 65 ms | 49 ms | 44 ms |
| Node | 49 ms | 65 ms | 57 ms | 42 ms |
| Python (pure) | 56 ms | — | — | — |
| PHP | 63 ms | 62 ms | 78 ms | 47 ms |
| Java | 125 ms | 188 ms | 296 ms | 237 ms |
| Ruby | 132 ms | 154 ms | 197 ms | 121 ms |
The C and Zig DMS rows shed parse cost via an ASCII fast-path on
the source-level NFC normalization: utf8proc's
utf8proc_map walks every byte and allocates a fresh
buffer regardless of input (~420 µs on this 25 KB
fixture), but when the source is pure ASCII — the common case
for config files — NFC is a no-op. A single byte scan catches
that case and skips the heavy call.
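The fast path described above is easy to picture in a few lines. The sketch below uses Python's standard library in place of utf8proc, so it illustrates the idea and is not the C or Zig implementation; the function name is hypothetical.

```python
import unicodedata


def nfc_with_ascii_fast_path(source: bytes) -> str:
    """Return the NFC-normalized text, skipping normalization for pure ASCII."""
    if source.isascii():
        # ASCII text is already in NFC, so one byte scan is all it costs.
        return source.decode("ascii")
    # Slow path: decode and run full NFC normalization.
    return unicodedata.normalize("NFC", source.decode("utf-8"))


if __name__ == "__main__":
    print(nfc_with_ascii_fast_path(b'port: 8080\n'))
    decomposed = "re\u0301sume\u0301".encode("utf-8")  # e + combining acute
    print(nfc_with_ascii_fast_path(decomposed))        # composed form
```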
†† Python/JSON: stdlib json parsed
inside the startup-probe noise floor on this fixture — best-of-N
parse time landed under the probe minimum, so the subtraction
clamps to ~0.
✱ Crystal/TOML: manastech/crystal-toml
has a wide-table pathology — 26 KB takes 351 ms. The DMS
port in that same language doesn't have the issue.
The Python (pure) and Perl (pure) rows show DMS only —
pure-language peers for the other formats either don't exist
meaningfully (stdlib JSON is C; PyYAML defaults to libyaml) or are
the same parser everyone uses (tomli is pure, but it's
the tomli). The split exists to show what the C extension
buys you on top of the pure-language parser. The gap is modest now —
Python ~1.2× (dms_c over pure-Python), Perl ~1.4×
(DMS-XS over pure-Perl on this fixture). Pure-Perl closed most of
the previous gap via a mega-regex fast path for lite-mode parsing
that bypasses the recursive-descent walker; algorithmic parse cost
is now ~5 ms in-process, beating pure-Python's ~8 ms.
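For readers who want to reproduce the shape of the decode measurement, here is a minimal harness sketch: warmup runs first, then timed runs, then the median, with a fresh process per iteration. It is an assumption-laden illustration and not the project's benchmark script; `median_wall_ms` and `spawn_driver` are hypothetical names, and the driver command is whatever binary you want to time.

```python
import statistics
import subprocess
import time


def median_wall_ms(run_once, iters=15, warmup=2):
    """Median wall time in ms of run_once() over `iters` runs after `warmup` untimed runs."""
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def spawn_driver(cmd, payload: bytes):
    """Build a callable that starts `cmd` fresh, feeds `payload` on stdin, expects ok\\n."""
    def run_once():
        result = subprocess.run(cmd, input=payload, capture_output=True, check=True)
        if result.stdout != b"ok\n":
            raise RuntimeError(f"driver printed {result.stdout!r}, expected b'ok\\n'")
    return run_once


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any parser installed.
    print(f"{median_wall_ms(lambda: sum(range(10_000))):.3f} ms")
```

Because each timed call spawns a new process, the number includes interpreter or runtime startup, which is why the slower-starting languages cluster at the bottom of the table.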
Encode (in-memory → text)
Each driver parses stdin once (untimed warmup), then loops the
serialise step. Cells are median wall time of the timed loop
(20 iters, 3 warmup). DMS encode = encode_lite —
canonical-form output with no comment / original-form preservation.
| Language | DMS | YAML | TOML | JSON |
|---|---|---|---|---|
| C | 0.07 ms | 1.0 ms ‡ | n/a ‡ | 0.05 ms |
| Perl (DMS-XS) | 0.08 ms | 1.1 ms § | 10 ms ¶ | 2.2 ms |
| Node | 0.22 ms | 0.99 ms | 0.87 ms | 0.08 ms |
| Crystal | 0.24 ms | 0.53 ms | 0.12 ms | 0.08 ms |
| Rust | 0.32 ms | 0.42 ms | 0.35 ms | 0.02 ms |
| Zig | 0.46 ms | 0.02 ms ω | 0.03 ms | 0.03 ms |
| PHP | 0.55 ms | 2.6 ms | 5.9 ms | 0.04 ms |
| C# | 0.56 ms | 7.9 ms | 1.2 ms | 0.15 ms |
| Python | 0.70 ms | 26 ms | 1.4 ms | 0.11 ms |
| Ruby | 0.82 ms | 4.3 ms | 7.3 ms | 0.13 ms |
| Go | 1.0 ms | 1.6 ms | 0.53 ms | ε |
| Lua | 1.0 ms | 6.0 ms | ε | ε |
| Java | 1.3 ms | 5.4 ms | 1.1 ms | 0.22 ms |
| Perl (pure) | 2.2 ms | 25 ms § | 10 ms ¶ | 2.2 ms |
‡ C: no tomlc99 emit driver wired
into bench_encoders.py yet (TOML parse-only is
wired). C/YAML emit goes through yaml_emitter_dump,
which libyaml ties to a yaml_parser_load callback —
the timed loop pays parse + emit each iter, not pure emit. The
number reflects what an application actually pays per serialise,
but is not directly comparable to the other rows' "already-loaded
structure → text" measurements.
§ Perl YAML splits on the library: YAML.pm
(pure) vs. YAML::XS (XS).
¶ Perl TOML: only TOML::Tiny is widely
used; the same encoder backs both Perl rows.
ε Cells reading 0 ms are broken-driver sentinels —
the encoder loop short-circuited rather than serializing.
ω Zig/YAML emit at 22 µs is implausibly fast on
a 28 KB output; the zig-yaml emitter likely skips
work the others don't.
The companion fixtures are normalized per format
so every parser actually parses, since real-world data exposed
bugs in five of the comparison libraries:
kubkon/zig-yaml errors on
key: {} followed by a sibling block-mapping opener
and on plain scalars containing a double quote;
sam701/zig-toml errors on consecutive empty inline
arrays; lua-toml rejects [[a.b.c]]
array-of-tables headers when an intermediate parent has been
declared; yosymfony/toml rejects inline-tables-inside-arrays;
and manastech/crystal-toml decode time still scales
pathologically with table width even on the small fixture
(26 KB takes 351 ms). gen_companions.py
drops the offending shapes per format so the bench cells return
numbers rather than errors. The DMS port in each of these
languages handles every shape unaltered.
Lite mode
Every DMS port ships a lite mode as of 0.2.0
(SPEC §Parsing modes — full and lite).
Same parser, same grammar, same errors, same data tree — but
with comment-AST construction and original_forms
recording switched off. For read-only consumers (config loaders,
deploy pipelines, CI scripts) that never call encode,
that machinery is dead weight; lite mode skips it.
The chart above is lite mode. Switch to
full mode and the parser earns its keep: it
returns a tree where every value carries its source formatting
and attached comments, ready to round-trip via
encode. Same fixture, same comment density — but
now full pays the comment-attachment cost and lite skips it,
so the delta is exactly the preservation tax:
| Language | Full | Lite | Lite is … |
|---|---|---|---|
| Rust | 9.91 ms | 9.36 ms | 6% faster |
| Go | 10.31 ms | 9.62 ms | 7% faster |
| C | 10.79 ms | 10.27 ms | 5% faster |
| Zig | 11.75 ms | 10.89 ms | 7% faster |
| Crystal | 15.10 ms | 14.14 ms | 6% faster |
| Lua | 51.52 ms | 38.31 ms | 26% faster |
| C# | 60.61 ms | 53.29 ms | 12% faster |
Median wall time, includes startup. Same machine, same fixture,
same parser per row — only the parse mode changes. The 5–7 %
wall-time wins on the fast rows look small but read very differently
once you subtract startup: parse-only, Rust full→lite is
3.76 ms → 2.04 ms (46 %),
Crystal is 2.67 → 1.39 (48 %),
Go is 2.04 → 1.53 (25 %) — the
comment-AST and original_forms bookkeeping is a
genuinely non-trivial slice of pure parse cost in languages that
finish in single-digit milliseconds. On the slower rows
(Python / PHP / Java / Ruby) the
delta gets buried under interpreter startup variance, so they're
omitted here rather than dressed up. Lite mode is opt-in per call:
parse_lite_document(src) /
parseLiteDocument(src) /
decode_document(src, mode="lite") depending on the
port's idiom. Full mode remains the default and the conformance
floor.
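The percentages quoted in this section follow from the raw timings with one line of arithmetic. The sketch below recomputes them from the numbers given above; `pct_faster` is a hypothetical helper name.

```python
def pct_faster(full_ms: float, lite_ms: float) -> int:
    """Share of full-mode time that lite mode saves, rounded to a whole percent."""
    return round((full_ms - lite_ms) / full_ms * 100)


# Wall-time medians from the Full / Lite table (startup included).
wall = {
    "Rust": (9.91, 9.36), "Go": (10.31, 9.62), "C": (10.79, 10.27),
    "Zig": (11.75, 10.89), "Crystal": (15.10, 14.14),
    "Lua": (51.52, 38.31), "C#": (60.61, 53.29),
}
# Parse-only figures quoted in the text (startup subtracted).
parse_only = {"Rust": (3.76, 2.04), "Crystal": (2.67, 1.39), "Go": (2.04, 1.53)}

if __name__ == "__main__":
    for name, (full, lite) in wall.items():
        print(f"{name:8} wall       {pct_faster(full, lite)}% faster")
    for name, (full, lite) in parse_only.items():
        print(f"{name:8} parse-only {pct_faster(full, lite)}% faster")
```

The same formula yields 6% for Rust on wall time and 46% once startup is subtracted, which is the gap the paragraph above describes.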
Get DMS in your language
Thirteen first-party parsers. All hold at 4695 / 4695 on the shared conformance corpus.
| Package | Install |
|---|---|
| dms-rs | cargo add dms-rs |
| dms-py + dms-c | pip install dms-py |
| dms-js | npm install dms-js |
| dms-go | go get gitlab.com/flo-labs/pub/dms-go |
| DMS + DMS-XS-Parser | cpanm DMS |
| dms-c | git clone …/dms-c |
| dms-zig | zig fetch …/dms-zig |
| Dms | dotnet add package Dms |
| dms | gem install dms |
| dev.flolabs:dms | mvn add dev.flolabs:dms |
| flo-labs/dms | composer require flo-labs/dms |
| dms | luarocks install dms |
| dms | shards install dms |
| dms-tests | git clone …/dms-tests |
Tier 1 — decorators and dialects
Structured annotations on top of the tier-0 value tree.
Tier 1 layers structured decorators on top of the tier-0 grammar. A
decorator call is a sigil-prefixed function-shaped annotation
(|tag(class: "lede"), 🚀deploy(env: "prod")) attached to any
value-tree node. The decoration sidecar is parallel to the value
tree — consumers walk one or both as they need.
Dialects bind sigils to families of decorators. dms+html carries HTML-shaped documents; dms+hcl carries HCL/Terraform configs; dms+kdl node-shaped data; dms+ron tagged ADTs; dms+k8s Kubernetes manifests with auto-mirrored labels and selectors. Reserved-emoji codepoints (🚀, 🇺🇸, 🏷️, 🔒, …) are first-class sigil atoms alongside ASCII.
Comments survive round-trip
Configuration files are documentation. A comment on
port: 8080 # raised after the LB change in 2024-Q4
exists so someone in 2026 can read it. If the first formatter or deploy template renderer drops it, the documentation was a lie.
DMS makes comments first-class AST nodes attached to the value tree at one of four positions.
encode(decode(source)) walks the tree and writes them back where they belong. Modify the data (rename a key, sort a list, delete a server) and the comments on the still-present nodes travel with them. Round-trip is byte-stable on the second pass.
The line above a node, no blank line between. Stacks if you write more than one.
Same line as the value, after it. A line comment ends the line; /* */ blocks can stack inline.
Between a key's : and its value. /* … */ only; line comments would eat the value.
Block-final note, separated from siblings by a blank line. Stays attached to the parent, not the last child.
YAML: ruamel.yaml only (Python), opt-in, slower
TOML: toml-edit crate only (Rust), separate value type
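The idea that comments travel with their nodes can be shown with a toy model. The Python sketch below is not the DMS AST or any DMS API: `Entry`, `decode` and `encode` are hypothetical, it handles only flat key/value lines, it models just the leading and trailing positions, and it would be confused by a " #" inside a quoted string. What it does show is comments stored on the node itself, so that reordering the data moves them too.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    key: str
    value: str
    leading: list = field(default_factory=list)  # comment lines directly above
    trailing: str = ""                           # comment after the value


def decode(text: str) -> list:
    entries, pending = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # toy: blank lines are not preserved
        if stripped.startswith("#"):
            pending.append(stripped)  # attaches to the next entry
            continue
        pair, _, comment = stripped.partition(" #")
        key, _, value = pair.partition(":")
        trailing = "#" + comment if comment else ""
        entries.append(Entry(key.strip(), value.strip(), pending, trailing))
        pending = []
    return entries


def encode(entries: list) -> str:
    lines = []
    for e in entries:
        lines.extend(e.leading)
        tail = f" {e.trailing}" if e.trailing else ""
        lines.append(f"{e.key}: {e.value}{tail}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    src = '# primary\nport: 8080 # raised after the LB change\nhost: "db.internal"\n'
    tree = decode(src)
    assert encode(tree) == src            # unchanged data round-trips
    tree.sort(key=lambda e: e.key)        # mutate: sort keys
    print(encode(tree), end="")           # comments moved with "port"
```

After the sort, both the leading and the trailing comment are still attached to port, even though host now comes first.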