I’ve often heard this (YAML is a superset of JSON) but never looked into the det...

kelnos · on Jan 24, 2022

CPAN link provided by the parent says 1.2 still isn't a superset:

> Addendum/2009: the YAML 1.2 spec is still incompatible with JSON, even though the incompatibilities have been documented (and are known to Brian) for many years and the spec makes explicit claims that YAML is a superset of JSON. It would be so easy to fix, but apparently, bullying people and corrupting userdata is so much easier.

terom · on Jan 24, 2022

Are these documented YAML 1.2 JSON incompatibilities listed / linked to somewhere?

I assume these are something related to non-ascii string encoding / escapes?

guelo · on Jan 24, 2022

They are listed in that same CPAN link

"Please note that YAML has hardcoded limits on (simple) object key lengths that JSON doesn't have and also has different and incompatible unicode character escape syntax... YAML also does not allow \/ sequences in strings"

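As a quick check of the JSON side of the `\/` claim, here is a minimal Python sketch using only the standard `json` module. It shows that a JSON parser accepts `\/` as an escape for `/`. Whether a given YAML parser accepts the same input depends on the parser and the YAML version it implements, which this sketch does not test. The URL is a made-up sample value.

```python
import json

# Raw JSON text containing the two-character escape sequence \/ .
# The URL is a hypothetical sample value.
raw = r'"http:\/\/example.com\/path"'

# JSON defines \/ as an escape for a plain forward slash.
decoded = json.loads(raw)
print(decoded)

# The escape is optional on output: Python's encoder emits a bare slash.
print(json.dumps(decoded))
```
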
DHowett · on Jan 24, 2022

The JSON::XS documentation linked above reports that YAML 1.2 is not a strict superset of JSON:

> Addendum/2009: the YAML 1.2 spec is still incompatible with JSON

The author also details their issues in, ah, getting some of the authors of the YAML specification to agree.

vbezhenar · on Jan 24, 2022

I just checked YAML 1.2 and it seems the 1024-character length limit on keys is still in the spec (https://yaml.org/spec/1.2.2/, ctrl+f, 1024). So any JSON with long keys is not compatible with YAML.

twic · on Jan 24, 2022

The JSON specification [1] states:

> An implementation may set limits on the length and character contents of strings.

So this length limit is not a source of incompatibility with JSON.

[1] https://datatracker.ietf.org/doc/html/rfc7159#section-9

eatonphil · on Jan 24, 2022

Wow! That makes it pretty hard to know you've generated useful JSON, especially if your goal is cross-ecosystem communication.

forty · on Jan 24, 2022

To be fair, any JSON implementation is going to have a practical limit on the key size, it's just a bit more random and harder to figure out :)

hvdijk · on Jan 24, 2022

If you mean limited by available memory, then sure but that does not apply just to key size. If you mean something else, could you elaborate?

ionicgiraffe · on Jan 24, 2022

Another reason to have a limit well below the computer's memory capacity is that one could find ill-formed documents in the wild, e.g., an unclosed quotation mark, causing the "rest" of a potentially large file to be read as a key, which can quickly snowball (imagine if you need to store the keys in a database, in a log, if your algorithms need to copy the keys, etc.)
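
A minimal sketch of such a limit, assuming Python's standard `json` module. The names `MAX_KEY_LEN`, `limited_pairs`, and `parse_with_key_limit` are hypothetical, and the 1024 threshold is picked only for illustration. The check runs after the decoder has already read each key, so it bounds what gets passed downstream (databases, logs, copies) rather than memory use during parsing.

```python
import json

MAX_KEY_LEN = 1024  # assumed limit, chosen here only for illustration


def limited_pairs(pairs):
    """Build a dict from decoded (key, value) pairs, rejecting overlong keys."""
    for key, _ in pairs:
        if len(key) > MAX_KEY_LEN:
            raise ValueError(f"key of length {len(key)} exceeds {MAX_KEY_LEN}")
    return dict(pairs)


def parse_with_key_limit(text):
    # object_pairs_hook is called for every JSON object once it is decoded,
    # including nested ones.
    return json.loads(text, object_pairs_hook=limited_pairs)
```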

forty · on Jan 24, 2022

I assume JSON implementations have some limit on the key size (or on the whole document, which limits the key size), hopefully far below the available memory.

hvdijk · on Jan 24, 2022

I assume and hope that they do not, if there is no rule stating that they are invalid. There are valid reasons for JSON to have massive keys. A simple one: depending on the programming language and libraries used, an unordered array ["a","b","c"] might be better mapped as a dictionary {"a":1,"b":1,"c":1}. Now all of your keys are semantically values, and any limit imposed on keys only makes sense if the same limit is also imposed on values.
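
To make the array-as-dictionary point concrete, here is a small Python sketch using only the standard `json` module. The sample values are made up, with one deliberately longer than 1024 characters. It only shows what Python's own JSON implementation does with such a key and says nothing about other parsers.

```python
import json

# Hypothetical values; the last one is far longer than 1024 characters.
values = ["a", "b", "x" * 5000]

# Encode the unordered collection as an object whose keys are the members.
as_object = {v: 1 for v in values}
text = json.dumps(as_object)

# Python's json module round-trips the long key unchanged.
restored = set(json.loads(text))
print(max(len(k) for k in restored))
```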

forty · on Jan 24, 2022

Yes absolutely, in practice the limit seems to be on the document size rather than on keys specifically. That said, it still sets a limit on the key size (to something a bit less than the max full size), and some JSON documents valid for a given JSON implementation might not be parsable by others, in which case the YAML parsers are no exception ;)

I'm not even sure why I'm playing the devil's advocate, I hate Yaml actually :D

01acheru · on Jan 24, 2022

I guess it is about different implementations of some not properly formalized parts of the JSON spec.

There was also an article here some time ago but I cannot find it right now.

clcaev · on Jan 24, 2022

1024 limit is for unquoted keys, which do not occur in JSON

tinita · on Jan 24, 2022

Have a closer look. The 1024 limit in version 1.2 is only for implicit block mapping keys, not for flow style `{"foo": "bar"}`

peterburkimsher · on Jan 24, 2022

In the beginning was the SGML.

Then we said it's too verbose. We named some subsets XML, HTML, XLSX.

Then we said it's still too long. So we named some subsets Markdown, and YML.

Then we said it's still too long, and made JSON.

What's wrong with subsets? Ambiguity in naming things.

https://martinfowler.com/bliki/TwoHardThings.html

Is JSON the same as YML?

NO.

Norwegian?

https://news.ycombinator.com/item?id=26671136

tannhaeuser · on Jan 24, 2022

> Then we said it's too verbose. We named some subsets XML, HTML, XLSX

If anything, XML as an SGML subset is more verbose than SGML proper; in fact, getting rid of markup declarations to yield canonical markup without omitted/inferred tags, shortforms, etc. was the entire point of XML. Of course, XML suffered as an authoring format due to verbosity, which led to the Cambrian explosion of Wiki languages (MediaWiki, Markdown, etc.).

Also, HTML was conceived as an SGML vocabulary/application [1], and for the most part still is [2] (save for mechanisms to smuggle CSS and JavaScript into HTML without the installed base of browsers displaying these as content at the time, plus HTML5's ad-hoc error recovery).

[1]: http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

[2]: http://sgmljs.net/docs/html5.html

coldtea · on Jan 24, 2022

Well, Markdown and YML and JSON are not subsets of SGML, nobody claims they are, and nobody intended them as such. So there's that.

tannhaeuser · on Jan 24, 2022

While indeed neither markdown, much less JSON syntax, was intended as an SGML app, that doesn't stop SGML from parsing JSON, markdown, and other custom Wiki syntax using SHORTREF [1] ;) In fact, the original markdown language is specified as a mapping to HTML angle-bracket markup [2] (with HTML also an SGML vocabulary), and thus it's quite natural to express that mapping using SGML SHORTREF, even though only a subset can be expressed.

[1]: https://www.balisage.net/Proceedings/vol17/html/Walsh01/Bali...

[2]: https://daringfireball.net/projects/markdown/

dudeinjapan · on Jan 24, 2022

First they came for the angle brackets. And I did not speak out. Because I did not use XML...

peterburkimsher · on Jan 24, 2022

You didn't use XML? But We use XML to read the comments here on this HTML web page.

But I came for the angle brackets. Because I < We, eternally.

slightwinder · on Jan 24, 2022

> Then we said it's still too long. So we named some subsets Markdown, and YML.

> Then we said it's still too long, and made JSON.

JSON is older than markdown and yaml.

peterburkimsher · on Jan 25, 2022

Thank you for correcting history! I'd forgotten >_<

eadmund · on Jan 24, 2022

I think you'll find that in the beginning were M-expressions, but they were evil, and were followed by S-expressions, which were and are and ever will be good.

SGML and its descendants are okay for document markup.

XML for data (as opposed to markup) is either evil or clown-shoes-for-a-hat insane — I can’t figure out which.

JSON is simultaneously under- and over-specified, leading to systems where everything works right up until it doesn't. It shares a lot with C and Unix in this respect.

tlavoie · on Jan 25, 2022

If XML for data is bad, check out XML as a programming language. I think this has cropped up a few times, one that stuck with me was as templating structures in the FutureTense app server, before being acquired by OpenMarket and they switched to JSPs or something.

Lots of <for something> <other stuff> </for> sorts of evil.

irrational · on Jan 24, 2022

note: HTML5 is not a subset of SGML.