There are several short-comings to the above approach though, specifically, you have to explicitly trade off performance vs. efficiency depending on the JSON you're dealing with.
JSON Patch is going to give you a consistent performance vs. efficiency curve which is desirable in a lot of circumstances.
It is my understanding that the JSON encoding of objects is not specified canonically because the order of keys can be arbitrary. That is, JSON.stringify() is free to return different strings for the same JSON object.
Is my understanding correct? If it is, that would totally break your suggestion.
You are correct, but if you control the environment enough to be doing this at all, then you control the environment enough to get the serialization consistent: either by having ordered dictionaries in the language a la Python and PHP, or implementing your own, or by sorting the keys lexicographically a la bencode.
That is, `diff(sorted_json_stringify(json_parse(p)))` will be adequate if `diff(p)` is not.
Your proposed algorithm is a particularly poor solution in terms of correctness if nothing else. For text, DMP performs quite well- patches can typically be applied in arbitrary order with the same result. For complex data formats, you very quickly arrive at invalid (JSON).
This is about sending a diff over the network without having to either have the full json document on hand, or worrying about other concurrent changes that might wreck the diff-match-patch. It's also json-aware versus operating on pure text which might lead to invalid json documents.
1. Errr, to apply any delta, including these, you need the document on hand.
If your goal is to store deltas, this is no better than anything else.
2. This is in no way better at handling concurrent changes than anything other diff format. In fact, the extra operations it has do nothing to help you here (it is as good and bad as copy/add, or insert/delete diffs).
I'd love to know why you seem to think otherwise.
JSON awareness only helps when trying to apply partial patches. At that point, the only thing it can possibly buy you is syntactic correctness, which is worthless without semantic correctness (and in fact, sometimes worse)
> Errr, to apply any delta, including these, you need the document on hand. If your goal is to store deltas, this is no better than anything else.
Doesn't this allow the sender to send updates without knowledge of the existing values, though? With a string diff, the existing value would have to be known.
> This is in no way better at handling concurrent changes than anything other diff format. In fact, the extra operations it has do nothing to help you here (it is as good and bad as copy/add, or insert/delete diffs). I'd love to know why you seem to think otherwise.
If you're sending a regular diff, you would clobber changes to other parts of the object if they were updated between your initial retrieval and subsequent change. But yeah, it doesn't seem to help with applying concurrent changes to the same part of an object, though I don't know if the patch format should shoulder that responsibility.
You can know keys/paths but nothing else. Or you might have user-specific parts that won't be concurrently modified, but the base JSON doc might be modified by others. You'd have to have the latest version to do a text-based patch.
A text-based diff doesn't really gain much for what it loses in use cases.
A simple example: if the order of the keys is different, say because of a different json implementation, semantically the document is identical, but you can't apply the patch anymore.
The OP was talking about the current incumbent, and the solution proposed does not work for web applications, since the JSON.stringify in most browsers do not sort keys. Sure, you could implement your own javascript version, but it will be at least a couple of orders of magnitude slower.
So then it's a bad implementation of a textual tree diff algorithm .... which again, is a standard copy/add delta format (with a small amount of metadata).
But its Javascript Object Notation - Notation in this case implies text - if your storing it in something other than text it's not JSON.
From Wikipedia: JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs.
Like you pointed out JSON is used to transmit data objects, but storage of data objects on the server or local file is entirely different story. I hope I don't have to explain further.
1. JSON.stringify()
2. Diff-match-patch: https://code.google.com/p/google-diff-match-patch/
3. JSON.parse()
There are several short-comings to the above approach though, specifically, you have to explicitly trade off performance vs. efficiency depending on the JSON you're dealing with.
JSON Patch is going to give you a consistent performance vs. efficiency curve which is desirable in a lot of circumstances.