The thing you're competing against in terms of performance and efficiency is: 1....

nhaehnle · on Oct 30, 2014

It is my understanding that the JSON encoding of objects is not specified canonically because the order of keys can be arbitrary. That is, JSON.stringify() is free to return different strings for the same JSON object.

Is my understanding correct? If it is, that would totally break your suggestion.

drostie · on Oct 30, 2014

You are correct, but if you control the environment enough to be doing this at all, then you control the environment enough to get the serialization consistent: either by having ordered dictionaries in the language a la Python and PHP, or implementing your own, or by sorting the keys lexicographically a la bencode.

That is, `diff(sorted_json_stringify(json_parse(p)))` will be adequate if `diff(p)` is not.

kansface · on Oct 30, 2014

Your proposed algorithm is a particularly poor solution in terms of correctness if nothing else. For text, DMP performs quite well- patches can typically be applied in arbitrary order with the same result. For complex data formats, you very quickly arrive at invalid (JSON).

azinman2 · on Oct 30, 2014

This is about sending a diff over the network without having to either have the full json document on hand, or worrying about other concurrent changes that might wreck the diff-match-patch. It's also json-aware versus operating on pure text which might lead to invalid json documents.

DannyBee · on Oct 30, 2014

1. Errr, to apply any delta, including these, you need the document on hand. If your goal is to store deltas, this is no better than anything else.

2. This is in no way better at handling concurrent changes than anything other diff format. In fact, the extra operations it has do nothing to help you here (it is as good and bad as copy/add, or insert/delete diffs). I'd love to know why you seem to think otherwise.

JSON awareness only helps when trying to apply partial patches. At that point, the only thing it can possibly buy you is syntactic correctness, which is worthless without semantic correctness (and in fact, sometimes worse)

function_seven · on Oct 30, 2014

> Errr, to apply any delta, including these, you need the document on hand. If your goal is to store deltas, this is no better than anything else.

Doesn't this allow the sender to send updates without knowledge of the existing values, though? With a string diff, the existing value would have to be known.

> This is in no way better at handling concurrent changes than anything other diff format. In fact, the extra operations it has do nothing to help you here (it is as good and bad as copy/add, or insert/delete diffs). I'd love to know why you seem to think otherwise.

If you're sending a regular diff, you would clobber changes to other parts of the object if they were updated between your initial retrieval and subsequent change. But yeah, it doesn't seem to help with applying concurrent changes to the same part of an object, though I don't know if the patch format should shoulder that responsibility.

nacs · on Oct 30, 2014

> Doesn't this allow the sender to send updates without knowledge of the existing values, though?

This system requires knowledge of existing values/nodes to create the diff/patch document.

However for JSON data, it does have the advantage of not clobbering data unlike a string diff as you mentioned.

azinman2 · on Oct 30, 2014

You can know keys/paths but nothing else. Or you might have user-specific parts that won't be concurrently modified, but the base JSON doc might be modified by others. You'd have to have the latest version to do a text-based patch.

A text-based diff doesn't really gain much for what it loses in use cases.

function_seven · on Oct 30, 2014

I don't see why you need to know values. The operations don't require the sender to supply them.

kolev · on Oct 30, 2014

I think you're missing the point. If I want to patch just one node and don't care what the rest of the document is, I cannot just use a text diff.

DannyBee · on Oct 30, 2014

Can you explain why?

This seems like a trivial application of any copy/add delta format, which are easily expressed in text diff format.

jackhammer · on Oct 30, 2014

A simple example: if the order of the keys is different, say because of a different json implementation, semantically the document is identical, but you can't apply the patch anymore.

DonHopkins · on Oct 30, 2014

That's why you use a canonicalizing JSON serializer, which you'd also want to use if you were tracking changes to JSON in git, for example.

hmbg · on Oct 30, 2014

The OP was talking about the current incumbent, and the solution proposed does not work for web applications, since the JSON.stringify in most browsers do not sort keys. Sure, you could implement your own javascript version, but it will be at least a couple of orders of magnitude slower.

azinman2 · on Oct 30, 2014

That's very limiting. So I'm suppose to use the same Json library on various platforms for client as well as server? And what have you really gained?

DannyBee · on Oct 30, 2014

There are plenty of textual tree delta formats too.

verroq · on Oct 30, 2014

Because you may not want to serialise your data. Because the order of a json dictionary key is undefined.

kolev · on Oct 30, 2014

Text diff is based on surrounding lines, specifically, siblings, parents, and children in this case. In JSON Patch, only the parents matter.

DannyBee · on Oct 30, 2014

So then it's a bad implementation of a textual tree diff algorithm .... which again, is a standard copy/add delta format (with a small amount of metadata).

chj · on Oct 30, 2014

JSON data is not always stored as texts.

disordinary · on Oct 30, 2014

How else would you store it? Javascript Object Notation is text. Other formats such as BSON are different.

Or are you simply talking about javascript style object literals?

chj · on Oct 30, 2014

JSON is external representation of objects. You can store the actual data on server in any form you like. Doesn't mean you have to store them in text.

disordinary · on Oct 30, 2014

But its Javascript Object Notation - Notation in this case implies text - if your storing it in something other than text it's not JSON.

From Wikipedia: JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs.

chj · on Oct 31, 2014

Like you pointed out JSON is used to transmit data objects, but storage of data objects on the server or local file is entirely different story. I hope I don't have to explain further.

disordinary · on Nov 2, 2014

Yes, then its no longer json...