This is awesome. The new streaming stuff is particularly interesting to me, and it's very impressive that you managed to implement it with no protocol changes. I have a couple questions. Please forgive any misconceptions as I've never used capnproto myself, since all the streaming I've done has to work in the browser, and as far as I know capnproto doesn't work over WebSocket or WebRTC transports. But I've long been impressed with and inspired by capnproto and sandstorm.
Main question: is there a reason you opted for traditional window flow control a la TCP, as opposed to "request-N" style like in reactive streams[0] (see rsocket[1] for a great implementation)?
So with request-N, a server->client stream would look something like this:
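(The body of the schema snippet didn't survive formatting; the sketch below is a reconstruction, with member names modeled on reactive-streams conventions — they're illustrative, not the exact original.)

```capnp
interface MyInterface {
  # Client subscribes by handing the server a subscriber capability,
  # and gets back a subscription used to grant send credit.
  subscribe @0 (subscriber :Subscriber) -> (subscription :Subscription);
}

interface Subscriber {
  onNext @0 (data :Data);
  onComplete @1 ();
}

interface Subscription {
  request @0 (n :UInt64);  # "you may send me n more messages"
  cancel @1 ();
}
```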
And the server will only ever send as much data as has been requested by the client via `request` calls. This results in really elegant flow control that takes into account both the network and the client's capacity to consume, without the necessity of tracking windows. The receiver simply calls request(1) each time it processes a message. If you want a buffer you can just start with an assumed N=10, 100, etc.
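To make the pull model concrete, here's a minimal self-contained sketch of credit-based request-N flow control (hypothetical names, not the omnistreams or rsocket API):

```typescript
// The sender only emits while it holds credit granted by the receiver.
class Sender {
  private credit = 0;
  private queue: number[] = [];

  constructor(private onData: (msg: number) => void) {}

  // Producer side: enqueue a message; delivered only when credit allows.
  push(msg: number): void {
    this.queue.push(msg);
    this.flush();
  }

  // Receiver side: "you may send me n more messages."
  request(n: number): void {
    this.credit += n;
    this.flush();
  }

  private flush(): void {
    while (this.credit > 0 && this.queue.length > 0) {
      this.credit--;
      this.onData(this.queue.shift()!);
    }
  }
}

// The receiver starts with an assumed window (N=3 here), then
// re-requests one credit per message processed, as described above.
const received: number[] = [];
const sender = new Sender((msg) => {
  received.push(msg);
  sender.request(1); // ready for the next message
});
for (let i = 0; i < 5; i++) sender.push(i); // queued; no credit yet
sender.request(3); // initial window
```

Nothing is delivered until the initial `request(3)`; after that, the per-message re-request keeps the stream flowing exactly as fast as the receiver consumes it.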
I've found this worked really well when implementing omnistreams[2], which is basically a very thin streaming/multiplexing layer for WebSockets, since WS doesn't have any flow control. (fibridge[3] is a good example of it in action). I started with a window-style but once I learned about reactive streams the request model was much easier to reason about for me.
Hmm, to me, what you describe still sounds window-based, it's just that the receiver chooses the window size. The question then is: how does the receiver decide on a good size? If it chooses a window that is too small, it won't fully utilize the available bandwidth. If it chooses one too big, it'll create queuing delay.
This is a very hard question to answer and many academic papers have been written on the subject. But the strategies I thought about seemed easy enough to compute on the sender side, and the sender is the one that ultimately needs to know the window size in order to decide when to send more data.
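For example (purely illustrative — this is one simple strategy, not necessarily the heuristic Cap'n Proto actually uses), a sender could size the window from a bandwidth-delay-product estimate:

```typescript
// Hypothetical sender-side window sizing: aim for roughly
// (observed throughput) x (target queuing delay) bytes in flight,
// with a floor so small transfers still pipeline.
function windowBytes(
  bytesAckedPerSecond: number,
  targetDelaySeconds: number,
  minWindow = 65536,
): number {
  const bdp = bytesAckedPerSecond * targetDelaySeconds;
  return Math.max(minWindow, Math.ceil(bdp));
}
```

E.g. a connection acking 10 MB/s with a 100 ms delay target would get a ~1 MB window; slower links fall back to the floor.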
But I can totally imagine that there are applications where the receiver knows better how much data it wants to request at a time. You can, of course, use a pattern like you suggest to accomplish that, without any help from the RPC system.
Regarding WebSockets, you could totally make Cap'n Proto RPC run over WebSocket. It wouldn't even be much work to hook up the C++ RPC implementation to KJ's HTTP library which supports WebSocket. The harder problem is that there isn't currently a JavaScript implementation of capnp RPC... :/
> If it chooses a window that is too small, it won't fully utilize the available bandwidth. If it chooses one too big, it'll create queuing delay.
Yeah, that's a valid concern, and one I've run into in practice.
It's true that in environments where the server has access to TCP socket information, traditional windowing will have a performance advantage. You may even be able to detect how saturated the interface is from other processes.
As I see it, the main advantage of the pull-based backpressure I described is the simpler mental model, which makes it easier to reason about and implement. So in environments where the sender has limited system information (i.e. WebSockets, which expose basically nothing about how full the buffers are), you don't pay an extra complexity cost for no benefit.
Hmm, but if the puller doesn't actually know what value of `n` is ideal, then what benefit is there to a pull-based model vs. having the pusher choose an arbitrary `n`?
The network isn't the only resource in play. The puller is hypothetically more aware of the size of its buffers, processing capacity, internet connection speed, etc. But again, to me the primary advantage is the mental model. For omnistreams the implementation ended up being almost the same as the ACK-based system I started with, but shifting the names around and inverting the model in my head made it much easier to work with.
FWIW, Cap'n Proto's approach provides application-level backpressure as well. The application returns from the RPC only when it's done processing the message (or, more precisely, when it's ready for the next message). The window is computed based on application-level replies, not on socket buffer availability.
My experience was that in practice, most streaming apps I'd seen were doing this already (returning when they wanted the next message), so turning that into the basis for built-in flow control made a lot of sense. E.g. I can actually go back and convert Sandstorm to use streaming without actually introducing any backwards-incompatible protocol changes.
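That pattern — the application-level reply to each streaming call doubling as the flow-control signal — can be sketched roughly like this (a simplified, hypothetical helper, not the actual Cap'n Proto API):

```typescript
// The sender keeps at most `window` calls in flight; each call's
// promise resolves when the receiving application returns from its
// handler (i.e. is ready for the next message), freeing a slot.
async function streamWithWindow<T>(
  items: T[],
  send: (item: T) => Promise<void>, // resolves on application-level reply
  window: number,
): Promise<void> {
  const inFlight = new Set<Promise<void>>();
  for (const item of items) {
    const p = send(item).finally(() => inFlight.delete(p));
    inFlight.add(p);
    if (inFlight.size >= window) {
      await Promise.race(inFlight); // block until some handler returns
    }
  }
  await Promise.all(inFlight);
}
```

The window is thus driven entirely by application-level replies: a slow consumer that holds its handler open automatically throttles the sender, with no socket-buffer introspection involved.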
Ah I think I misread the announcement to mean you were using the OS buffer level information. But if I understand correctly you're just using the buffer size as a heuristic for the window size, then doing all the logic at the application level?
If that's the case, then implementation-wise these approaches are probably very similar; window/ACK is the normal way of doing this, and also the pragmatic approach in your case.
Probably not without some work, but I think it'd be possible to get there, and I've definitely been considering that as a way forward. I worry that getting the code footprint down to the point of being reasonable for a web app might be tricky but we'll see.
[0]: https://github.com/reactive-streams/reactive-streams-jvm
[1]: https://github.com/rsocket/rsocket
[2]: https://github.com/omnistreams/omnistreams-spec
[3]: http://iobio.io/2019/06/12/introducing-fibridge/