
> Things may have changed since, but AFAIK the C++ implementation would always allocate on the heap for nested messages

This is no longer the case if you use arenas: https://developers.google.com/protocol-buffers/docs/referenc...
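To make the arena point concrete, here is a minimal sketch of the general idea, not protobuf's actual Arena implementation or API. `SketchArena`, `Outer`, and `Inner` are hypothetical names invented for illustration. The arena makes one heap allocation up front, and each nested object is then constructed inside that block by bumping an offset.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical bump arena (an assumption for illustration; the real
// protobuf Arena is considerably more elaborate).
class SketchArena {
 public:
  explicit SketchArena(std::size_t capacity)
      : block_(new unsigned char[capacity]), capacity_(capacity) {
    assert(capacity > 0);
  }

  // Constructs a T inside the arena block, respecting T's alignment.
  // Returns nullptr when the remaining space is too small.
  template <typename T, typename... Args>
  T* Create(Args&&... args) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "this sketch never runs destructors");
    void* p = block_.get() + used_;
    std::size_t space = capacity_ - used_;
    if (std::align(alignof(T), sizeof(T), p, space) == nullptr) {
      return nullptr;
    }
    used_ = static_cast<std::size_t>(
                static_cast<unsigned char*>(p) - block_.get()) +
            sizeof(T);
    ++objects_;
    return new (p) T(std::forward<Args>(args)...);
  }

  std::size_t objects() const { return objects_; }
  std::size_t used() const { return used_; }

 private:
  std::unique_ptr<unsigned char[]> block_;  // the only heap allocation
  std::size_t capacity_;
  std::size_t used_ = 0;
  std::size_t objects_ = 0;
};

// Hypothetical stand-ins for a message and its nested message.
struct Inner {
  int value = 0;
};
struct Outer {
  Inner* inner = nullptr;  // nested message held by pointer
};
```

With this, `arena.Create<Outer>()` followed by `arena.Create<Inner>()` places both objects in the same block, so the nested message costs a pointer bump rather than a separate trip to the heap allocator.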

> and perhaps even for optional scalars

This has never been the case, except for string fields where std::string forces us to allocate.

Ideally we will eventually use std::string_view for string accessors instead of std::string, so that even string data can be allocated on an arena instead of the heap.
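A rough sketch of why a std::string_view accessor helps, assuming C++17. `ByteArena` and `PersonSketch` are hypothetical names and this is not protobuf's generated code. Because the accessor returns a view rather than a `const std::string&`, the field's bytes can live in arena memory and no std::string object has to exist at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <string_view>

// Hypothetical byte arena: one heap block, handed out by bumping an offset.
class ByteArena {
 public:
  explicit ByteArena(std::size_t capacity)
      : block_(new char[capacity]), capacity_(capacity) {}

  // Returns nullptr when fewer than n bytes remain.
  char* Allocate(std::size_t n) {
    if (n > capacity_ - used_) return nullptr;
    char* p = block_.get() + used_;
    used_ += n;
    return p;
  }

  // True if p points into this arena's block.
  bool Owns(const char* p) const {
    std::less<const char*> before;
    return !before(p, block_.get()) && before(p, block_.get() + capacity_);
  }

 private:
  std::unique_ptr<char[]> block_;
  std::size_t capacity_;
  std::size_t used_ = 0;
};

// Hypothetical message with a single string field.
class PersonSketch {
 public:
  explicit PersonSketch(ByteArena* arena) : arena_(arena) {
    assert(arena_ != nullptr);
  }

  // Copies the bytes into the arena. Returns false (leaving the old value
  // in place) if the arena is full.
  bool set_name(std::string_view value) {
    char* buf = arena_->Allocate(value.size());
    if (buf == nullptr) return false;
    if (!value.empty()) std::memcpy(buf, value.data(), value.size());
    name_ = std::string_view(buf, value.size());
    return true;
  }

  // The accessor exposes a view, so the storage need not be a std::string.
  std::string_view name() const { return name_; }

 private:
  ByteArena* arena_;
  std::string_view name_;
};
```

The trade-off in this sketch is that the view is only valid while the arena is alive, and overwritten values are not reclaimed until the whole arena is freed.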



It was never really the case. It's just that before arenas, it depended on your underlying heap allocator to do all the hard work.

Arenas have been around for a long while now though...


> This has never been the case, except for string fields where std::string forces us to allocate.

I was quite surprised you didn't offer your own string_view implementation (or something similar) the last time I looked at protobuf. I'd naively assume that inside Google this could be quite a low-effort, high-reward optimization.


> I was quite surprised you didn't offer your own string_view implementation (or something similar) the last time I looked at protobuf.

We sort of do actually: https://github.com/protocolbuffers/protobuf/blob/master/src/...

The internal version of protobuf lets you switch individual string fields to string_view using [ctype=STRING_PIECE], but migrating the default away from std::string is mainly just an enormous migration challenge.

Internally we also do something slightly nuts: we break the encapsulation of std::string so that we can point it to arena-allocated memory (we then "steal" the memory back before the destructor runs). We can only afford to do this internally, where the implementation of std::string is known. The real long-term solution is to move to string_view.



