Yeah; expected limits are also fantastically useful in performance engineering. It’s very common that your code needs to handle arbitrarily sized input, but 99% of the time the input will be bounded (or generally simpler). Special-casing the common code path can make a lot of code run much faster.
For example, in some code I’m writing at the moment I have lists of integers all over the place. I call them lists - usually they only have 1 element. Sometimes they have 2 elements (10%) and very occasionally more than 2 elements or they’re empty (<1%).
If I used a language like JavaScript, I’d use Arrays. But arrays are quite expensive performance-wise - they need to be allocated and tracked by the GC, and the array contents are stored indirectly.
Instead, I’m using an array type which stores up to 2 items inline in the container object (or the stack) without allocating. It only allocates memory on the heap when there are 3 or more items. This decreases allocations by 2 orders of magnitude, which makes a really big difference for performance in my library. And my code is just as readable.
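A rough sketch of the idea, using only the standard library. This is not how the smallvec crate is actually implemented, and `SmallList` and its methods are made-up names for illustration. It just shows the shape: up to 2 items live inline, and the heap is only touched when a third item shows up.

```rust
/// A list of integers that keeps up to two items inline and only
/// allocates on the heap once a third item is pushed.
/// (Hypothetical type for illustration, not the smallvec API.)
#[derive(Debug, Clone)]
pub enum SmallList {
    /// `len` items are valid in `buf`; nothing is heap allocated.
    Inline { len: u8, buf: [u32; 2] },
    /// The list outgrew the inline buffer and "spilled" to the heap.
    Heap(Vec<u32>),
}

impl SmallList {
    pub fn new() -> Self {
        SmallList::Inline { len: 0, buf: [0; 2] }
    }

    /// True once the contents have moved to a heap allocation.
    pub fn spilled(&self) -> bool {
        matches!(self, SmallList::Heap(_))
    }

    pub fn push(&mut self, value: u32) {
        match self {
            // Common case: there is still room inline, so no allocation.
            SmallList::Inline { len, buf } if (*len as usize) < buf.len() => {
                buf[*len as usize] = value;
                *len += 1;
            }
            // Rare case: the inline buffer is full, so copy it to the heap.
            SmallList::Inline { len, buf } => {
                let mut v = Vec::with_capacity(buf.len() * 2);
                v.extend_from_slice(&buf[..*len as usize]);
                v.push(value);
                *self = SmallList::Heap(v);
            }
            SmallList::Heap(v) => v.push(value),
        }
    }

    /// Every read has to check which representation is in use.
    pub fn as_slice(&self) -> &[u32] {
        match self {
            SmallList::Inline { len, buf } => &buf[..*len as usize],
            SmallList::Heap(v) => v,
        }
    }
}
```

Callers only ever see `push` and `as_slice`, which is why the calling code stays just as readable as it would be with a plain vector.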
I’m using the smallvec crate. There are plenty of libraries in C and Rust that do this sort of thing for arrays and strings. Swift (like Objective-C before it) builds small string optimizations into the standard library. I think that’s a great idea.
While processing, yes - an arena allocator is a better fit. But my data is loaded from disk & held in memory while it's manipulated by consumers of my API. Given the lifetime is determined by the caller, there's no obvious arena to allocate from.
I could put the whole thing into a long lived arena - but unless I'm careful, some operations would leak memory.
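To make the leak concrete, here is a minimal sketch of one way it could happen. It assumes a very simple bump-style arena backed by a single `Vec<u32>`; `ListArena` and `ListRef` are hypothetical names, and a real arena allocator would look different. Growing a list can't happen in place because other lists may sit right after it, so the list gets copied to the end and the old range is never reclaimed.

```rust
/// Handle to a list stored in the arena: a range into the shared buffer.
/// (Hypothetical type for illustration.)
#[derive(Debug, Clone, Copy)]
pub struct ListRef {
    start: usize,
    len: usize,
}

/// A toy long-lived arena: one growable buffer that is only ever appended to.
#[derive(Debug, Default)]
pub struct ListArena {
    items: Vec<u32>,
}

impl ListArena {
    /// Store a new list at the end of the buffer.
    pub fn alloc(&mut self, values: &[u32]) -> ListRef {
        let start = self.items.len();
        self.items.extend_from_slice(values);
        ListRef { start, len: values.len() }
    }

    pub fn get(&self, list: ListRef) -> &[u32] {
        &self.items[list.start..list.start + list.len]
    }

    /// Append to a list by copying it to the end of the buffer.
    /// The old range stays allocated but unreachable: that is the leak.
    pub fn push(&mut self, list: ListRef, value: u32) -> ListRef {
        let start = self.items.len();
        self.items.extend_from_within(list.start..list.start + list.len);
        self.items.push(value);
        ListRef { start, len: list.len + 1 }
    }

    /// Total slots consumed, live or dead.
    pub fn used(&self) -> usize {
        self.items.len()
    }
}
```

In this sketch, a list that grows from 1 to 3 items one push at a time ends up consuming 6 slots for 3 live values, and that space only comes back when the whole arena is dropped.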
But it would definitely be better from a performance standpoint. Using smallvec, every time these values are read or written the code needs to check if the value is "spilled" or not. And I think there's a lot of code monomorphization involved too - using a vec in an arena would probably make my binary a fair bit smaller.
Even with an arena allocator, the indirection is likely to increase cache misses, especially when the array element type is something small like an integer.
It’s possible to implement something like smallvec without the branch by having it always contain a pointer field, which points to either the inline storage or a heap allocation. However, this means it can’t be moved in memory (it has to be pinned), and it also means you can’t reuse the pointer field as part of the inline storage in the inline case.
I'd love to see some real numbers showing how these different decisions impact performance and code size. I suspect the branch cost is pretty minimal because so few of my smallvecs get spilled - so the branch predictor probably does a pretty good job at this.
And there's often fiercely diminishing returns from optimizing allocations. Dropping the number of allocations from 1M to 1k made a massive performance difference. Dropping it from 1k to 1 will probably be under the benchmark noise floor.
I don't think that goes against the rule. Your code doesn't impose an arbitrary limit on the data, it just internally represents it differently based on size.
https://crates.io/crates/smallvec