> It has been many years since I shipped a memory bug in C++. It is just not a real worry for me.
Can you write down the algorithm that you use to avoid writing memory bugs? Can you teach others how to do it? Experienced C++ programmers do seem to learn how to avoid those bugs (although very often what they write is still undefined according to the standard - but e.g. multithreading bugs may be rare enough not to be encountered in practice). But that's of limited use as long as it's impossible for anyone else to look at a C++ codebase and confirm, at a glance, that that codebase does not contain memory bugs.
> C++ is (still) quite a substantially more expressive language than Rust, which is to say it can capture a lot more semantics in a library.
> So it's great that Rust makes some errors harder to make, but that is no grounds for acting holier-than-thou. Rust programmers have simply chosen to have many more of the other kinds of errors, instead.
Citation needed. What desirable constructions are impossible to express in Rust? I've no doubt that you can write some super-"clever" C++ that reuses the same pointer several different ways and can't be ported to Rust - but such code is not desirable in C++ either (at least not in codebases that more than one person is expected to use). Meanwhile Rust offers a lot of opportunities for libraries to express themselves clearly in a way that's not possible in C++: sum types let you express a very common return pattern much more clearly than you can ever do in C++. Being able to return functions makes libraries much more expressive. Standardised ownership annotations make correct library use very clear, and allow a compiler to automatically check that they're used correctly.
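To make the sum-type point concrete, here is a minimal sketch (the service names and error variants are invented for illustration): the caller must handle every arm, and the compiler rejects a `match` that forgets one, which is the guarantee C++ can only approximate with `std::optional`/`std::variant` conventions.

```rust
// Hypothetical fallible lookup returning a sum type. The error cases
// are part of the return type itself, not a convention in the docs.
#[derive(Debug, PartialEq)]
enum LookupError {
    NotFound,
    Ambiguous(usize), // number of candidate matches
}

fn find_port(service: &str) -> Result<u16, LookupError> {
    match service {
        "http" => Ok(80),
        "https" => Ok(443),
        "mail" => Err(LookupError::Ambiguous(2)), // smtp? imap?
        _ => Err(LookupError::NotFound),
    }
}

fn main() {
    // Omitting any of these arms is a compile error, so "forgot the
    // error path" bugs cannot survive compilation.
    match find_port("http") {
        Ok(port) => println!("port {}", port),
        Err(LookupError::NotFound) => println!("unknown service"),
        Err(LookupError::Ambiguous(n)) => println!("{} candidates", n),
    }
}
```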
> Every programmer who switches from C to Rust makes a better world; likewise Java to Rust, or C# to Rust, or Go to Rust. Or, any of those to C++.
> Switching from C++ to Rust, or Rust to C++, is of overwhelmingly less consequence, but the balance is still in C++'s favor because C++ still supports more powerful libraries.
> You might disagree, but it is far from obvious that you are correct.
On the contrary, it's obvious from the frequency with which we see crashes and security flaws in C++ codebases that the average programmer who switches from Java to C++, or C# to C++ makes the world a worse place. It's overwhelmingly likely to be true for Rust to C++ as well.
>Can you write down the algorithm that you use to avoid writing memory bugs? Can you teach others how to do it?
Yes. Code using powerful libraries. Every use of a powerful library eliminates any number of every kind of bug.
Rust has not caught up to C++'s ability to code powerful libraries, and might never. C++ is a moving target. C++20 is more powerful than C++17, which was more powerful than 14, 11, 03.
There are certainly niches for less powerful languages. Rust is more powerful, and nicer to code in, than many that occupy those. It will completely displace Ada, for example.
If I find a Rust program that is (perforce) not using powerful libraries, can I be confident that it does not harbor grave errors?
Certainly not. Rust takes aim at memory errors, and misses the rest that would be avoided by encapsulating bug-prone code in libraries. C++ enables capturing bug-prone code in well-tested libraries, eliminating whole families of bugs, including, in my recent experience, memory bugs.
That is not to say all C++ code is bug-free. Google and Mozilla code, by corporate fiat, is forbidden to participate.
> If I find a Rust program that is (perforce) not using powerful libraries, can I be confident that it does not harbor grave errors?
You can be confident that it doesn't harbour memory errors. You can be confident that it doesn't contain arbitrary code execution bugs, which is a much better circumstance than with any C++ project I've seen (C++ by its nature turns almost any bug into a security bug).
IME you can also have a much higher level of confidence that it does what you expect (including not having bugs) than you would for a C++ project, because of Rust's more expressive type system.
> C++ enables capturing bug-prone code in well-tested libraries, eliminating whole families of bugs, including, in my recent experience, memory bugs.
And yet in practice you can neither be confident that there are no memory bugs, nor that there are no other bugs. Even the big name C++ libraries are riddled with major bugs. Perhaps libraries that are written in a certain fashion avoid this bugginess, but that's of little use when it's not possible to tell from a glance whether a given library is one of the buggy ones or not.
Rust programs have bugs. Rust programs have security bugs. Are they mediated by memory usage bugs? Probably not, unless the program has unsafe blocks, or uses libraries with unsafe blocks, or libraries that use libraries that have unsafe blocks, or call out to C libraries. Or tickle a compiler bug.
Can it leak my credentials to a network socket as a consequence of any of those bugs, memory or otherwise?
Putting your memory errors in unsafe blocks may make them invisible to you, but that does not make them go away.
> Can it leak my credentials to a network socket as a consequence of any of those bugs, memory or otherwise?
Sure, that class of bugs still exists. But they're rarer and less damaging (even with stolen credentials, an attacker can't do as much damage as one who had arbitrary code execution).
Rust eliminates many classes of bugs. C++ does not: the fact that theoretically there could be non-buggy C++ libraries doesn't help you out in practice, because there's no way to distinguish those libraries from the very many buggy C++ libraries.
> Putting your memory errors in unsafe blocks may make them invisible to you, but that does not make them go away.
It's just the opposite: it makes the risk very visible, so in Rust you can choose to avoid libraries with unsafe. Whereas in C++ any library you might choose is likely to have memory safety bugs and therefore arbitrary code execution vulnerabilities.
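To illustrate what "visible" means here: any operation the compiler cannot verify must sit inside a delimited `unsafe` block, so the audit surface of a codebase can be enumerated mechanically (e.g. with grep). A small sketch, with an invented function:

```rust
// The raw-pointer read below can only appear inside an `unsafe` block,
// so `grep -rn "unsafe"` bounds exactly where manual review is needed.
fn first_byte(v: &[u8]) -> Option<u8> {
    if v.is_empty() {
        return None;
    }
    // SAFETY: we just checked the slice is non-empty, so reading
    // index 0 through the raw pointer stays in bounds.
    let b = unsafe { *v.as_ptr() };
    Some(b)
}

fn main() {
    assert_eq!(first_byte(&[7, 8, 9]), Some(7));
    assert_eq!(first_byte(&[]), None);
}
```

The convention of a `// SAFETY:` comment stating the invariant is exactly the kind of at-a-glance auditability the C++ side of this thread concedes is missing.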
> Can you write down the algorithm that you use to avoid writing memory bugs? Can you teach others how to do it?
Structure the code in a way such that it is obvious what happens. Use "semantic compression" (e.g. be clear about your concepts and factor them into free-standing functions), but don't overabstract/overengineer.
Eliminate special cases. If the code has few branches and data dependencies, then successful manual testing already gives high confidence that it will be pretty robust in production.
Prefer global allocations (buffers with the same lifetime as the process) over local state. This also makes for much clearer code, since it avoids heavy plumbing / indirection.
I tend to think that modern programming language features mostly enable us to stay longer with bad structure. And when you hit the next road block, fixing that will be correspondingly harder.
> Structure the code in a way such that it is obvious what happens. Use "semantic compression" (e.g. be clear about your concepts and factor them into free-standing functions), but don't overabstract/overengineer.
This sounds little different from "write good code, don't write bad code." I'm sure we all agree on these things, but I'm sure the people who write terrible code weren't trying to be unclear or trying to overengineer.
> Eliminate special cases. If the code has few branches and data dependencies, then successful manual testing already gives high confidence that it will be pretty robust in production.
True enough, but that's so much easier in a language with sum types.
> Prefer global allocations (buffers with the same lifetime as the process) over local state. This also makes for much clearer code, since it avoids heavy plumbing / indirection.
That's a pretty controversial viewpoint, since it makes composition impossible (indeed taken to its logical extreme this would mean never writing a library, whereas the grandparent was convinced that more use of libraries was the way to write good code).
> I tend to think that modern programming language features mostly enable us to stay longer with bad structure. And when you hit the next road block, fixing that will be correspondingly harder.
Interesting; that's the opposite of my experience. I find modern language features mostly guide us down the path that most of us already agreed was good programming style, enforcing things that were previously only rules of thumb (and that we had to resist the temptation to bend when things got tricky). And so the modern language forces you to solve problems properly rather than hacking a workaround, and the further you scale the more that will help you.
>> Eliminate special cases. [...]
> True enough, but that's so much easier in a language with sum types.
These languages make it easier to have more special cases. There's a difference.
> That's a pretty controversial viewpoint, since it makes composition impossible (indeed taken to its logical extreme this would mean never writing a library, whereas the grandparent was convinced that more use of libraries was the way to write good code).
I don't see why that should be the case. Aside from the fact that composition/"reuse" is way overrated, libraries can always opt for process- or thread-wide global state. Another possibility would be to have global state per use (store pointer handles), passing a pointer only to library API calls. The latter is also the most realistic case, since most libraries take pointer handles. I absolutely have these handles stored in process-global data: for example, a Freetype handle, windowing handle, sound card handle, network socket handle, etc.
Also called "singleton" in OOP circles. Singletons are nothing but global data with nondeterminstic initialization order and superfluous syntax crap on top. Other than that, they are indeed good choices (as is global data) since lifetime management and data plumbing is a no-brainer.
> I find modern language features mostly guide us down the path that most of us already agreed was good programming style
But just a paragraph before, you said you didn't agree with mine? In my opinion, OOP, or more specifically lots of isolated allocations connected by pointers/references, makes for hard-to-follow code, since there is so much hiding and indirection even within the same project/maintenance boundaries, without benefit. In any case I absolutely agree that this style is not doable in C. You need automated, static or dynamic (runtime) ownership tracking.
At the most basic level, if project A makes use of library B and library C, then you want to be able to verify the behaviour of library B and library C independently and then make use of your conclusions when analysing project A. But if library B and library C use global state then you can't have any confidence that that will work. E.g. if both library B and library C use some other library D that has some global construct, then they will likely interfere with each other.
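A minimal sketch of that interference, with hypothetical modules named after the libraries above (Rust for concreteness, but the hazard is language-independent): each of B and C behaves correctly when tested alone, yet composing them changes what each observes.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// lib_d keeps one global cursor that both lib_b and lib_c depend on.
mod lib_d {
    use super::{AtomicUsize, Ordering};
    pub static CURSOR: AtomicUsize = AtomicUsize::new(0);
    pub fn advance() -> usize {
        CURSOR.fetch_add(1, Ordering::SeqCst) // returns previous value
    }
}

mod lib_b {
    pub fn next_record() -> usize {
        super::lib_d::advance()
    }
}

mod lib_c {
    pub fn next_frame() -> usize {
        super::lib_d::advance()
    }
}

fn main() {
    // Tested in isolation, each library would see 0, 1, 2, ...
    // Interleaved, each sees gaps caused by the other: conclusions
    // drawn from testing B alone no longer hold inside A.
    let a = lib_b::next_record();
    let b = lib_c::next_frame(); // lib_c's first call, yet not 0
    let c = lib_b::next_record(); // lib_b skipped a value
    println!("{} {} {}", a, b, c);
}
```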
> Another possibility would be to have subproject-wide global state, and passing a pointer only to library API calls. The latter is also the most realistic case since most libraries take pointer handles.
At that point you're not using global state in the library, which was the point.
> you can always opt for process- or thread-wide global state
That doesn't solve the problem at all.
> Also called "singleton" in OOP circles. Singletons are nothing but global data with nondeterminstic initialization order and superfluous syntax crap on top.
Indeed, and they're seen as bad practice for the same reason as global state in general.
> At that point you're not using global state in the library, which was the point.
Yes. But I want to make clear that you are still using global state for all uses within the project itself. The library can be implemented in whatever way. For example, setting the pointer in a global variable on API entry ;-)
> That doesn't solve the problem at all.
WHICH problem? I don't think there is one.
> Indeed, and they're seen as bad practice for the same reason as global state in general.
This is foolish. There is no problem with global state. Global state is a fact of life. Your process has one address space. It has (probably) one server socket for listening to incoming requests. It has (probably) one graphics window to show its state. Whenever you have more of something (e.g. file descriptors, memory mappings, ...), well, then you have a set of that thing, but you have ONE set :-). And so on.
You are not writing a thousand pseudo-isolated programs. But ONE. One entity composed of a fixed number of parts (i.e. modules, code files) that work together to do what must be done.
Why add indirection? Why make it hard to iterate over all open file descriptors? Why thread a window handle through 15 layers of function calls when you have only one graphics window? It adds a lot of boilerplate. It even drives some people to invent hard-to-digest concepts like monads or objects just to make that terrible code manageable. It makes the code unclear. Someone once described it with this analogy: "I don't say 'I'm meeting one of my wives tonight' unless I have more than one."
> Yes. But I want to make clear that you are still using global state for all uses within the project itself.
But if we believe in using libraries then often our project will itself be a library.
> The library can be implemented in whatever way. For example, setting the pointer in a global variable on API entry ;-)
And then you have the problem I mentioned: if there is a diamond dependency on your library then the thing using it will break.
> WHICH problem? I don't think there is one.
The problem of not being able to break down your project and understand it piecemeal.
> Global state is a fact of life. Your process has one address space. It has (probably) one server socket for listening to incoming requests. It has (probably) one graphics window to show its state.
All those global things are a common source of bugs, as different pieces of the program make subtly different assumptions about them. Perhaps a certain amount of global state is unavoidable. That's not an argument against minimizing it.
> You are not writing a thousand pseudo-isolated programs. But ONE. One entity composed of a fixed number of parts (i.e. modules, code files) that work together to do what must be done.
If you write a program that can only be understood in its entirety, you'll be unable to maintain it once it becomes too big to fit in your head. Writing a thousand isolated functions gives you something much easier to understand and scale.
> Do you want to say that my logging routine is more complex because my windowing handle is global data?
If your logging routine touches your windowing handle that certainly makes it more complex. If I'm meant to know that your logging routine doesn't touch your windowing handle, that's precisely the statement that it isn't global data.
You mean start by building something that can be used and tested in isolation, rather than trying to build an enormous system in one go? Isn't that what you've been arguing against?
No I mean solve the problem "we need to build a program that does what it's required to do" (and no more) before trying to build a library that will cure diseases.
Libraries are much harder than applications because they must work for a large number of applications with diverse requirements. They need to be more abstract, and therein lies the danger.
Regarding the size, that's clearly wrong. It depends a lot on the library. A windowing or font rasterization library will be a lot larger than your typical application.
And for libraries that are much smaller than the application itself, why bother depending on them? (An anecdote: I heard the Excel team in the '90s had their own compiler.)