Isn't this a misinterpretation of what everyone in the AI safety space is worrie...

dan-robertson · 2026-06-01T21:54:15 1780350855

The super-literal interpretation ideas were much more common in the past, when LLMs didn’t exist. Now we have models that are generally pretty good at picking up on nuance and understanding what you mean, but also often quite bad at execution, which is roughly the opposite of that idea. I think reward hacking is perhaps the closest we see LLMs get to literal/malicious interpretations of instructions.

wizzwizz4 · 2026-06-01T23:20:18 1780356018

LLMs are neither of those. They're quite good at pretending they understand what you mean, but they don't. That's why they can't execute: they're mimicking the form, not the substance, and then we see the form and anthropomorphise them in our minds.

Timwi · 2026-06-01T23:47:04 1780357624

That's a lot of assertions with no real argument to back them up.

customguy · 2026-06-02T05:23:15 1780377795

Any one of those "hey, can you count to 100 for me?" type shorts should be enough.

dmos62 · 2026-06-01T19:38:30 1780342710

It very well could be; I don't really follow those discussions. Honestly, if I were worried about something on Earth intellectually evolving at a suboptimal pace, it would be humans.