They can do fun and interesting stuff, but we keep hearing how they’re going to replace human workers, and too many people in positions of power not only believe LLMs are capable of this but are actively taking steps to replace people with them.
But while they are fun to play with, LLMs are very bad at anything that requires a real answer that can’t be directly and immediately checked - customer support, scientific research, teaching, legal advice, identifying humans, correctly summarizing text. They make up answers, mix contexts inappropriately, and more.
I’m not sure how you can have played with LLMs so much and missed this. I hope you don’t trust what they say about recipes, how to handle legal problems, how to clean things, or how to treat disease - or rely on them for any fact-checking whatsoever.
>I’m not sure how you can have played with LLMs so much and missed this. I hope you don’t trust what they say about recipes, how to handle legal problems, how to clean things, or how to treat disease - or rely on them for any fact-checking whatsoever.
This is like a GPT-3.5-level criticism. o1-pro is probably better at pure fact retrieval than most PhDs in any given field. I challenge you to try it.