Saying these models are at GPT-4 level is setting anyone who doesn't place special value on the local aspect up for disappointment.
Some people do place value on running locally, and I'm not against them for it, but realistically no 70B-class model has the general knowledge or grasp of nuance of any recent GPT-4 checkpoint.
That being said, these models are still very strong compared to what we had a year ago, and capable of useful work.
I have to disagree. I understand it's very expensive, but it's still a consumer product available to anyone with a credit card.
The comparison is between something you can buy off the shelf, like a powerful Mac, vs something powered by an Nvidia Grace Hopper superchip, which would require both lots of money and a business relationship.
Honestly, people pay $4k for nice TVs, refrigerators and even couches, and those are not professional tools by any stretch. If LLMs needed a $50k Mac Pro with maxed out everything, that might be different. But anything that's a laptop is definitely regular consumer hardware.
There have definitely been plenty of sources of hardware capable of running LLMs out there for a while, Mac or not. A couple of 4090s or P40s will run Llama 3.1 70B. Or, since price isn't a limit, there are other easier and more powerful options like a [tinybox](https://tinygrad.org/#tinybox).
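For anyone wondering why a pair of 24 GB cards is enough: a rough sketch of the memory math, assuming footprint ≈ parameters × bytes per weight plus ~15% overhead for KV cache and activations (both figures are ballpark assumptions, not measurements):

```python
# Back-of-envelope VRAM estimate for a quantized LLM.
# Assumption: memory ≈ params × bytes-per-weight, plus ~15% overhead
# for KV cache and activations. Numbers are rough, not exact.

def approx_vram_gb(params_b: float, bits_per_weight: float,
                   overhead: float = 0.15) -> float:
    """Approximate memory footprint in GB for params_b billion parameters."""
    weight_gb = params_b * bits_per_weight / 8  # billions of params x bytes each
    return weight_gb * (1 + overhead)

# Common precisions: fp16, 8-bit, and a ~4.5-bit quant (e.g. a 4-bit
# scheme with scaling metadata).
for label, bits in [("fp16", 16), ("8-bit", 8), ("~4.5-bit", 4.5)]:
    print(f"70B @ {label}: ~{approx_vram_gb(70, bits):.0f} GB")
```

At roughly 4.5 bits per weight a 70B model comes out around 45 GB, which is why it squeezes across two 24 GB 4090s, while fp16 (~160 GB) is datacenter territory.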
Yeah, a computer which starts at $3900 is really stretching that classification. Plus if you're that serious about local LLMs then you'd probably want the even bigger RAM option, which adds another $800...
An optioned-up minivan is also expensive but doesn't cost as much as a firetruck. It's expensive but still very much consumer hardware. A 3x4090 rig is more expensive and still consumer hardware. An H100 is not; you can buy like 7 of these optioned-up MBPs for a single H100.
In my experience, people use the term in two separate ways.
If I'm running a software business selling software that runs on 'consumer hardware' the more people can run my software, the more people can pay me. For me, the term means the hardware used by a typical-ish consumer. I'll check the Steam hardware survey, find the 75th-percentile gamer has 8 cores, 32GB RAM, 12GB VRAM - and I'd better make sure my software works on a machine like that.
On the other hand, 'consumer hardware' could also be used to simply mean hardware available off-the-shelf from retailers who sell to consumers. By this definition, 128GB of RAM is 'consumer hardware' even if it only counts as 0.5% in Steam's hardware survey.
On the Steam Hardware Survey the average gamer uses a computer with a 1080p display too. That doesn't somehow make any gaming laptop with a 2k screen sold in the last half decade a non-consumer product. For that matter, the average gaming PC on Steam is itself above average relative to computers in general: the typical office computer or school Chromebook is likely several generations old and doesn't have an NPU or discrete GPU at all.
For AI and LLMs, I'm not aware of any company even selling model weights directly to consumers. They're either completely unavailable (OpenAI) or freely licensed, so the companies training them aren't really dependent on what the average person has for their commercial success.
The Qwen2 models that run on my MacBook Pro are GPT-4 level too.