
I tried Mixtral via ollama on my Apple M1 Max with 32GB of RAM, and it was a total nonstarter: I ended up having to power-cycle my machine. I then used two L4 GPUs on Google Cloud (48GB of GPU RAM total, see [1]) and it was very smooth and fast there.

[1] https://github.com/sagemathinc/cocalc-howto/blob/main/ollama...
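
For anyone reproducing this setup: once the model is pulled (ollama pull mixtral), Ollama serves a local REST API on port 11434. Here's a minimal Python sketch against the documented /api/generate endpoint; the prompt is just a placeholder:

    import json
    import urllib.request

    # Ask a local Ollama server (default port 11434) to run the "mixtral"
    # model on a single prompt; "stream": False returns one JSON object
    # instead of a stream of partial tokens.
    payload = json.dumps({
        "model": "mixtral",
        "prompt": "Why is the sky blue?",
        "stream": False,
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])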



Wow, as an author of the project I'm so sorry you had to restart your computer. The memory management in Ollama needs a lot of improvement – we'll be working on this a bunch going forward. I also have an M1 32GB Mac, and it's unfortunately just below the amount of memory Mixtral needs to run well (for now!)
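
For context, a rough back-of-envelope in Python on why 32GB falls just short. The ~46.7B parameter count and ~4.5 effective bits per weight for 4-bit Q4_0 quantization are public figures for Mixtral 8x7B; the macOS GPU memory cap below is an assumed approximation:

    # Rough memory estimate for Mixtral 8x7B, assuming ~46.7B total
    # parameters and Q4_0 quantization (about 4.5 effective bits per
    # weight once per-block scales are included).
    params = 46.7e9
    bits_per_weight = 4.5
    weights_gb = params * bits_per_weight / 8 / 1e9
    print(f"quantized weights: ~{weights_gb:.0f} GB")  # ~26 GB

    # By default macOS only lets the GPU address a fraction of unified
    # memory (roughly 65-75%; the 70% here is an assumed midpoint).
    # A 32 GB Mac ends up with ~22 GB usable, below the ~26 GB of
    # weights, while 64 GB fits comfortably with room for the KV cache.
    for ram_gb in (32, 64):
        print(f"{ram_gb} GB Mac: ~{ram_gb * 0.70:.0f} GB GPU-addressable")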


Will Mixtral 8x7B run well on a 64GB M2 Max, then?


I'm using an M2 64GB and Mixtral works pretty well.


It’s wild that my laptop can run AI models better than my RTX 4090 desktop can. Thanks for the info!



