
1/3rd "activated parameters", while also requiring 2x the VRAM.


That's the point of MoE: you spend more VRAM to cut the compute and memory bandwidth needed per token. That makes it a harder sell for consumer devices, where VRAM is the tight resource, but an easier one for servers, where things are more likely to be compute- or memory-bandwidth-bound.
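To put rough numbers on that trade-off, here's a back-of-the-envelope sketch. The model sizes are made up to match the "1/3rd active, 2x VRAM" ratio above (a hypothetical 30B dense model vs. a hypothetical 60B-total / 10B-active MoE), it assumes 2 bytes per parameter (fp16/bf16), and it uses the common approximation of about 2 FLOPs per active parameter per token, ignoring attention and KV cache.

```python
def footprint(total_params_b, active_params_b, bytes_per_param=2):
    """Rough per-model costs; parameter counts are in billions.

    Assumptions (not from any specific model): weights dominate memory,
    and a forward pass costs ~2 FLOPs per active parameter per token.
    """
    # Every expert must be resident, so VRAM scales with TOTAL parameters.
    weights_gb = total_params_b * bytes_per_param
    # Only the routed experts run, so compute scales with ACTIVE parameters.
    flops_per_token = 2 * active_params_b * 1e9
    # At batch size 1, the weights read per token also scale with ACTIVE parameters.
    read_gb_per_token = active_params_b * bytes_per_param
    return weights_gb, flops_per_token, read_gb_per_token


# Hypothetical sizes chosen only to reproduce the ratios in the thread.
dense = footprint(total_params_b=30, active_params_b=30)
moe = footprint(total_params_b=60, active_params_b=10)

vram_ratio = moe[0] / dense[0]
compute_ratio = moe[1] / dense[1]
bandwidth_ratio = moe[2] / dense[2]

print(f"VRAM: {vram_ratio:.2f}x  compute/token: {compute_ratio:.2f}x  "
      f"weight reads/token: {bandwidth_ratio:.2f}x")
```

With large batches, different tokens hit different experts, so the weight-read saving shrinks; the compute saving per token stays.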



