Yes, we are also looking at integrating MLX [1], which is optimized for Apple Silicon and built by an amazing team of individuals, a few of whom were behind the original Torch [2] project. There's also TensorRT-LLM [3] by Nvidia, optimized for their recent hardware.

All of this, of course, while acknowledging that llama.cpp is an incredible project with competitive performance and support for almost any platform.

[1] https://github.com/ml-explore/mlx

[2] https://en.wikipedia.org/wiki/Torch_(machine_learning)

[3] https://github.com/NVIDIA/TensorRT-LLM