Yes, we are also looking at integrating MLX [1] which is optimized for Apple Silicon and built by an amazing team of individuals, a few of which were behind the original Torch [2] project. There's also TensorRT-LLM [3] by Nvidia optimized for their recent hardware.
All of this of course acknowledging that llama.cpp is an incredible project with competitive performance and support for almost any platform.
All of this of course acknowledging that llama.cpp is an incredible project with competitive performance and support for almost any platform.
[1] https://github.com/ml-explore/mlx
[2] https://en.wikipedia.org/wiki/Torch_(machine_learning)
[3] https://github.com/NVIDIA/TensorRT-LLM