Hacker News

I wonder if Ollama has, or plans to have, "supported backends" other than llama.cpp. It's listed on the very last line of their README, as if the llama.cpp dependency were incidental and a minor detail, rather than Ollama being a deployment mechanism for llama.cpp and GGUF-based models.


Yes, we are also looking at integrating MLX [1], which is optimized for Apple Silicon and built by an amazing team of individuals, a few of whom were behind the original Torch [2] project. There's also TensorRT-LLM [3] by Nvidia, optimized for their recent hardware.

All of this of course acknowledging that llama.cpp is an incredible project with competitive performance and support for almost any platform.

[1] https://github.com/ml-explore/mlx

[2] https://en.wikipedia.org/wiki/Torch_(machine_learning)

[3] https://github.com/NVIDIA/TensorRT-LLM


MLX and TensorRT would be really nice!


I don't think they will move away from llama.cpp until they are forced to. The number of people contributing to llama.cpp is quite significant [1] and it wouldn't make sense to use another backend given how quickly llama.cpp is iterating and growing.

[1] https://devboard.gitsense.com/ggerganov?r=ggerganov%2Fllama....

Full disclosure: This is my tool


ghost of christmas future

The chance ONNX becomes significantly relevant here went from 1% to 15% this week. They're demoing ~2x faster inference with Phi-3. There have been fits and starts on LLMs in ONNX for a year, but with Wintel's AI PC™ push, and all the constituent parts in place (4-bit quants! adaptive quants!), I'd put very good money on it.


So you are saying Ollama is a strong MS acquisition in the future if onnx works out.


No, ONNX is a Microsoft project. I don't know why people know what Ollama is, and I don't think they will in a year.


I know it is a Microsoft project. My reasoning is: if Ollama supports ONNX, and if it can provide performance on par with or better than llama.cpp, it would make sense for Microsoft to acquire Ollama for distribution reasons.


Llama.cpp is the valuable bit here; Ollama is only good for end-user convenience. It saves you 20 minutes of googling and futzing with the million and one llama.cpp wrappers available for every language, and lets you set things up to load on startup, but if you're building something for scale or for a backend, neither llama.cpp nor Ollama is coming along for the ride. At best it'll live through a proof-of-concept stage, but as soon as you start caring about performance it's getting discarded.

Microsoft isn't going to pay for something that amounts to a useful setup script wrapped around an inefficient convenience library intended for people to be able to run AI on consumer hardware. There's no exploitable value proposition, whereas building their own closed source AI systems that are tightly coupled to the Windows ecosystem and favor cloud services allows them to extract maximum rent.
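For context, the convenience being described is largely Ollama's local HTTP API. A minimal sketch of calling it (this assumes an Ollama server running on its default port 11434 and a model, here "llama3" as an example, already pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint

def build_payload(prompt, model="llama3"):
    # stream=False asks for the full completion in a single JSON response
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, model="llama3"):
    # POST to Ollama's /api/generate endpoint and return the completion text
    req = urllib.request.Request(
        OLLAMA_URL + "/api/generate",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The point stands either way: this is setup and plumbing convenience, not inference-engine value.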


Their behaviour around llama.cpp acknowledgement is very shady. Until very recently there was no mention of llama.cpp in their README at all, and now it's tucked away at the very bottom. Compare that to the originally proposed PR, for example: https://github.com/ollama/ollama/pull/3700


Do you know maybe what are these alternative engines they're talking about? Or is it just a way to evade the fact that at the end of the day it is just a wrapper around llama.cpp?


It was mentioned in another reply to the parent. There are no alternatives currently; the whole thing has been built on llama.cpp since its inception.


Ollama is great. I actually wish they would wrap OpenAI and Azure and generally act as a proxy for third-party APIs. Having a consistent, well-thought-out API that isn't tied to a single provider would be really good for the community.

Edit: this would be useful because in many cases some workloads can be local, but others cannot... e.g. if you really need gpt4 for specific queries.


It is open source, so if you want to see this in ollama, pull requests are welcome. :)



