
Yes, George Hotz (geohot) reverse engineered the Neural Engine and could make it work for tinygrad; the videos posted in the other reply describe the reverse-engineering process.

I wonder why Apple didn't provide low-level APIs to access the hardware? It may have various restrictions. I recall Apple also didn't provide proper APIs for OpenCL on iOS, but some people found workarounds to access that as well. Maybe they only integrate with a few limited but important partners they can control, like TensorFlow and Adobe.

Could it be that using the ANE in the wrong way overheats the M1?



Because machine learning accelerators are, in the broadest sense, not "done" and are rapidly evolving every year. Exposing too many details of the underlying architecture is a prime way to ossify your design, making it impossible to change, and as a result you will fall behind. It is possible the Neural Engine of 2022 will look very different from the one of 2025, as far as the specifics of the design, opcode set, etc. go.

One of the earliest lessons along this line was Itanium. Itanium exposing so much of the underlying architecture as a binary format and binary ABI made evolution of the design extremely difficult later on, even if you could have magically solved all the compiler problems back in 2000. Most machine learning accelerators are some combination of a VLIW and/or systolic array design. Most VLIW designers have learned that exposing the raw instruction pipeline to your users is a bad idea not because it's impossibly difficult to use (compilers do in fact keep getting better), but because it makes change impossible later on. This is also why we got rid of delay slots in scalar ISAs, by the way; yes they are annoying but they also expose too much of the implementation pipeline, which is the much bigger issue.

Many machine learning companies take similar approaches where you can only use high-level frameworks like Tensorflow to interact with the accelerator. This isn't something from Apple's playbook, it's common sense once you begin to design these things. In the case of Other Corporations, there's also the benefit that it helps keep competitors away from their design secrets, but mostly it's for the same reason: exposing too much of the implementation details makes evolution and support extremely difficult.
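The insulation argument can be sketched in a few lines. This is a purely illustrative toy (every name here is invented; it resembles no real vendor API): the high-level graph node is the stable contract users see, and the per-generation backend underneath it can be rewritten freely.

```python
# Toy sketch of "stable front-end, unstable back-end" (all names invented).
# Users only ever construct MatMulNode; the vendor can swap the backend
# "hardware generation" without breaking any user code.

class MatMulNode:
    """Stable, hardware-agnostic graph node -- the only thing users see."""
    def __init__(self, a, b):
        self.a, self.b = a, b

def run_gen1(node):
    # "Generation 1 hardware": plain row-by-column loops.
    return [[sum(x * y for x, y in zip(row, col))
             for col in zip(*node.b)] for row in node.a]

def run_gen2(node):
    # "Generation 2 hardware": completely different internals (here, a
    # precomputed transpose), same stable node format going in.
    bt = list(zip(*node.b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt]
            for row in node.a]

node = MatMulNode([[1, 2], [3, 4]], [[5, 6], [7, 8]])
assert run_gen1(node) == run_gen2(node) == [[19, 22], [43, 50]]
```

The point of the sketch: because callers never touch the backend, swapping `run_gen1` for `run_gen2` (a new "opcode set") is invisible to them; exposing either backend directly would freeze it forever.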

It sounds crass but my bet is that if Apple exposed the internal details of the ANE and later changed it (which they will, 100% it is not "done") the only "outcome" would be a bunch of rageposting on internet forums like this one. Something like: "DAE Apple mothershitting STUPID for breaking backwards compatibility? This choice has caused US TO SUFFER, all because of their BAD ENGINEERING! If I was responsible I would have already open sourced macOS and designed 10 completely open source ML accelerators and named them all 'Linus "Freakin Epic" Torvalds #1-10' where you could program them directly with 1s and 0s and have backwards compatibility for 500 years, but people are SHEEP and so apple doesn't LET US!" This will be posted by a bunch of people who compiled "Hello world" for it one time six months ago and then are mad it doesn't "work" anymore on a computer they do not yet own.

> Could it be that using the ANE in the wrong way overheats the M1?

No.


Was it really necessary to expand the fourth paragraph post-script to get your point across? Before it was a fairly holistic look at the difference between people who want flexibility and people who want stability, where neither party was necessarily right. Now it just reads like you're mocking people for desiring transparency in their hardware, which... seems hard to demonize?


There are other replies talking about Apple or whatever but I'll be honest: because 2 decades of online forum experience and FOSS development tells me that the final paragraph is exactly what happens anytime you change things like this and they are exposed to turbo-nerds, despite the fact they are often poorly educated and incredibly ill-informed about the topics at hand. You see it here in spades on HN. It doesn't have anything to do with Apple, either; plenty of FOSS maintainers could tell you similar horror stories. I mean it's literally just a paraphrase of an old XKCD.

To be fair though, I mean. I'm mostly a bitchy nerd, too. And broadly speaking, taking the piss is just good fun sometimes. That's the truth, at least for me.

If it helps, simply close your eyes and imagine a very amped up YouTuber saying what I wrote above. But they're doing it while doing weird camera transitions, slow-mo shots of panning up the side of some Mac Mini or whatever. They are standing at a desk with 4 computers that are open-mobo with no case, and 14 GPUs on a shelf behind them. Also the video is like 18 minutes long for some reason. It's pretty funny then, if you ask me.


For sure, I don't think I disagree with anything you've written here. Where I take umbrage is when there is no choice involved though. Apple could very well provide both a high-level, stable library while also exposing lower-level bindings that are expected to break constantly. If the low-level library is as bad and broken as people say it is, then they should have no problem marketing their high-level bindings as a solution. This is a mentality that frustrates me on many levels of their stack; their choice of graphics API and build systems being just a few other examples.

Maybe this works for some people. I can't knock someone for an opinionated implementation of a complicated system. At the same time though, we can't be surprised when other people have differing opinions, and in a perfect society we wouldn't try to crucify people for making those opinions clear. Apple notoriously lacks a dialogue with their community about this stuff, which is what starts all of this pointless infighting in the first place. Apple does what Apple does, and nerds will fight over it until the heat death of the universe. There really is nothing new under the sun. Mocking the ongoing discussion is almost as Pyrrhic as claiming victory for either side.


Absolutely. It was a vivid reminder of the many people who come out of their holes to argue whenever there is some criticism of open source. It's one thing to desire freedom, but the reality of the situation is that the community is toxic for some reason and just not fun to even converse with.


He's not wrong - that's absolutely what YouTube and online Linux commentators would do. They have their own echo chamber, just as much as any tech community. Heck, considering your past posts, it's probably something you would do.

As for transparency in hardware, it will probably become more transparent once Apple feels the design is done and settled. They don't want to repeat Itanium.


I think it was absolutely appropriate because I have seen that cycle happen many, many times over the years.

Especially when Apple is involved. Hell there are still people who see them as beleaguered and about to go out of business at any moment :p


I get where you're coming from. It's par for the course on Apple's behalf to push this stuff away in favor of their own high-level implementation, but I also think that behavior puts them at an impasse. People who want to use this hardware for arbitrary purposes are unable to do so. Apple is unwilling to allow it because they want their hand on the "API valve," so to speak. In a case where absolutist rhetoric is being used on either side, I think this is pretty expected. If we're ultimately boiling this down to "having choices" vs "not having choices" though, I think it's perfectly reasonable to expect the most valuable company in the world to go the extra mile and offer both choices to their users and developers.

Or not. It's their hardware, they just won't be selling any Macs to me with that mindset. The only thing that irks me is when people take the bullet for Apple like a multi-trillion dollar corporation needs more people justifying their lack of interoperability.


Perhaps the "high-level access only" ideology extends to policy considerations as well. End-users appear to have no shortage of time or ideas to make AI trip over its own shadow, in ways that may have unfortunate policy implications for corporations with uncomfortably large social and political footprints (where "footprint" means potential impact, not any specific incident).

In much the same way the App Store is an infuriating shh-don't-call-it-censorship bottleneck that gives Apple total and final control over what your (sorry, Apple's) devices can do, I wonder if political considerations represent a portion of Apple's motivation to keep things reasonably locked down. Obviously Apple can just kick apps it doesn't like out of the App Store, and binaries that would need to be downloaded and run directly on Macs are exceedingly unlikely to go viral to the same extent, so perhaps I'm overthinking things to the point of paranoia.


Meh, it's okay to be grumpy sometimes. He got his point across and clearly knows what he's talking about. Let him be passionate :)


Possibly just to avoid having programs rely too much on specific implementation details of the current engine, which would cause issues in the future if they decide to change the hardware design? An obvious comparison is graphics cards, where you don't get low-level access to the GPU[1], so vendors can change architecture details across generations.

Using a high-level API probably also makes it easier to implement a software fallback for hardware that doesn't have the Neural Engine, like Intel Macs or older A-series chips.

[1] Although this probably starts a long conversation about various GPU and ML core APIs and quite how low level they get.
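That fallback idea can be sketched as a toy dispatch loop. This is purely illustrative (the function names are invented, and it is not how CoreML actually selects compute units): the user calls one stable high-level entry point, and the runtime quietly falls back to software where the accelerator is absent.

```python
# Toy sketch (made-up names, not CoreML's API): one high-level "predict"
# call that dispatches to whatever backend the machine actually has, so the
# same user code runs whether or not an ANE-like accelerator exists.

def ane_relu(xs):
    # Stand-in for the accelerator path; here, the hardware is "absent".
    raise RuntimeError("no ANE on this machine")

def cpu_relu(xs):
    # Software fallback, always available.
    return [max(0.0, x) for x in xs]

def predict(xs, backends=(ane_relu, cpu_relu)):
    # Try the fastest backend first, fall through to software.
    for backend in backends:
        try:
            return backend(xs)
        except RuntimeError:
            continue
    raise RuntimeError("no usable backend")

print(predict([-1.0, 2.0, -3.0, 4.0]))  # [0.0, 2.0, 0.0, 4.0]
```

Because callers only ever see `predict`, the backend list can grow or shrink per machine (or per hardware generation) without any user-visible change.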


Apple doesn't want to let people get used to the internals, and spiritually likes to enforce a very clear us-versus-them philosophy when it comes to their new toys. They open source things they want other people to standardize around, but if it's their new toy then it's usually closed.


In general I kind of agree with this, but this move isn't anything specific to Apple. Every company designing ML accelerators is doing it. None of them expose anything but the most high level framework they can get away with to users.

I honestly don't know of a single company offering custom machine learning accelerators that lets you do anything except use TensorFlow/PyTorch to interface with them; there's not a chance in hell they'll actually give you the underlying ISA specifics. Maybe the closest is something like the Xilinx Versal devices or GPUs, but I don't quite put them in the same category as Habana, Groq, or Graphcore, where the architecture is bespoke for exactly this use case, and the high-level tools are there to insulate you from architectural changes.

If there are any actual productionized, in-use accelerators with low-level details available that weren't reverse engineered from the shipped components, I'd be very interested in seeing them. But the trend here is very clear unless I'm missing something.


Habana has their own SynapseAI layer that their TF/PyTorch port runs on. Custom ops are supported too, via a compiler targeting the TPCs, using a C language variant.

Oh, and they have an open-source usermode software stack for those, but it's really not usable. It doesn't allow access to the systolic arrays (MME), and "only using the TPCs" is just the start of enumerating what it doesn't have. (But it made the Linux kernel maintainers happy, so...):

https://github.com/HabanaAI/SynapseAI_Core#limitations (not to be confused with the closed-source SynapseAI)


Well, that's good to hear at least! I knew there was some back and forth between the kernel maintainers recently due to all these accelerator drivers going in without any usermode support; Habana's case was kind of interesting because they got accepted into accel/ early by Greg, but they wouldn't have passed the merge criteria used later on for most others like Qualcomm.

Frankly I kind of expected the whole result of that kerfuffle to just be that Habana would let the driver get deleted from upstream and go on their merry way shipping drivers to customers, but I'm happy to be proven wrong!


CoreML is the API to use the ANE.


Thanks, that's right, there is a high-level API. I meant low-level APIs, and I've edited my post to clarify.


The likeliest reason is long-term ABI ossification.


All the sibling comments are better guesses, but I would also guess there could be security implications to exposing lower-level access. Having it all proprietary and undocumented is itself a way of making it harder to exploit. Although, as mentioned, not having to settle on an ABI is more likely the primary reason.


Apple Silicon has IOMMUs on everything - you generally can't exploit a bug in a coprocessor to gain more access on the main application processor (or another coprocessor). The only hardware bugs with security implications we've found were stuff like M1RACLES, which is merely a covert channel (and its discoverer doesn't even think it's a problem). Apple does a pretty good job of making sure even their private/internal stuff is secure.


A high level API needs much less support effort.



