Hacker News

I just fundamentally disagree with the premise that it’s hard to build an IR.

It’s amazingly easy if you know how to do it and if you avoid overengineering. It takes about 3 months for 1-2 folks on average.

This new IR scaffold appears to be just general enough to handle things that Chris understands well (C compilers) and things Chris just learned about (ML). That’s cool but:

- there are other compilers that have constructs that require a fundamental paradigm shift compared to MLIR. That’s a good thing. Diversity is cool and shit. One example: the insane and wonderful IRs in JSC, which do things that aren’t compatible with MLIR.

- probably for ML it would be fine to just build your own IR. 3 months and 1-2 engineers, assuming they’ve done it before, and you’ll have yourself something purpose built for your larger team to understand, to fit into your coding style, and to surgically attack your particular problem.

On the flip side, I’m happy to see that location tracking is first class. The overall structure of this IR, like other IRs that Chris designed, is quite sensible and it seems like it’s not terrible to hack on. It’s just that it’s too much of an answer to a question nobody should be asking.
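To make "location tracking is first class" concrete, here is a minimal sketch (all names hypothetical, not taken from any of the IRs discussed here): every instruction carries its source position as a required field, so diagnostics and debug info survive lowering instead of being bolted on later.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Loc:
    """A source position; hypothetical, for illustration only."""
    file: str
    line: int
    col: int

@dataclass
class Instr:
    op: str
    args: list
    loc: Loc  # required field: you cannot construct an Instr without a location

def lower_add_to_machine(instr: Instr) -> Instr:
    # A lowering step must propagate `loc` explicitly; omitting it is a
    # constructor error, so locations can't silently vanish mid-pipeline.
    assert instr.op == "add"
    return Instr("machine.add", instr.args, instr.loc)

hi = Instr("add", ["a", "b"], Loc("foo.js", 12, 4))
lo = lower_add_to_machine(hi)
print(lo.op, lo.loc.line)  # machine.add 12
```

The design choice here is that location is part of the node's shape, not an optional side table, which is what makes it "first class".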



"3 months and 1-2 engineers"

That's nowhere close to the manpower you need for writing optimizations for your IR.


It’s exactly the amount of time it took to write JSC’s B3.


Basically 6 man-months of some of the most senior Apple engineers? Seems fair. But at this point you "just" have an IR, and it is dedicated to your own use case, which has its advantages but also some inconveniences, especially when it comes to interacting with other tools/frameworks. So if every time you want to create a DSL for a use case you need this amount of investment, this is a non-trivial cost.

Now think about use cases like https://drive.google.com/file/d/19pSpEsi4I9-MKLRodD-po82HFCW... and what the MLIR ecosystem (if it develops) can provide them. After 6 months of work, instead of having "just" an IR for their abstraction, they would likely have iterated on the IR, revisited their abstraction, written optimizations, and most importantly mapped their use cases to heterogeneous HW! They can benefit fairly easily from the work of PlaidML on the affine/polyhedral optimizations, for instance.

On the other hand, if a new HW vendor would like to expose their accelerator to a wide range of use cases (whether it is Fortran OpenACC or the Stencil DSL), plugging a lowering/backend into MLIR is a much more efficient path than the alternatives (which are almost non-existent today, by the way).

If it were so easy to write one’s own compiler end-to-end from scratch, wouldn’t LLVM be out of users by now?


After 3 months we had an IR, lowering to it, lowering out of it, and an optimizer. That IR is suitable for a broad range of use cases for us - basically anytime we want to JIT something and are willing to pay for compiler optimizations.

In the other two IR design efforts I’ve seen recently, it’s true that after 3 months you have less of the lowering and optimizations. But maybe for one of them it was because the person doing all the work was an intern who hadn’t ever designed an IR before.

You’re assuming that:

- 3 months is an outlier. It’s not.

- 3 months is more than it would take to glue your infrastructure to MLIR. Most likely there is a time cost to adopting it.

- that just because someone uses MLIR, they can somehow leverage - at no cost - the MLIR optimizations that someone else did. But MLIR is just scaffolding, so it’s not obvious that optimizations implemented by one MLIR user would be able to properly handle the semantics of code generated by a different MLIR user. This assumption reminds me of how folks 20 years ago thought that if only everyone used XML then we would be able to parse each other’s data.

I don’t buy that having MLIR makes it easier to interface different compilers. I’m used to writing lowerings from one IR to another. It’s not that bad, and the hardest part is the different semantics; MLIR encourages its various users to invent their own instructions and semantics, so I don’t see how it makes this any easier.
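Mechanically, a lowering between two IRs is a per-instruction translation; the hard part, as said above, is the semantic gaps. A toy sketch (both IRs and all op names are hypothetical): suppose the source IR’s `div` traps on a zero divisor, while the target IR’s `raw_div` leaves that case undefined, so the lowering must make the check explicit.

```python
# Toy lowering between two hypothetical IRs, each instruction a tuple
# of (opcode, *operands). The table-driven translation is easy; the
# work is in spots where the two IRs' semantics disagree.

def lower(instrs):
    out = []
    for op, *args in instrs:
        if op == "div":
            # Semantic gap: source `div` traps on zero, target `raw_div`
            # does not, so the trap becomes an explicit check.
            out.append(("check_nonzero", args[1]))
            out.append(("raw_div", *args))
        elif op == "add":
            out.append(("raw_add", *args))
        else:
            raise NotImplementedError(op)
    return out

print(lower([("add", "x", "y"), ("div", "x", "y")]))
# [('raw_add', 'x', 'y'), ('check_nonzero', 'y'), ('raw_div', 'x', 'y')]
```

Every such mismatch (trapping vs. undefined, signedness, overflow behavior) needs a decision like this, which is why two MLIR dialects invented independently don’t compose for free.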

I think that it’s unfortunate that llvm has as many users as it does. It’s a good C compiler but whenever I see folks doing a straight lowering from something dynamic, I just want to cry. I think that the llvm community is doing a lot of up and coming compiler writers a disservice by getting them to use llvm in cases where they really should have written their own. It’s a big problem in this community.


"wonderful IRs in JSC" Can you please provide a link/explanation of what JSC is? Did you mean this? https://github.com/eatonphil/jsc



B3 and Air are well documented: https://webkit.org/docs/b3/

DFG ThreadedCPS and DFG SSA are not well documented. That might be a monumental task at this point. They started simple but got so insane. But here are some slides: http://www.filpizlo.com/slides/pizlo-splash2018-jsc-compiler... http://www.filpizlo.com/slides/pizlo-speculation-in-jsc-slid...

And no, the thing you linked to is not the JSC I’m thinking of.



