I don't say nice things about Oracle very often, but they deserve some praise here. Graal is a very, very ambitious project, and Oracle has been funding it for years. It's still rough around the edges, but it promises to enable new programming languages to run on a high-performance JVM, compiled to native code. Write-once, run anywhere, at native speed. And now they're donating it. It's very decent of them.
I know absolutely nothing about Oracle, GraalVM, OpenJDK, or Java development in general. I haven't used Java since a CS intro class during my undergrad like 10 years ago. Despite this, I've internally adopted the general consensus that Oracle is a terrible, greedy company.
Can someone play devil's advocate to explain how this news may actually favor this negative view of Oracle? The sentiment in this thread is praising Oracle's decision here, so I'm just curious if there's an alternative viewpoint that's skeptical or wary of this decision and Oracle's motivation to make it.
With that said, I'm not looking for reason to diminish the positives. I'm just curious. Good moves that benefit the community should absolutely be acknowledged and encouraged.
EDIT: It looks like another person commented how this may not be benevolent in the context of WebAssembly while I was slowly typing this comment up on mobile. I'd still be interested in this discussion though.
I've been a Java developer for the last 15 years or so and I can't really say much bad about Oracle Java. IMO Oracle has a good team working on Java. Java is completely open source. There are many builds of Java from different companies; I like the Liberica one, for example (be aware of its Russian roots, though).
I have no idea about this Graal thing and I think it's a rather niche application. I played with it but concluded that it's not ready for me yet.
Are there any “good companies”? I’m fairly sure they all become paperclip optimizers after a certain size and any seemingly good behavior is simply the better option from a profit-maximizing point of view.
Do you honestly think that all those companies care about Pride or Black Lives Matter? Nope, they just ran some calculations, and as user-facing companies, marketing has a huge role in their profits. A logistics company wouldn’t have done any such thing, as they are likely not even known by the general public.
Oracle is not really an end-user facing company like Facebook or Google is, so they simply don’t care all that much about that (hence the lawnmower analogy). But that’s a good thing as well: you can use the lawnmower for its job, and you’ll never be surprised. Graal and OpenJDK and other tech at the bottom of the tech stack are long-term investments. Looking at the Linux kernel, it is not developed primarily by some hacker in a basement, but by employees paid by Intel, Red Hat, Google, pretty much everyone.
The generalized dislike of Oracle you have seen really comes from a couple of different aspects, neither of which is relevant to this specific announcement:
1. License audits suck.
2. The Google lawsuit (about matters resolved long ago and which don't apply to anyone except Google, really).
A lot of ill will comes from people whose companies have been audited. The process is by all accounts very painful. However, what's rarely mentioned in such discussions is the alternatives and why Oracle does this. It's because their software is totally DRM free. It makes sense; you can't have a major airport or bank shutting down suddenly because a license key or credit card expired, can you? Oracle DB is mission critical stuff, it must be always available. That's why they use an audit model - it's "trust but verify". They make their stuff available for free download and you promise not to pirate it.
Every so often Oracle turn up and check to see if you're paying for what you're using. At this point there are usually two problems that crop up:
1. The company doesn't actually know if it's correctly paying for Oracle's stuff. It requires a lot of work to find out, maybe the right controls weren't in place and naughty developers just installed more copies because it was convenient etc. Then they discover they've unknowingly been pirating the DB.
2. And/or they discover they didn't understand the licensing model, which historically had some very sharp edges around virtualization (maybe still does).
Because the downloads are open and unrestricted by any form of DRM, it's easy to make these mistakes in a company that doesn't have good processes in place. At this point the users have a problem because it's just plain old copyright violation. Oracle prefers not to sue its own users for obvious reasons so at this point a third issue crops up - their sales guys like to cut deals. Buy more of our software and you'll have some useful stuff plus we'll forget about your non-compliance issues. Win/win, right? Not always for the people who aren't at the top of the firm of course, who may now be told to adopt some new product that they wouldn't otherwise have chosen and may not even be told why (it's embarrassing for the executives to admit they ended up in that situation!).
It's worth observing that with the cloud these problems go away. Use Oracle DB only in the cloud (or MS SQL etc) and the cloud provider will track your usage and ensure you're paying for it. In turn that means no need for audits.
It's very easy to criticize Oracle for the above outcomes. It's harder to come up with alternative approaches beyond really down in the weeds stuff like the exact ways virtualized cores are licensed, etc. The moment you have a commercial product there needs to be some way to ensure users are paying for it (because a lot simply won't if there's nothing in place to make them), but if you accept that outages cannot be caused by DRM or licensing errors, then you are almost forced to go with either the cloud or the audit+true-up approach. Many modern DB firms go hosted-only which brings its own problems (see the recent Azure leak). Plus Oracle DB predates the cloud, so ...
Thank you for the very detailed response. I recognize that my impression of Oracle was certainly unfair due to my lack of knowledge and experience with them. I also recognize that it was both selfish and ignorant of me to ask for someone to spin this negatively. My comment was in good faith, but I see now that my intention doesn't really change that it was encouraging an unjust characterization of this announcement and was unfairly asking someone else to do the leg work for me.
I guess my default (and flawed) heuristic is to assume that a big corporation has an ulterior motive with goodwill or community-focused announcements. I'll take this as a learning opportunity to not blindly adopt what I perceive as the general sentiment, to stop relying on such a heuristic, and to do my own research and evaluation. In addition to the detailed response, thank you for also indirectly kicking my butt into reevaluating how I approach the unknown both logically and emotionally.
You're welcome and I don't think you should feel bad, your post seemed OK to me.
This particular announcement isn't even meant to generate goodwill really, although I can see why it's interpreted that way. It's not like there's a new open source release coming. It's just resolving some duplication issues in the way these already open source projects are being developed by Oracle "donating" from one arm to the other :) The goodwill should instead come from the decade+ funding of this very advanced and large research project, which is teaching the world a lot about fundamental computer science (complete with large set of academic papers), and for which almost all the core cleverness is given away under liberal licenses.
Be aware that they do have a commercial offering built on top of Graal, the enterprise edition. It makes programs go faster, and has a few other useful features. But there's nothing nefarious about that of course.
I'm not worried about or bothered by audits. We put a lot of effort into license compliance and performing regular true ups with our vendors is the norm.
A large part of Oracle's bad reputation comes from a history of entering markets, gaining a captive customer base, and then engaging in blatant rent-seeking that squeezes every last penny that they can out of their customers.
As a direct example, I'm in the healthcare space and specifically work in a Cerner shop. Our quotes from Cerner for routine integration projects have gone up literally 5-10x since Oracle acquired them. An EHR migration costs well into eight figures, so Oracle is fully aware that customers aren't easily going to be able to move away from them because things that were $15-30K (and still are with other vendors) are now running into the six figures, but it certainly doesn't leave a positive taste in the mouth.
And we knew this was coming as soon as the Oracle acquisition was announced. Because Oracle has built a reputation for doing exactly this across multiple markets over multiple decades.
I don't think licensing is the only reason people hate Oracle. They have at the very best a mixed record when it comes to open source they've acquired. While they've done pretty well by Java and VirtualBox after some initial missteps, other software such as OpenOffice, MySQL, Solaris, ZFS, Ksplice, etc. has not fared nearly so well...
I feel like this is a bit too generous to Oracle. The DRM-free thing is indeed a feature, but they know full well that it encourages more and more unlicensed usage, from which they can then "extort" additional fees once the software is in place and critical. They would probably never have gotten those fees if the company had weighed the operational expense up front, or if there had been gates to get past before installing.
Red Hat is the poster child for that approach and they charge for support based on per-core usage with audits / legal teams to back it up, no different to Oracle.
It's more like your company already uses it, so people just copy the binaries from one server to another and start it up. Oracle DB has massively more features than postgres or mysql so there are good reasons to use it.
There's an epic rant on YouTube from one of the old Sun guys that gets linked occasionally. I can't remember all of it, but basically it goes that Larry Ellison/Oracle is just as simple as a lawnmower: if you stick your hand in it, it'll shred it without emotion, because it's just an unthinking machine that cuts whatever you put into it, in the same way that Oracle is a machine to make money.
I've been at this long enough to have had my own experiences and they are indeed as bad as everyone has warned me, but it does feel like something you have to experience yourself (or at least through a trusted party like in my case) to totally believe.
It seems irrational for anyone to feel unexpectedly angry at a lawnmower for shredding their hand though -- you should have known it was a lawnmower, that's not a recent change, and it never claimed to be anything but a lawnmower. You should just be the normal level of angry that lawnmowers are dangerous.
The parent apologist was extremely precise in how they described Oracle. Oracle isn't in the "Be warm and fuzzy to developers" business: they're in the "Provide mission critical software to large enterprises" business. Logic for servicing the latter well makes things look screwy or sinister to people used to the former.
Which isn't to say the Oracle salespeople aren't scummy. But most salespeople are scummy. That's what happens when you incentivize closing sales, which is how almost all sales orgs are set up. So audit model + scummy salespeople = bad experiences. But to parent's point: what's a better model for their customer persona?
Well, I had a colleague who worked with them directly. He was a product manager who formerly managed incident response, and this was at a company that at one time had incidents that affected significant chunks of the Internet (no longer - that mantle has passed to AWS). It's been a couple years now, so the details are a bit fuzzy, but I remember him saying that dealing with Oracle was about the worst experience he had on the job. Their representatives were outright rude and condescending, and seemingly difficult to no purpose. So that's my (2nd hand) experience with them.
I'm a bit young to have ever known much about Sun, but I get the impression that they were very respected for their products and commitment to open source, and Oracle's purchase seems to have really rankled the community. Interestingly, the only Oracle products I use are all from the Sun era: MySQL, Java, VirtualBox, and ZFS.
It may not only be benevolence though. The JVM and Graal are not the only game in town anymore. WebAssembly is growing in various areas: client-side, edge, backend. WASM fills many of the same needs as Graal, and then some, since it’s embedded in every major browser.
This move is good for Graal, as it will help it compete, but there’s still a big question in my mind at least about if Graal is going to be able to compete long-term. Yes, I know Graal can also execute WASM generated binaries, so it may have a space in this environment, but will it outperform native WASM VMs? And will WASM become the de facto binary format for shipping things where JVM bytecode has been used in the past? What’s Graal’s place in the future?
WASM isn't really a replacement for JVM bytecode. It's not like you can take a random JAR and convert it to WASM. Last I checked, WASM doesn't even support GCd languages at all, and at any rate the whole insight that makes GraalVM unique and a big deal is that universal bytecodes are a poor choice for making fast polyglot VMs. The JVM world was doing that long before WASM was even a twinkle in Google's eye, with invokedynamic and other initiatives, and it kinda works. JRuby+indy is a lot faster than MRI. But, there's a lot of compromises involved.
Truffle is interesting because it says, no, we should not be trying to compile everything to a universal bytecode. Instead, we should JIT compile the source code directly, using the JVM as a runtime library but bypassing the bytecode layer. The language semantics can be expressed much more clearly, without needing to contort things to make them look like Java, whilst still benefiting from the JVM's core feature set.
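To make the idea a bit more concrete, here is a minimal sketch in plain Java of the kind of tree-walking interpreter a Truffle language starts from. It does not use the real Truffle API: `AstSketch`, `Node`, `Literal`, and `Add` are hypothetical names, and the step that makes this fast in practice (Graal partially evaluating the interpreter into machine code) is not shown.

```java
// Plain-Java sketch of an AST interpreter; this is NOT the real Truffle API.
// All class names here are hypothetical.
final class AstSketch {

    // Every node of the guest-language program knows how to execute itself.
    interface Node {
        Object execute();
    }

    // A constant in the guest program.
    static final class Literal implements Node {
        private final Object value;

        Literal(Object value) {
            this.value = value;
        }

        @Override
        public Object execute() {
            return value;
        }
    }

    // Guest-language "+": integer addition or string concatenation,
    // decided by the guest language's own rules rather than by Java's.
    static final class Add implements Node {
        private final Node left;
        private final Node right;

        Add(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public Object execute() {
            Object l = left.execute();
            Object r = right.execute();
            if (l instanceof Integer && r instanceof Integer) {
                int sum = ((Integer) l) + ((Integer) r);
                return sum;
            }
            return String.valueOf(l) + r;
        }
    }

    public static void main(String[] args) {
        Node program = new Add(new Literal(40), new Literal(2));
        System.out.println(program.execute()); // prints 42
    }
}
```

The semantics of `+` live in the `Add` node itself, so nothing has to be contorted to look like a Java bytecode operation.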
So really I'd ask it the other way around. If it weren't for the politics of the browser world and the monolithic "Chrome is the OS" approach, would WASM be competitive? Because the Graal team already proved you can run lots of different languages at relatively insane speeds using partial evaluation and bypassing bytecode. The WASM world has proven it can run C++ and Rust at slower speeds than normal, which isn't particularly unexpected. If Chrome and Safari shipped GraalVM accessible via <script> tags, how many people would care about WASM? Remember that you can run WASM and LLVM bitcode on top of GraalVM too, it's not just about textual languages.
Arguably, if you wanted to give the web an instant free upgrade that'd make many developers rejoice, integrating Graal into Chrome would be an overnight way to do it. Python, Ruby, JVM bytecode and any other language you want at V8 like speeds, in a script tag? It's technically possible, it's just not politically possible.
>> It's not like you can take a random JAR and convert it to WASM.
Maybe you can:
"TeaVM is an ahead-of-time compiler for Java bytecode that emits JavaScript and WebAssembly that runs in a browser. Its close relative is the well-known GWT. The main difference is that TeaVM does not require source code, only compiled class files. Moreover, the source code is not required to be Java, so TeaVM successfully compiles Kotlin and Scala."
I've used TeaVM in a past project, to compile some Java to JS. It's very cool tech. Unfortunately I heard that the team ripped out the TeaVM after I left. That's understandable - I had to fix some bugs and missing pieces in TeaVM as part of that project, and I think a lot of codebases would face the same issue, but compiler hacking isn't everyone's, um, cup of tea. It was a great way to get a lot done in the two weeks I had available though.
I should have clarified. Yes, you can probably do a native-image style "compile an app+embedded JVM to wasm" by pretending V8 is a CPU. There are programs that do that sort of thing; I think Leaning Technologies makes one. That wouldn't be of use in any existing Java project though. The only reason you'd ever want to do that is because browsers offer nothing else, even though they could, and at that point why not compile to JS? At least that way your GC isn't being interpreted too. If you're not constrained by the WHATWG's decisions, though, it doesn't offer anything.
>> That wouldn't be of use in any existing Java project though. The only reason you'd ever want to do that is because browsers offer nothing else, even though they could, and at that point why not compile to JS? At least that way your GC isn't being interpreted too. If you're not constrained by the WHATWG's decisions, though, it doesn't offer anything.
It is very dependent on the use case and on "what is the future of your Java application". Some organizations are looking into migrating off of Java for a variety of reasons. These kinds of "Java conversion" tools help to keep legacy Java applications running until they can be replaced.
> If Chrome and Safari shipped GraalVM accessible via <script> tags, how many people would care about WASM?
That’s a good question, and partly why I was pointing out that it’s not just benevolence for Oracle to OSS this. I don’t know if the browser vendors could embed Graal; maybe they can (license restrictions being one of the issues, I’m sure), and then that could supplant the individual JS and WASM runtimes they support. But this wasn’t even an option until today.
> Remember that you can run WASM and LLVM bitcode on top of GraalVM too, it's not just about textual languages.
Which is exactly why I mentioned that in my original comment. Yes, Graal could be that runtime, will it? Seems like a gamble for anyone who’s not already in the JVM ecosystem to some degree.
Despite the wording of the announcement, Graal and Truffle have been open source under permissive licenses for a long time now. The "donation" is from one open source project run by Oracle to another. Confusing, indeed, but Oracle is a big company. So it's been possible license-wise for a long time.
In the announcement it sounds like they are planning to change the license in some manner, “Will the GraalVM license change?
The plan is to align all the GraalVM technologies with Java both from a release perspective and from a licensing perspective. Additional details will follow in the coming months as we move forward through this process.”
Well, some GC'd languages support compilation to WASM; for example, you can compile Go to WASM. The issue here is that for GC'd languages you have to bring your own runtime with GC in every module, so this doesn't really work.
It's "experimental" in the sense that it's incomplete, but the core tech is mature and works fine. Full support means supporting all the third party modules along with interpreter extensions, every part of the standard library etc.
No, there is only a frontend. There is a JavaScript backend though. I hope the GC proposal for WASM ultimately goes through to enable a backend in WASM too.
What does frontend and backend mean in this context? With graal you can
- run wasm interpreted on OpenJDK, similar to JavaScript running interpreted in Nashorn, or now GraalVM; you just need to add a couple of Graal SDK jars to your dependencies
- run wasm compiled; you need to run on GraalVM for this. This is supposed to provide around a 50x speedup compared to the previous point
Maybe a newb question but may I ask for an explanation why someone would want Python to run on the JVM and how they'd get started? Is it adaptable to a Poetry-led workflow?
GraalVM native compilation helps Java in the data center to avoid being a cost sink and to reduce start-up latency. Oracle needs Java to sell enterprise software.
Oracle contributing to OpenJDK may be required for Amazon cooperation (since Amazon is pushing its own JDK build) and probably helps the library ecosystem work towards native compatibility.
Native support for reflection (used in many libraries) requires "reachability metadata", maps of reflective API usage, at build time. Anyone can do it, but enterprise requires authoritative sources. Until authoritative reachability metadata covers the transitive graph of library+version dependencies in enterprise software, GraalVM native AOT builds are a PITA.
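The problem is easiest to see with a reflective lookup whose target is only known at run time. The sketch below is plain Java with hypothetical names (`ReflectionSketch`, `Greeter`, `callGreet`). It runs on any JVM, but a closed-world AOT build cannot tell from the code alone that `Greeter` has to be kept, and that is the gap reachability metadata is meant to fill.

```java
import java.lang.reflect.Method;

// Hypothetical example; class and method names are illustrative only.
final class ReflectionSketch {

    // Nothing in the program refers to this class by type.
    static final class Greeter {
        public String greet() {
            return "hello";
        }
    }

    // The class name arrives as data, so static analysis cannot resolve it.
    static Object callGreet(String className) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method greet = type.getMethod("greet");
            return greet.invoke(instance);
        } catch (ReflectiveOperationException e) {
            // In an AOT-compiled image without metadata for the class,
            // this is the kind of failure one would expect to see.
            throw new IllegalStateException("cannot reach " + className, e);
        }
    }

    public static void main(String[] args) {
        // Assumes the default package; pass another binary name to override.
        String name = args.length > 0 ? args[0] : "ReflectionSketch$Greeter";
        System.out.println(callGreet(name));
    }
}
```

Every library that does something like `callGreet` needs an entry in the metadata for each class it may load, which is why coverage across the whole transitive dependency graph matters.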
(As a side note: Mark Reinhold has run the JDK team since 1997: is there any comparable example of such stellar leadership for broadly-adopted software across multiple technical and organizational eras?)
Java has been on a freight train run for the past 5 years. It is exhausting to keep up with. From the demonic release pace, to the dramatic changes to the platform, it's been a rough ride for some.
But at the same time, it's also smooth sailing, and it's getting better all the time. It's still Java, for a bazillion applications it still "Just Works". It's still (IMHO) far more manageable and stable than many other platforms.
I think the combination of Oracle and the entire community around it have been marshaling it really well with little drama. Change, sure. But not Drama.
The Enterprise Edition departure was a big deal, but even that transitioned pretty well. That was no small task, and I think the vendors and framework folks have been handling that pretty well.
GraalVM is just another step forward for the entire community, and it is kind of Oracle (however motivated) to release it. In truth, I think, overall, Java has been mostly (mostly) Oracle free, despite their monster investments into the technology and community. They could have been a much less benevolent dictator.
If it's "long" running, you want the full JIT experience, because with AOT you essentially lose all the "but how does my code actually run" optimisations that the JIT can do, and only have the stuff that can be done ahead of time.
I refuse to praise Oracle for anything. But I could make an exception if they put Isolates in the CE version of Graal and not only the Enterprise one. :)
Also, obviously Oracle didn't do this out of the kindness of their hearts. If they didn't contribute Graal to OpenJDK, they would risk Graal never becoming anything other than a niche VM, because most people, especially big clients, don't want to change VMs. And the CE version will be their new gateway drug to the Enterprise version of Graal.
Linux, Chromium and V8, Java, .NET, and Go, and most other large open source projects these days are developed by for-profit corporations, none of them do it out of kindness. Frankly, if corporations were to bestow gifts worth hundreds of millions of dollars on society, I'd rather those gifts not be free software that is largely enjoyed by other for-profit corporations.
This is what it is. GraalVM is implemented in Java, there is a Truffle framework that compiles Java, Ruby, JS, Python to Java Truffle, which then runs on GraalVM JIT compiler.
> Oracle plans to contribute the most applicable portions of the GraalVM just-in-time (JIT) compiler and Native Image. Oracle does not currently intend to contribute the polyglot technologies supporting other languages such as Python, Ruby, R, and JavaScript. Additional details will follow in the coming months as we move forward through this process.
It would appear this is about making native image artifacts / distribution a first class citizen across all of Java - making it an alternative to uberjars + jvm for running/distribution. I.e. native desktop apps and native binaries for servers?
Does that mean to run it you need another JVM to run it on top of? That sounds stupid... Maybe you need another VM just to run it on once, so it can translate itself to native code on the target?
> Does that mean to run it you need another JVM to run it on top of?
GraalVM is the regular HotspotVM integrated with the GraalVM compiler (which is normally AOT compiled as a native library, but can be run as a JAR); it also supports tooling with the ability to compile code targeting the JVM to a native executable, as well as leveraging AOT compilation itself.
Lots of languages and runtimes are self-hosting today. There are several Java VMs out there. This is but one of them. It's been demonstrated that it's not that hard to create an operational Java VM from scratch. A straightforward interpreter can do the job. The VM doesn't change the source language.
A component of the GraalVM project is called native-image. The native-image tool compiles a Java app to machine code ahead of time and combines it with a small JVM called SubstrateVM. That JVM is written in Java, just like the rest of your app.
As pointed out by other comments, this concept of a "meta-circular VM" isn't new. It's been done before by two other projects, Maxine and Jikes. What's different about SubstrateVM is that this is a production tool rather than a research project, and it's not just a JVM, it's also a way to pre-initialize the app. Therefore programs compiled with native-image can start as fast as programs written in C. Actually, slightly faster in some cases. You may wonder how that's possible given that Java apps normally start slowly, but it's because there's no JIT compilation and the state of the heap is snapshotted, with classes pre-initialized including the JVM itself. So the program can literally just start executing at main() in machine code with no VM startup overhead, because it's done already.
The downside is that snapshotting and AOT consume a lot of disk space.
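A rough sketch of what pre-initialization buys you, in plain Java with hypothetical names (`StartupSketch`, `SQUARES`). On a normal JVM the static initializer below runs at every startup. As I understand it, native-image can instead initialize a class during the build (for example via its `--initialize-at-build-time` option), so the already-filled array is part of the snapshotted heap and `main` begins with the work done.

```java
// Hypothetical example of work that can move from startup to build time.
final class StartupSketch {

    // Filled once by the static initializer below.
    static final long[] SQUARES = new long[1_000];

    static {
        // On HotSpot this loop runs every time the program starts.
        // In a native image whose build initializes this class, the
        // finished array is stored in the image heap instead.
        for (int i = 0; i < SQUARES.length; i++) {
            SQUARES[i] = (long) i * i;
        }
    }

    static long square(int n) {
        return SQUARES[n];
    }

    public static void main(String[] args) {
        System.out.println(square(12)); // prints 144
    }
}
```

The same trade-off mentioned above shows up here: the precomputed table has to be stored in the executable, so startup time is bought with disk space.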
There are some questions below asking how this works. It sounds initially "impossible", like a lot of stuff GraalVM/Truffle does, but it's quite easy to understand really.
You start with a bytecode compiler written in Java. This is a normal program written in the normal way, because a compiler is ultimately just a function that converts one stream of bytes to another. Then you write the runtime and GC in Java too, and compile that as well. This code is a bit special. It's still syntactically Java, but, some classes and methods are given special meanings and some extra rules apply. They aren't compiled in the same way as normal Java code. For example you can write code like this:
UnsignedWord value = Pointer.readUnsignedWord(address);
This doesn't allocate an object or call a static method. Instead it will be compiled down to a single mov instruction. Likewise for writing to memory - there are magic methods that are taken to mean "emit this assembly" instead of doing normal method calls.
Several other tricks are required. GC code can't allocate because it would mess up the heap it's working with, so you can use annotations to mark methods as "never access the heap". But then, GC code is written in Java and Java must allocate for almost anything non trivial, so how does that work? The answer is, the GC code is initialized at build time and all the objects it needs are snapshotted into the default heap that's mapped into memory at startup.
There are lots of other tricks, mostly annotations that control the compiler so that e.g. methods are guaranteed to be inlined and removed, objects are guaranteed to be stack allocated. This isn't available to normal Java but when you control the compiler it's not a problem. The advantage of this Java-superset (or subset) is that you can use all the normal tools that understand source code, like IntelliJ, JavaDoc etc.
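The "GC code can't allocate" rule can be illustrated in ordinary Java, without any of the SubstrateVM annotations. This is only a sketch under that assumption: `MarkStackSketch` is a hypothetical name, the "heap" is a toy graph of integer indices, and all storage is created before the marking loop starts so the loop itself never calls `new`.

```java
import java.util.Arrays;

// Plain-Java sketch of "allocate everything up front, never in the hot path".
// Names are hypothetical; real SubstrateVM code enforces this with annotations.
final class MarkStackSketch {

    // All storage is created once, before any collection work starts.
    private final int[] slots;
    private int top;

    MarkStackSketch(int capacity) {
        this.slots = new int[capacity];
    }

    // Returns false instead of growing, because growing would allocate.
    boolean push(int objectIndex) {
        if (top == slots.length) {
            return false;
        }
        slots[top++] = objectIndex;
        return true;
    }

    boolean isEmpty() {
        return top == 0;
    }

    int pop() {
        return slots[--top];
    }

    // Marks everything reachable from root in a graph given as adjacency
    // arrays. The loop creates no objects; it only uses the preallocated stack.
    static void mark(int[][] edges, int root, boolean[] marked, MarkStackSketch stack) {
        marked[root] = true;
        stack.push(root);
        while (!stack.isEmpty()) {
            int current = stack.pop();
            for (int next : edges[current]) {
                if (!marked[next] && stack.push(next)) {
                    marked[next] = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[][] edges = { {1, 2}, {2}, {}, {0} }; // object 3 is unreachable
        boolean[] marked = new boolean[edges.length];
        mark(edges, 0, marked, new MarkStackSketch(edges.length));
        System.out.println(Arrays.toString(marked)); // [true, true, true, false]
    }
}
```

In the real thing the preallocated objects come from the build-time heap snapshot instead of a constructor call at startup, but the discipline inside the loop is the same.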
Hi! I was under the impression that Graal reused OpenJDK’s GC implementations - or was it only in the Graal as JIT compiler mode?
Also, may I ask how you know so much about the topic? I would really like to work on OpenJDK/Graal one day, but I just don’t see the road ahead of me. I’ve just started my master's in CS, but I don’t feel it closing the gap at all. Surely I can read up more and more on the topic in small steps, but I would be very grateful for any guidance/pointers.
When used on HotSpot it does. Native images/SubstrateVM have their own GC written. Native Image EE can also use the G1 GC so then your native image is a mix of C++ and Java. The Graal compiler can be used in both modes.
How did I learn about it - mostly by reading their papers, watching their videos and asking lots of inane questions on their Slack. Also, I happen to live around the corner from where the Graal team work so occasionally I've been able to meet them in person and ask questions then. But mostly I just followed their efforts for a long time. I got interested in Graal back before most people had heard about it, after somehow randomly encountering a discussion of TruffleRuby on Chris Seaton's blog. Then I wrote about it here:
Most of the Graal guys came out of masters and PhD programs at JKU Linz, so the path you're on is a well trodden one. I wouldn't feel down about it. For me, how it worked was quite mysterious for a long time and then one day it clicked, and I saw the essential simplicity behind the concept.
It's not uncommon to do a CS master's thesis as part of an internship, if your university allows that. We have plenty of topics to choose from and you can also come up with your own.
How hard is it to get accepted? I have written a toy JVM without a GC, only interpreter, but I don’t feel qualified, and I am afraid of blowing my chance (is there some limit on how often can one apply?). I’m working on adding a templated interpreter+GC version next, afterwards I might have a bit more confidence.
I'd ask people that know you and have experience in the field for a realistic opinion (maybe a professor/assistant/phd?). It does definitely not hurt to code all kinds of systems stuff. Try to tinker with some low-level optimization problems. You should be an excellent coder too. In our case it does help to know the JVM ecosystem, including HotSpot details.
The Java runtime contains ~8MLOC; of them, about 1.5M are C++ (and Assembly generated by a C++ DSL), and more and more of the runtime is being written in Java (e.g. the virtual thread scheduler is written in Java). The main pieces that aren't are the bytecode interpreter, the JITs, and the GCs (although the Graal JIT is written in Java). Interestingly, the more latency-critical stuff is written in Java (the JITs and GCs work mostly in the background nowadays, and the interpreter is used only at startup and on deoptimisation slow-paths).
Some of the reason is historical, and some has to do with warmup. Project Leyden and Graal's Native Image will help compile more Java AOT, allowing even more of the runtime to gradually be written in Java. It will take some time as it's not a top priority: it won't immediately deliver user-facing functionality, and most of the work on the JDK is already done in Java code anyway.
Parts of it are written in Java. Over time, more of the JVM has been rewritten in Java. For example recently parts of the reflection subsystem were rewritten in Java. Of course many libraries have been ported over time too.
By the way, the .NET CLR is AFAIK written in C++, like HotSpot. It's not fully self-hosting.
It's a tricky process because HotSpot is highly performance sensitive code. People won't accept regressions just to convenience the JDK maintainers. Java meanwhile is deliberately a simple language to make it accessible for people, so you lose some low level techniques that are useful for performance. Nonetheless, GraalVM native image has proven that you can achieve HotSpot like performance with a JVM written in Java. However it requires a big change in the compilation model that isn't always appropriate.
In each CLR release there are little pieces that transition from C++ to C#, as C# is currently getting quite a few low-level features for systems programming.
Maybe it's my ignorance of the Java ecosystem or a lack of imagination but how could a language that requires a heavy runtime be written in itself? Wouldn't you need a runtime for the runtime, and then a runtime for the runtime for the runtime and then...?
I can't quite imagine how you'd bootstrap something like that.
It's no different to writing any other compiler for Language X in Language X, and that's a really common thing to do. The Java compiler emits files containing Java Bytecode which are subsequently run by a different invocation of the JVM.
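To make that concrete, here is a minimal sketch (not from the thread) that invokes javac as an ordinary Java library through the standard `javax.tools` API. The class name `SelfHostedJavacDemo`, the `compileHello` helper and the tiny `Hello` source are made up for illustration, and it assumes you run on a full JDK, since `ToolProvider.getSystemJavaCompiler()` returns null when no compiler is available.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: javac is itself Java code that we can call from Java.
final class SelfHostedJavacDemo {

    /** Compiles a tiny class into a temp directory and returns the path of the emitted .class file. */
    static Path compileHello() {
        try {
            Path dir = Files.createTempDirectory("javac-demo");
            Path source = dir.resolve("Hello.java");
            Files.writeString(source,
                    "public class Hello { public static int answer() { return 42; } }");

            // The compiler runs inside the current JVM, like any other library.
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("no system Java compiler (is this a full JDK?)");
            }
            int status = javac.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed with status " + status);
            }
            // The output is just a bytecode file; any later JVM invocation can load it.
            return dir.resolve("Hello.class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytecode written to " + compileHello());
    }
}
```

Note that this only shows the compiler half of the story: the bytecode it emits still needs some JVM to run it.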
It's different in that the JVM is not a compiler. It's a bit like suggesting writing CPython in Python. It's possible only if the interpreter's Python source can be compiled ahead of time, like PyPy.
Oh, I know, I'm just pointing out that a self-hosted compiler is a different ballgame than a runtime being written in the language that runs on it. Without some degree of AOT compilation (usually of a subset or dialect of the language compiled in a different fashion), it's not really possible. The GP seemed to miss that the discussion was about the JVM, not javac.
With Graal, it's possible to write a JVM in Java, but the JVM doesn't depend on another JVM to run, and the way it runs bytecode isn't the way it was compiled in the first place. It's not really self-hosted in the same way that a compiler can be.
You can write a C compiler in Java. The output of the C compiler doesn't depend on the JVM, it would be a regular old program written in C. Writing a Java runtime would be similar, the output would be an executable that doesn't depend on the JVM.
A runtime is not a compiler. The process of bootstrapping a compiler is understood, but if the runtime itself requires a runtime at runtime, that is much more of a turtles-all-the-way-down problem than with a runtime-less language.
It's like how PyPy is Python, but with the asterisk that it's bootstrapped with RPython, which is an almost-subset of Python, so that it doesn't require a runtime.
You could define a statically compilable Java subset (like that which gcj used to accept) and build a runtime in that, which would mean omitting features such as reflection, but a lot of de facto standard Java tooling like Spring Framework would not be compatible.
> You could define a statically compilable Java subset (like that which gcj used to accept) and build a runtime in that, which would mean omitting features such as reflection, but a lot of de facto standard Java tooling like Spring Framework would not be compatible.
But the JVM you build using your statically compilable Java subset can then run the Spring Framework or whatever.
Sure, but that subset is only "kind of Java" just like RPython or CPython code is only "kind of Python". It's like calling a C compiler a C++ or Objective-C compiler because C is a subset of those languages.
How restrictive do you think the subset is? It only doesn't support some features you probably never wanted to use anyway, and arbitrary reflection. I maintain 125k lines of Java that conforms to the subset rules, and to be honest I never even think twice about the fact that it's a subset.
Isn't arbitrary reflection how Spring, Hibernate, AspectJ, every JSON library and other common libraries that you may be using directly or indirectly work?
No, they use predictable reflection, not arbitrary reflection. The subset just needs to be told ahead of time which classes you want to be able to interact with reflectively. As I say, it's not an issue for my quite-large application.
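A rough illustration of the distinction, in plain Java. This is only a model of the closed-world idea, not how Native Image is implemented: the `REGISTERED` allow-list and the `ReflectionKinds` class are hypothetical stand-ins for whatever build-time configuration tells the AOT compiler which classes may be reached reflectively.

```java
import java.util.Set;

// Hypothetical model: reflection works only for classes declared ahead of time.
final class ReflectionKinds {

    // Stand-in for the ahead-of-time reflection configuration (an assumption for this sketch).
    static final Set<String> REGISTERED = Set.of(
            "java.lang.StringBuilder",
            "java.util.ArrayList");

    /** Instantiates a class by name, but only if it was registered up front. */
    static Object newRegistered(String className) {
        if (!REGISTERED.contains(className)) {
            // "Arbitrary" reflection: the build never heard of this class, so it is refused.
            throw new IllegalArgumentException(className + " was not registered for reflection");
        }
        try {
            // "Predictable" reflection: the set of possible targets is known before the program runs.
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newRegistered("java.util.ArrayList").getClass());
    }
}
```

Frameworks that scan a known set of application classes fit the first case; code that builds a class name out of arbitrary runtime input is the second.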
Exactly, implementing the runtime is the bootstrapping (apart from the compilation of course). There is no implementing the runtime that implements the runtime that implements the runtime.
Mostly. Since .NET Core was introduced, the C++ surface area has been reduced with each release; there are occasionally references to it on the MSDN blogs when a new release comes out.
Digging through that code might be a bit challenging. Do you happen to have a link to a paper/documentation on this? Or maybe some rough explanation how it works (that goes beyond the short summary you provided)
Java doesn't provide a primitive to deallocate memory. So while I can see how, for instance, a huge chunk / big array could be allocated and objects represented inside it, don't you end up with a situation where your process will always occupy a fixed amount of memory? It might not need to be fixed. You might also be able to extend it, but how would you free that again?
Not being facetious - but it works exactly like any other GC. There's nothing magic about writing code in Java instead of C that makes a huge difference.
But you might find this interesting as a specific example - this is where it actually obtains memory from the OS.
Note the @Uninterruptible annotation - that's saying that this code is safe to use within the GC itself. Notice how the file doesn't contain even a single 'new'! (Outside of PosixVirtualMemoryProviderFeature, which is something else.)
I think that's the piece that got me and grand-GP maybe confused.
You are writing a GC in Java, yes but have access to low-level memory abstractions/interfaces, right?
Whereas I was initially wondering how to write a GC in "pure Java" that doesn't have access to low-level memory interfaces.
Does it make sense why I was asking, now? Or am I still not getting it?
To be clear, Java the language is Java, I'm not going to argue that it's not Java because of special primitives / interfaces available to write the GC here, but it is not what most people would think of when considering the limitations of the runtime everyone's using.
I think with regular old java this would be impossible to do in a reasonable way. But with AOT compiled Java and extensions like @Uninterruptible you can do it.
I think I'd phrase it like this:
you can write a GC mostly in Java.
GCs in Native Image other than the default Java one, like G1, actually embed the C++ version of the implementation instead of being written in Java.
> But with AOT compiled Java and extensions like @Uninterruptible you can do it.
Yeah, but also the code shown above used native low level primitives like mmap which typically aren't available.
AOT, special behavior directives (@Uninterruptible), native memory access, making sure not to use new (?), at that point you are formally using Java, the language, but it's sort of its own thing.
So, that's still cool and there's likely no way around it, but I wonder to what degree it's actually beneficial: your program is much closer to a C++ program than a Java one, except for syntax and the additional glue abstractions that typically exist in neither. In an (exaggerated) sense it's like it's written in a C++ DSL embedded in Java.
If you know how to write C++ it's possibly simpler to just write it in C++, as you know how memory management works there. If you know Java you need to get familiar with the extensions used here.
This is a bit different from the idea of self-hosting a compiler, for instance, where you can start writing your compiler idiomatically in your own language instead of something different. For instance, if your language & runtime has GC it's much simpler / safer to write a program in it, and so a compiler in it, than pre-self-hosting (if the earlier compiler was written in C).
So what in particular is afforded by writing the GC in "Special Java"? Is it just about being able to say "it's all in Java", or are there language features in "Special Java" that make life easier than C++? Or other benefits? For instance, I imagine use of the wider ecosystem / libraries isn't possible, whereas in C++ it is (more so at least).
It would be nice if somehow you could write a GC itself in regular old Java without having to worry about aspects of how to do it in a special, restricted way. Say, the code you write and which runs the garbage collection creates objects, which subsequently get cleaned up by your very own GC program.
In the end you still have to work with low level abstractions / memory though and maybe that's still different to the self-hosted compiler example in a sense: There you translate code into a language that is more low level but you don't need to have these abstractions available in your own language / on your stage of computation / while running your compiler.
the gist is you treat memory as a big array, then write a program to manipulate that array. it's really just a decision about what you want to "take as primitive" in your implementation. could be brk, could be malloc and free, or something higher level.
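Here is a toy sketch of that idea in ordinary Java, just to show the shape of it. Everything in it is made up for illustration (`ToyHeap`, fixed two-field cells, a simple mark-and-sweep); a real collector would work on raw memory obtained from the OS rather than an `int[]`, and has to deal with object layouts, threads and much more.

```java
import java.util.Arrays;

// Hypothetical toy: the "heap" is one int[]; a reference is just a cell index into it.
final class ToyHeap {
    static final int NULL = -1;      // "null reference" inside the toy heap
    static final int FIELDS = 2;     // every cell holds two references

    private final int[] memory;      // the big array standing in for raw memory
    private final boolean[] marked;  // one mark bit per cell
    private final int[] markStack;   // preallocated, so marking itself allocates nothing
    private int freeList = NULL;     // head of the linked list of free cells

    ToyHeap(int cells) {
        memory = new int[cells * FIELDS];
        marked = new boolean[cells];
        markStack = new int[cells];
        for (int cell = cells - 1; cell >= 0; cell--) {
            release(cell);
        }
    }

    /** Hands out a free cell, or throws if none is left. */
    int allocate(int left, int right) {
        if (freeList == NULL) {
            throw new IllegalStateException("toy heap exhausted");
        }
        int cell = freeList;
        freeList = memory[cell * FIELDS];   // field 0 of a free cell links to the next free cell
        memory[cell * FIELDS] = left;
        memory[cell * FIELDS + 1] = right;
        return cell;
    }

    /** Mark-and-sweep from the given roots; returns the number of free cells afterwards. */
    int collect(int... roots) {
        Arrays.fill(marked, false);
        int top = 0;
        for (int root : roots) {
            if (root != NULL && !marked[root]) {
                marked[root] = true;        // mark on push, so each cell is pushed at most once
                markStack[top++] = root;
            }
        }
        while (top > 0) {
            int cell = markStack[--top];
            for (int f = 0; f < FIELDS; f++) {
                int child = memory[cell * FIELDS + f];
                if (child != NULL && !marked[child]) {
                    marked[child] = true;
                    markStack[top++] = child;
                }
            }
        }
        // Sweep: every unmarked cell goes (back) onto the free list.
        freeList = NULL;
        int free = 0;
        for (int cell = marked.length - 1; cell >= 0; cell--) {
            if (!marked[cell]) {
                release(cell);
                free++;
            }
        }
        return free;
    }

    private void release(int cell) {
        memory[cell * FIELDS] = freeList;
        memory[cell * FIELDS + 1] = NULL;
        freeList = cell;
    }
}
```

"Freeing" here just means a cell becomes available to `allocate` again; handing whole regions back to the OS is a separate step that this sketch doesn't model.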
Providing a single sample of random software as the ultimate rebuttal to my claim speaks more about your understanding of how performance comparisons of compilers and runtimes are made than about Java itself.
My turn: NASDAQ moved from cpp to Java quite a few years ago. Do you think you know something they don't?
1. SweetHome 3D IS slower than competitive projects written in C++.
2. C++ tools tend to be faster than comparable Java tools because of fast startup time, no GC, and being the default choice for performance-sensitive projects for decades.
3. C++ is definitively faster because of inherent advantages.
You appear to be jumping to the third based on the first, which is an erroneous argument even if you turn out to be correct. In practice, for most things in the same ballpark, language choice isn't necessarily the only or even the most important factor. This is even more true where startup time is inconsequential, where a better GC avoids lengthy pauses, and where development time is a substantial limiting factor: a language that is quicker to work with leaves more time to improve other design choices, yielding results that are as good or better.
Wonderful news! I was always scared that Oracle could slap some $$$ licensing scheme for GraalVM, but this proves me wrong. I'm not following other parts of GraalVM development, except Native Image capability, but I must say that, IMHO, GraalVM/native-image is one of the best things after sliced bread and JVM out there. Native Image shows true power when it has to compile higher-level languages than Java, like Scala, Clojure, or Kotlin, and it compiles them pretty darn well.
Nothing technical is changing. It's just a change in the development process used by Oracle. A lot of this announcement will be opaque if you don't know the history of Sun, but suffice it to say that the core Java development process has a lot of oddness in it that exists to make Java more open and less controlled by Sun-then-Oracle. For example, if you go look at their mailing lists you'll see they "vote" on whether their own new colleagues should be allowed to commit to the source tree. This is basically legacy stuff these days, but if I understand correctly it was written into various legal agreements.
Anyway, OpenJDK is its own project with its own processes and culture, GraalVM was historically from a totally separate part of Oracle and adopted its own processes and development culture. This announcement doesn't change what tech is available, it's more about re-organizing how development is done. It doesn't really affect Java developers much, except that maybe now more stuff will come out of the box with a 'regular' JDK. Currently to get Graal technology you need to use their own custom spin of the JDK.
The part I like is that it basically _forces_ GraalVM on everyone who is currently using a JVM with Oracle lineage. It would be not unlike PyPy and CPython agreeing to ship together, not really but kinda.
Oracle plans to contribute the most applicable portions of the GraalVM just-in-time (JIT) compiler and Native Image. Oracle does not currently intend to contribute the polyglot technologies supporting other languages such as Python, Ruby, R, and JavaScript.
The latter is just a library, and you can already use it on the standard JVM (with the disadvantage of not getting the same performance as when you run it on GraalVM). But if the JIT and other internals are already in the standard JVM, it will run the same way.
Isn't this more about the GraalVM, as an alternate runtime to the JVM, than it is about Java, since the other JVM-target languages can leverage it also?