InNative: Run WebAssembly Outside the Sandbox at 95% Native Speed (innative.dev)
172 points by blackhole on May 27, 2019 | hide | past | favorite | 93 comments


For those curious why it isn't 100%:

The only reason we haven’t already gotten to 99% native speed is because WebAssembly’s 32-bit integer indexes break LLVM’s vectorization due to pointer aliasing. Once fixed-width SIMD instructions are added, native WebAssembly will close the gap entirely, because this vectorization analysis will have happened before the WebAssembly compilation step.
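A toy C illustration of the aliasing problem described above (an illustration of general LLVM behavior, not innative's code): with plain pointers the compiler must assume the two arrays may overlap and emit runtime overlap checks or skip SIMD, while `restrict` lets it vectorize freely. Wasm's 32-bit indexes into a single linear memory make this kind of no-alias proof hard to recover after compilation.

```c
#include <stdint.h>
#include <stddef.h>

/* Possible aliasing: dst and src may overlap, so LLVM cannot
 * freely vectorize this loop without runtime overlap checks. */
void add_arrays(int32_t *dst, const int32_t *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* With restrict, the programmer promises no overlap, and the
 * compiler may vectorize without any runtime check. This is the
 * kind of information lost once everything is an index into one
 * linear memory. */
void add_arrays_noalias(int32_t *restrict dst,
                        const int32_t *restrict src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Comparing the generated assembly of the two functions (e.g. with `clang -O2 -S`) shows the difference directly.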


WebAssembly on desktop and servers is maturing pretty rapidly! There are already several backend interpreters in Rust and Go (life). The native, compiled options are even faster; Fastly's Lucet compiler and runtime was one of the first to implement WASI (https://wasi.dev/): https://www.fastly.com/blog/announcing-lucet-fastly-native-w...

The differentiator for InNative seems to be the ability to bypass the sandbox altogether as well as additional native interop with the OS. Looks promising!


I'm wondering what the point of compiling server-side stuff to wasm is. Unless I'm forgetting something, all the main server languages are rather portable already, either because they are interpreted (python, node, jvm...) or can be compiled to many targets (rust, go...).


You're not confined by the instruction set below. It no longer matters if your language supports PowerPC, Power8, x86, x86_64, ARMv5 through v8 or even more exotics. Similar to the Java VM, which definitely showed the advantage of being more easily portable, it decouples the binary you deploy from the actual hardware.

Rust and Go support many targets, but not as many as other compilers (GCC), a coverage problem that can be fixed by using WASM as an intermediate and porting a WASM runtime to that architecture.

Something like Innative would also enable desktop applications to be independent of the OS and architecture. The same binary would run on x86_64 Windows, PowerPC Mac and ARMv8 Linux.

It's basically Java but you don't have to use Java to get all the good parts.

(Disclaimer: I know the main dev of innative and do some WASM work myself)


> It no longer matters if your language supports PowerPC, Power8, x86, x86_64, ARMv5 through v8 or even more exotics

Yeah, instead it matters which WASM runtimes it supports and which archs those runtimes support.

But that's ok, we just need one more layer of abstraction to fix the whole mess.


For innative it matters what LLVM supports, and LLVM's support already dictates the supported targets of many programming languages, at least as a superset (Rust and Go support a subset of what LLVM supports). I don't see how this is "one more layer of abstraction to fix the whole mess" when it reuses existing abstractions.


WASM is based off of a specification, so compilers target the spec instead of an individual runtime.



Doubtful considering how much work is being done around WASM.


Usually being hyped by people that apparently don't know the history of bytecode formats since the early 60's.


I'm well aware of the history of bytecode formats. Java was pretty successful for a while; it even managed to get into the browser and only failed there due to the lack of a DOM interaction story and similar integration problems.

There are already plenty of companies that have deployed WASM in their stack (like Ebay, which uses Wasm for its barcode scanner); it's not going away any time soon.



The CDAPI barely worked when it was released. It worked, but so poorly that the Java devs would have been better off not bothering at all. The DOM interaction story was so poor that it effectively didn't exist; everyone preferred to use some UI library to render the UI themselves.


You don't have to recompile for different architectures. Just compile once to wasm and it will run on all platforms it supports at the same speed. Note: the author mentions this too.

"WebAssembly’s 32-bit integer indexes break LLVM’s vectorization due to pointer aliasing. Once fixed-width SIMD instructions are added, native WebAssembly will close the gap entirely, because this vectorization analysis will have happened before the WebAssembly compilation step."


UNCOL, ANDF, P-Code, M-Code, JVM, PNaCl, MSIL, TIMI, ...


It is nice in development to have bcrypt, openssl, etc for other languages without a build step and build tools



One of my favorite examples of life imitating art.


"Run a fast, sandboxed bytecode outside of the sandbox by compiling it into a binary" so just a binary. I'm not sure I understand why you would use wasm here. No one writes wasm; you compile to it (usually from llvm ir). Why couldn't you just go straight from llvm ir to a binary; skip the wasm? I suspect I'm missing something here, but it doesn't seem to make sense.


WebAssembly is portable, stable, and well-specified. LLVM IR is not portable (it is platform-specific), not stable (it changes between LLVM versions), and not well-specified (especially around undef). Therefore, WebAssembly is a good distribution format, LLVM IR is not.


We already have a portable, stable, well specified cross platform language. It’s called Java.

What extra benefit does this provide, except avoiding Oracle?


Oh good point, we can just embed the JVM into browsers instead. WebAssembly's cancelled, everyone!

Jokes aside, why are you comparing these two very different virtual machines? WASM is a general purpose VM, the JVM is not. For example, you won’t find a Rust JVM target any time soon. (Not to suggest that the JVM is strictly limited to Java, it obviously isn’t, but it is not nearly as suited to being a target for lower level languages. Also, the security model is very different.)


I think it is experimentally possible to target Rust -> LLVM -> JVM using this project https://github.com/davidar/lljvm


In fact there are almost certainly more ways, too. You could probably transpile WASM to JVM bytecode. These things are most useful when you are already on the JVM anyway; I can’t imagine most people writing software in pure Rust would jump at this, especially given that WASM already has a lot to offer for this use case and has good momentum. I see it as a better fit, with useful security guarantees and a simple design.


The benefits of the JVM still stand out to me personally. Runs anywhere, generates machine code at run time most places, world class optimizing compiler, battle tested & ready to deploy, integration with existing code bases for gradual rewrites, etc.

Wasm is a good idea but it's going to need to reimplement a lot of existing code (optimizing + jit).

Doable, and maybe it will convince people to adopt what has been a great idea since Java first widely deployed it: a compiled language in an abstract machine.


The JVM had many chances to realize that goal. Java applets, J2ME, etc. I’m not sure which particular issue really kept it from keeping mindshare. I don’t think the virtual machine itself was ever really the problem.

Still, since the Java platform didn’t capture this use case of a general purpose abstract machine, it makes a whole lot of sense to develop something like WASM. It’s a much more neutral platform to build on.

In particular, we actually don’t need to go through all of the things Java went through; we have a wealth of knowledge about what works and what doesn’t work so well. Yes, it’s a new JIT, but not that new: from my understanding, the JavaScript JIT machinery is typically reused for the WASM JIT in browsers.


> I’m not sure which particular issue really kept it from keeping mindshare.

The fact that it was bundled as a browser plugin, mostly.


java browser plugins have always been poorly designed and implemented


in v8 at least, we just plugged wasm into v8's existing jit. no need to make a new one from scratch.


>Wasm is a good idea but it's going to need to reimplement a lot of existing code (optimizing + jit).

Where the hell did you get that idea? WASM can be used as just another frontend for javascript JIT engines like V8 or SpiderMonkey.


If JavaScript JITs are anything like Java JITs, then most of the performance-improving optimizations may be based on how Java/JavaScript operate, recognizing specific patterns that can be replaced with simpler instructions. The JavaScript JITs help with machine code generation but not necessarily with other, higher-level optimizations.


We already tried JVM in browsers, remember java applets? I sure remember the security and compatibility problems :)

WASM has been designed from the ground up with portability, security, and stability in mind. It is also a lower-level target than JVM bytecode, which makes it more suitable for representing languages like rust and go. It has also been designed to take advantage of the sandboxed JIT engines that browsers already have for running javascript. Additionally, WASM is an open standard that anyone can contribute to, which is something that is greatly valued on the web.


> We already tried JVM in browsers, remember java applets? I sure remember the security and compatibility problems :)

Actually, we didn't really try the JVM in the browser. We tried it as a plug-in, like Flash. The JVM didn't have access to the DOM like Javascript has and WASM will have.


Actually it did.


Which browser gave the JVM the same access as Javascript? Because it sure wasn't Netscape.


All of them, given the right access permissions.

Better read the documentation?

https://docs.oracle.com/javase/tutorial/deployment/applet/ma...


> given the right access permissions.

So, none of them out of the box like Javascript. Given that it still didn't really work as late as 2005[1], which is 10 years after Javascript was introduced, I stand by my original statement.

1) https://www.eclipsezone.com/eclipse/forums/t16762.html


Which was a good thing; no one likes cryptominers on their pages, other than hackers, that is.

With WASM it will get even better: Flash that one cannot disable.


remember them? Some of us still have the unfortunate pleasure of using them. (Worse: the IT desktop folks have to support a very specific outdated version of IE to keep it functional. I remember hearing that's how a new vendor's "solution" would be delivered, and the facepalm I did at the time. The look on desktop support's face was a bit more pale and filled with dread.)


That security part is a bit meh.

WASM code generated from languages like C is still open to internal memory corruption caused by out of bounds accesses.

If they were fully serious about security, memory tagging would be supported.


Besides the point about having already tried Java applets, why would you dismiss the Oracle avoidance? Oracle has recently shown a strong preference for monetising Java, whereas lots of people on the web cannot afford any licencing just to be able to make stuff run on the client machine. It is actually a very strong feature.


It's a really funny monetisation move: releasing everything as open source and putting it all in OpenJDK (which has been the language reference implementation for a while anyway).

The only bit being monetised is the Oracle compiled and distributed version of the JVM. If you don't want to pay Oracle, just use OpenJDK, which has all of the same hotspot JIT stuff.


Oracle could have easily done as Google has done with Android, and make sure that the client end wouldn't run without some kind of proprietary extension[a]. They are the world leaders in deploying the Ask toolbar, and with that comes the end users. It is only because we aren't using Java that we don't see this happening.

a] Yes you can make android apps run on AOSP, but as many comment with regards to Huawei losing their Android licence, it will remove access to a lot of API infrastructure that isn't even Google specific.

edit: * caused formatting rather than being a note


I never had Ask Toolbar on my computer, why would you ask?

1 - I always read what I get proposed to install and disable what I don't care about.

2 - The JDK didn't have such a "feature", only the consumer JRE did


Yeah I know, it was in jest, as a reminder of Oracle's practices. Also, most consumers do use the JRE, and the size of that audience is what Oracle would base decisions on.


Which Java opcode is for accessing linear memory again? I keep missing it somehow in these discussions...


They are aaload and aastore.


They access arrays. CIL can access native memory, but then it's not sandboxed anymore.


Wasm bytecode is lower-level than JVM bytecode, so I believe it's a much easier compilation target for a variety of source code languages.


Not really. NestedVM http://nestedvm.ibex.org/ had no problem targeting GCC to JVM.


Because LLVM IR is CPU-architecture specific. For instance, IR for x86 cannot be used on ARM CPUs, which is also why Apple's bitcode representation, intended to make apps portable, doesn't cross the iOS (ARM) <--> macOS (x64) boundary, unless ARM ISA emulation is happening on the Mac (like in Marzipan?).


It is not so much that the IR itself is target-specific, apart from platform intrinsics; it is that almost any optimization pass will encode architectural details, like packing and alignment.


Marzipan is not an ARM emulation layer.


That way the same binary can run on machines with different architectures. There is also work in progress to define a common system API for wasm, which would allow the same binary to run on any platform.


For any other reader, the common system API that chr1 speaks of is called WASI. Mozilla's announcement of it is here:

https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webas...


If your native code needs to run on different platforms, you can compile it once to wasm and it will run at the same speed on all of them.


Because WASM is already bad for user security, they want to make it worse.


> WASM is already bad for user security

Any source?


Here's one: https://www.fastly.com/blog/hijacking-control-flow-webassemb...

I'm not sure any of this is any worse than javascript though.


Internal memory corruption caused by out of bounds accesses, in case the code was generated from C derived languages, as it doesn't provide memory tagging.


You are running a binary and you have to trust the sandbox. Hasn't worked so well for web browsers even with JavaScript.


Also didn't work well for Flash.


"We could break the stranglehold of i386 on the software industry and free developers to experiment with novel CPU architectures without having to worry about whether our favorite language compiles to it."

WASM is equivalent to early 80s ISAs but with different opcodes. Native WASM would be most efficient.


I used to resort to binary translation for experimental CPU projects, but WASM looks like a much better starting point. The controversial lack of arbitrary control flow is a massive advantage for me.

Writing a complete compiler backend/asm/ld is a significant undertaking that is only worthwhile for very stable architectures.


Java Bytecode is similar to WASM, opcode wise. Bytecode purposely omits many instructions that can be done with equivalents to fit the whole ISA in single byte instructions. WASM is similar for the same reason, code density.

The main differences between Bytecode and WASM are the memory model and branching support. But you could say that both are equivalent to 80's ISAs.

Precompiled Java is closer to WASM than many would like to admit, so I expect performance to be similar in the long run. This isn't so bad; Java reaches over half native speed on many benchmarks. It's amazing that we'll be able to run untrusted code so quickly.


Literally the whole point of WASM is fast, sandboxed code. It's not a language, it's an LLVM target. So why turn LLVM IR into a native binary? It's basically the same thing as "native wasm", whatever that would mean. "Native wasm" is just JITed bytecode. Either that, or you're turning it into a normal binary, in which case, why even use WASM?


You get a bit of safety because Wasm programs have statically validated function calls and operate only within linear memory. If there are no bugs in the stdlib or the bytecode compiler, then the host is still sandboxed from the module even without virtual memory.


No, it’s not possible to statically validate that addresses are within linear memory. WebAssembly implementations do runtime memory access checks (either with explicit if checks before some loads and stores or with virtual memory configuration).


> it’s not possible to statically validate that addresses are within linear memory

It might just be me, but I don't think this is what the parent comment said.


“Statically validated” AFAIU means there are no runtime checks.

E.g. WASM stack local variables and globals are statically validated. The compiler can translate loads and stores of locals and globals into simple movs; there are no additional runtime checks and no overhead, unlike linear memory.
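The linear-memory case can be sketched in C (a sketch of the general technique, not any particular runtime's code): when the virtual-memory trick isn't available, a wasm compiler emits an explicit bounds check before each load or store, which is exactly the overhead locals and globals avoid.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime state: base and current size of the module's
 * single linear memory. */
static uint8_t *linear_mem;
static uint32_t mem_size;

/* The software bounds check a wasm compiler may emit before an i32 load.
 * The widening to 64 bits avoids overflow in addr + 4. */
static int32_t load_i32(uint32_t addr) {
    if ((uint64_t)addr + sizeof(int32_t) > mem_size)
        abort();  /* trap: out-of-bounds linear memory access */
    int32_t v;
    memcpy(&v, linear_mem + addr, sizeof v);
    return v;
}
```

A local or global, by contrast, compiles to a plain register or stack-slot access with no such check.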


As far as I could parse that comment, it's saying that function calls are statically validated and that memory is linear. Not that all memory accesses can be statically validated.


Given WASM's 32-bit address space, you can effectively validate a program's memory accesses statically, with a one-time initialization in virtual memory - for instance, if you'd like to restrict all WASM memory to 64MB, you can allocate/map it at the top of a 4GB virtual address space, effectively giving it a start address of 4GB-64MB.

Since OS/process virtual memory bounds checking is handled by the hardware, the one-time setup above will lock WASM memory accesses down to within the 64MB, without software runtime overhead.

This is exactly why WASM memory was picked to be linear (unlike virtual memory, which can have holes in continuity).
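A POSIX-only sketch of one variant of this trick (here committing 64 MiB at the base of the reservation rather than the top; real runtimes differ in the details): reserve a full 4 GiB so every possible 32-bit wasm address falls inside the mapping, then grant access to only the pages the module may use. Any access outside the committed region faults in hardware, with no per-access software check.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

#define WASM_ADDR_SPACE (1ULL << 32)   /* 4 GiB: everything a u32 can index */
#define WASM_MEM_COMMIT (64ULL << 20)  /* 64 MiB the module may actually use */

static uint8_t *reserve_linear_memory(void) {
    /* Reserve 4 GiB of address space with no access rights; this
     * consumes address space, not physical memory. */
    void *base = mmap(NULL, WASM_ADDR_SPACE, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    /* Commit only the first 64 MiB; anything beyond it stays PROT_NONE
     * and traps via the MMU on access. */
    if (mprotect(base, WASM_MEM_COMMIT, PROT_READ | PROT_WRITE) != 0)
        return NULL;
    return (uint8_t *)base;
}
```

Growing the wasm memory later is just another `mprotect` over more of the reservation.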


Fair point, the author indeed said that function calls are statically validated, but did not say that linear memory access is.

The phrase “host is still sandboxed from the module even without virtual memory” confused me, because technically any interpreter (even qemu) can run without virtual memory, with more or less expensive runtime checks.



x86 is translated to micro-ops in the processor. Why use x86 if you don't need a half-carry flag?


" It's not a language"

It sure looks like a language to me, like some sort of assembly language.

https://webassembly.org/getting-started/advanced-tools/

https://developer.mozilla.org/en-US/docs/WebAssembly/Text_fo...

granted, this is a textual representation (very useful in certain circumstances), but that's semantics. Having done plenty of assembly, I don't see a huge distinction here.


LLVM IR is not architecture-independent. It encodes architecture and ABI assumptions, and is not portable.

Also, this is not "JITed bytecode", it is compiled AOT.


The PNaCl and Apple variants are architecture-independent.


Well. They pick an architecture and stick with it, more like. But one is dead in favour of WebAssembly and the other is an internal format.


It doesn't change the fact that they exist.

PNaCl died in favour of WebAssembly due to politics.

With Chrome's might today, the decision would most likely be different.


No, it wouldn't be, because most bytecodes generally operate at a higher level. A single Java call instruction can in theory turn into an unbounded number of machine instructions through inlining. Therefore, limiting the machine to executing only whole Java instructions would make it extremely slow, because it couldn't use any such optimizations.


You can skip the shift key and write wasm (or Wasm); that is the correct typography. The asm in wasm isn't an acronym for anything.


Asm for Assembly, so maybe WAsm is correct?


Isn’t it also using 32-bit pointers on 64-bit machines? That should also improve performance a bit.

It’s a shame that the x32 ABI is almost abandoned nowadays; it has some modest improvements for applications that don’t need that much memory.


why u bully me? ily and ur game diep.io


WASM sounds great and all with its sandbox, like NaCl. But I have to imagine over 90% of client and server computers are x86 and ARM, and I don’t see those targets losing any share in the near or long term. I also don’t see wasm being used in microcontrollers.


Isn't the sandbox a useful feature of WebAssembly? It gives you much better security guarantees than running untrusted native code on your system.


It looks like they provide different levels of sandboxing, so you can tweak things based on your specific requirements.


Alternatively: run native code at least 5% slower than before.


Before or after turning off Hyper Threading and speculative execution on your Intel microprocessor?

(The point being that 5% slowdown is a drop in the bucket compared to what we've already lost due to Intel's chip design problems. AIUI, HT is a 15-20% slowdown, and SE was another 20% slowdown.)


But now you can in theory run your programs on <strange underused architecture> that doesn't suffer from those vulnerabilities thanks to the power of WASM.


> on <strange underused architecture>

Did the Mill CPU finally get first silicon? ;)


Like the article said, in practice the WebAssembly version can actually be faster, because it can use all the optimizations for the target machine.



