Nuitka: a Python compiler written in Python (github.com/nuitka)
502 points by stunt on June 17, 2021 | hide | past | favorite | 181 comments


I've not tried Nuitka yet, but I've done similar things with 2 other tools to package up a Qt-based GUI tool that has a Windows installer, Mac ".dmg" or a Linux package:

* FBS (FMan's Build System) - worked well for a while, but we couldn't upgrade Qt past a certain version, and the version we used was only happy with Python 3.6.

* Beeware - our current solution. Very, very happy with this; works smoothly, lets us use the latest Python and Qt versions and lets us make professional-looking installers that we can distribute.


fbs author here. The Pro version of fbs [1] now supports Python > 3.6 and recent Qt versions.

1: https://build-system.fman.io


This is what I love about HN: you can discuss almost anything and the project's creator pops up in the thread.


Hey - first of all, looks great and I'll definitely look into it. Good job!

Question - do you have any comparisons on the size of the resulting distributed executable compared to Nuitka, pyinstaller?

This is the issue I've faced. Nontrivial programs (that, say, import pandas, numpy, selenium) are huge, implying a ~250MB+ download for the user - and then maybe up to 1GB on disk when it's all unpacked.


The OpenBLAS version of NumPy (from pip rather than conda), plus PyQt5, can fit in around 100 megabytes uncompressed on disk for a Win64 binary. However, I no longer recommend writing number-crunching apps in Python, since you don't have thread-based parallelism due to the GIL and have to awkwardly shuffle data across processes. (Last time I tried multiprocessing, some data could only be sent to workers on Linux using fork-based worker spawning, and not on Windows, which lacks fork and requires pickling.)
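
The pickling constraint mentioned above is easy to demonstrate with the stdlib alone: under the "spawn" start method (the only one on Windows), everything sent to a worker must survive `pickle`, and lambdas and closures don't. A minimal sketch:

```python
import pickle

def square(x):
    # Module-level functions pickle by reference (their importable name),
    # so spawn-based multiprocessing workers can usually receive them.
    return x * x

# A lambda has no importable name, so it cannot be pickled. This is why
# closures that work with fork-based workers on Linux break with spawn
# on Windows: spawn must pickle everything it sends to a worker.
try:
    pickle.dumps(lambda x: x * x)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False

print(lambda_picklable)  # False
```

The same rule applies to arguments and return values, not just the worker function itself.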


If the number crunching is numpy-heavy, multiple threads absolutely can use real thread-based parallelism. C extensions like numpy release the GIL and so can run in parallel.


Thanks! Regarding your question: fbs uses PyInstaller under the hood, so the binaries should be pretty similar in size.


BTW the error tracking described in the Manual sounds great. Are you affiliated to Sentry in any way?


No affiliation whatsoever. If you want to use Sentry for almost free, you can use my (old) scripts for deploying it to a $5 / month VPS: https://github.com/mherrmann/sentry-self-hosted


Ah, great news. Thanks for making fbs; it is incredibly liberating to find a tool that lets us write Python GUI apps and have them appear as if they were made with Xcode or Visual Studio.


Happy if you find it useful!


I'm using fbs in an open source (GPLv2+) project; is there a way to upgrade, or am I stuck with old Python?


Nice option for building Python+Qt applications!

Buying the Pro version still leaves one to deal with Qt licensing does it not?


Yes. But usually you can use it under the LGPL and thus for free.


Tried to use this recently. I don’t understand the decision to limit the free version to versions < 3.6. It would be beneficial for you and your users if you at least had a free trial for the versions above 3.6.


I track my time and have spent 570 hours since 2018 working on fbs. The vast majority of this (526h) was for the free version you are complaining about, the nice documentation etc. You are criticizing me for not giving you everything I do for free?


Just wanted to drop in and appreciate your work.

It's documented very well, extremely easy to use and the pro version is priced very well.


sorry, i don't mean it as criticism and I am not asking you to give it away for free.

I was suggesting that, for you to make more $$$ from it, it's better to provide a free trial period for said versions.


Beeware sounds really interesting. I've often wondered if there are any things like it out there that can help me build complex GUIs (e.g. user interfaces with interactive 3D stuff) and an interface that doesn't look like it's from the '90s. I'll look into Beeware more in depth; hopefully it can handle stuff like that.


Thanks, I'm going to look at Beeware. Struggled a while back trying to package a Qt app with Nuitka.


Did you also evaluate Kivy?


Big fan of Nuitka, and specifically the author Kay.

I sent him an email years back, just thanking him for his work and asking how to donate. I was surprised when he wrote a very thoughtful and kind reply, acting rather shocked why someone would give him money. He's a real gem of a person, and I'm glad his work is getting (rightly) recognized.


Nuitka is the best Python compiler I've used. I have tried at least three others and none worked as well as Nuitka. And I'm talking about complicated dependencies such as pytorch, selenium, tesseract - that type of stuff.


How do the resulting executables perform, compared to using CPython? Did you happen to run any benchmarks?


Not the OP, but it claims to be ~2x faster than CPython. I haven’t done extensive benchmarking, but for my small projects that seems about right.


Based on my own tests, once you have a 'normal' program that does a bit of IO, calls out to some C-based libraries, etc., I've rarely seen more than a 10-20% performance increase across the whole program run.

For individual functions you can see 2-4 times speed up over pure python.

Most importantly, I've never seen it result in slower than cpython performance.
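
Figures like these are easy to reproduce for your own code: time the same script under plain CPython and under the Nuitka-built binary. A minimal harness (the workload function and the idea of saving it as a script are illustrative stand-ins):

```python
import timeit

def workload():
    # Pure-Python arithmetic: the kind of tight loop where the 2-4x
    # per-function speedups mentioned above tend to show up.
    total = 0
    for i in range(10_000):
        total += i * i
    return total

# Run this same file once with the interpreter and once as a
# Nuitka-compiled binary, then compare the printed timings.
elapsed = timeit.timeit(workload, number=50)
print("50 runs: %.3fs, result=%d" % (elapsed, workload()))
```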


That sounds significantly slower than PyPy on average, which would hardly make it "the best python compiler". YMMV, of course.


It's all about compatibility to be honest. PyPy has done yeoman's work and it's still lacking in compatibility for important packages (e.g., psycopg2). Performance means little if we can't use major parts of the ecosystem.


A few years ago I experimented with running a Pyramid/SQLAlchemy/Postgres app with PyPy. I was able to get it to work with https://pypi.org/project/psycopg2cffi/

I didn't notice any particular speedup for this app - manipulating lots of JSON Python data structures is not really PyPy's sweet spot. Whereas for some processing of large genomic data files I saw a substantial speedup.


Is it preferable on PyPy to use a libpq wrapper over a native solution? As far as I can tell, even though CPython C API is in some way supported on PyPy it's really only advisable to use packages using it as a last resort for performance reasons. Aside from psycopg2cffi, which you should be able to use with PyPy, I'm pretty sure I saw other PEP 249 implementations for PostgreSQL the last time I checked (which admittedly was a few years ago).


> As far as I can tell, even though CPython C API is in some way supported on PyPy it's really only advisable to use packages using it as a last resort for performance reasons

Yes, but so much of the Python ecosystem uses the C API that the "last resort" and the "common case" are one and the same most of the time. As for `psycopg2cffi`, IIRC it's not very well supported: it was last updated in January of 2021, but before that the last update was from 2018, and this was what prevented us from using it in 2020. Moreover, there are other packages besides Postgres drivers; that's just the one that I recall running into problems with. To be clear, I want PyPy to be successful, and the project is nothing short of amazing.

> I'm pretty sure I saw other PEP 249 implementations for PostgreSQL the last time I checked (which admittedly was a few years ago).

There was a pure Python version, but it didn't seem battle-tested and there was no indication of its quality or performance. I don't want to pull a package like that into production for something as important as a database driver. It's been a couple years since I looked into it as well, so perhaps things have changed for the better in the interim.
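
For context, PEP 249 is just the shared connect/cursor/execute surface; the stdlib's sqlite3 module implements it too, so the interface all the Postgres drivers above have in common can be sketched without a running database (note that psycopg2 uses the %s parameter style rather than sqlite3's ?):

```python
import sqlite3

# sqlite3 follows PEP 249, the same DB-API contract that psycopg2,
# psycopg2cffi and the pure-Python Postgres drivers implement, so the
# calling pattern carries over almost unchanged between them.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
conn.commit()

cur.execute("SELECT name FROM users WHERE id = ?", (1,))
row = cur.fetchone()
print(row[0])  # ada
conn.close()
```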


HPy looks like the best compromise in this area.

It's not too different to programming the CPython API, and will give optimum code in Pypy and CPython.


Agreed, but we need the community to migrate to it, which requires leadership from the maintainers.


> so much of the Python ecosystem uses the C API

Yeah, that's a shame, because 1) that API is mildly awful, and 2) it really restricts implementation choices: making another implementation of the language is a problem just because of the need to convince numerous existing C extensions that your implementation choices are the same as CPython's (when really your implementation is likely to work very differently on the inside). I guess Python people really programmed themselves into a corner here.


PyPy is a JIT that doesn't produce standalone executables.


I was talking purely of the "compiler" part, not about any "standalone executable" part that may or may not have been implied.


Without further clarification, when I read "compiler" I translate it as "standalone program which takes source code and emits exactly one binary which executes the program".

It can be a much broader term, for example you will run into people here arguing (with some justification!) that a "transpiler" doesn't exist since it's a mere subset of the broad definition of a compiler.

I'm just reporting on the mental image that calling such-and-such a compiler forms in my mind's eye. I expect I'm not alone in that.


There's nothing "broad" about a compiler not generating "exactly one binary". First, most of the time, even in conventional systems, it is the linker who creates that binary, not the compiler. Second, in other systems, such as Oberon, there's not even "exactly one binary" for anything, unless you're talking about the currently running system as a whole.


A JIT has runtime information that makes it a bit easier to optimize a highly dynamic language like Python.


Yes, that's why JITs are a good idea and often preferred.



>complicated dependencies such as pytorch, selenium, tesseract

When I last tried Nuitka I couldn't get it to work with pycryptodome. Sounds like I should maybe give it another go.


I had a similar error. If I recall correctly, I had to copy the contents of lib/Crypto into the Nuitka dist folder, and then it worked.


What the heck use case is this?


Is Nuitka able to create self-contained binaries that can be distributed as a single file? Or do they still have external dependencies such as the Python runtime?

I’ve chosen Go rather than Python for a few small projects recently. Not for performance, parallelism or a particular fondness for the language, but just because Go can build truly standalone executables.


The last time I tried this was quite a while ago and your question was exactly why I tried. But, it failed with some unintuitive errors in dependencies of my Python script, and I gave up after 10 mins of Googling.

I’m still wary of that experience and will avoid Python where I have such deployment needs unless the language natively comes up with such a build solution.

FWIW, I think borg (a backup solution written in Python and C) uses PyInstaller to get a single binary executable. It may be of interest to you.


According to the readme:

> The created binaries can be made executable independent of the Python installation, with --standalone and --onefile options.


I use Pyinstaller to generate single file executables of my Python projects, which use Qt. It works great.


The only problem I have found with PyInstaller is that their "standalone" builds still depend on certain libraries that vary with the host OS, such as glibc, among others.

The proposed solution by PyInstaller is to build your program on the "oldest" system you support, which is silly compared to build systems like Go's, which produce truly standalone binaries.

Having said that, I'm curious about this solution, since it seems to claim a true standalone build...


The other issue with PyInstaller is that, by default, it includes all the dynamic libraries that the Python interpreter on your machine uses. That makes sense: it needs an interpreter, and it collects whatever that interpreter needs to run.

Unfortunately this might include libreadline.so, which is licensed under GPL, making your resulting executable unable to be under a proprietary license.

There are ways to solve this issue, but one has to search and read documentation (and code, in my case -- when I was researching it the docs were not clear).


> [PyInstaller] includes all dynamic libraries that the Python interpreter has on your machine.

Yes - and last time I used it, it created either a large folder or a compressed archive containing all of those libraries. Only the latter gives a truly standalone executable - but it's very slow to start up because it has to extract the archive to disk every time it runs.

It sounds like Nuitka has a solution for this problem, at least on Linux: "[the binary] will not even unpack itself, but instead loop back mount its contents as a filesystem".


Indeed it does. The only downside is that the resulting binary is (at least in my case) almost triple the size of PyInstaller's (14.6 MB vs 5 MB). But I can live with that.

Still doesn't statically link C libraries (or at least I didn't find the setting for it), or other libraries for that matter.

Pyinstaller binary build depends only on: libdl.so.2, libz.so.1 and libc.so.6.

Nuitka binary build depends on: libdl.so.2, libz.so.1 and libc.so.6 AND libpthread.so.0 (for the loopback mounts I suppose).

The one that always creates problems is libc.so.6, which usually is not present in a new enough version on 4-year-old systems...


Go binaries are often not truly standalone. You can certainly make static Go binaries that don't use cgo, but you can't then use standard libraries which rely on it.


I believe that if you use MUSL for libc, you can statically link libc into the binary. That should make the standard library all statically linkable, I'm not aware of any other C dependencies (although I could very well be wrong).


Hmm, with Pyinstaller I generate true standalones which require no libs at all, nor Python to be installed on the host system. Do you use the --onefile option?

As an example, here's a little project which I also release as a single binary file that works everywhere I've tested, with no deps:

https://github.com/vascocosta/glueather


Did it bundle all of Qt and its dependencies into a single executable file?


Yes. I do it for my weather app that uses PyQt. On the release files I have single binaries that work flawlessly:

https://github.com/vascocosta/glueather


I just downloaded it and it works great. Took me a minute to figure out how it wanted me to enter my location (Austin, TX USA is what worked). Would it be hard to add a button to get the location from the OS? I think all the major operating systems have a location service these days.


That's a very good idea. I'll try to add that feature on the next release. Thank you for using my little weather app. :)


Yeah but they only run on a single operating system, so all you've accomplished is going from O(n*2) files to O(n) which is good but it'll never be great like APE which needs 1 file.


Yeah, but in the first case you go from 1000 files to 2, whereas in the second you go from 2 to 1, which is much less impressive.

Metrics are fun!


In case anyone missed it, the Actually Portable Executable is well worth a read:

https://justine.lol/ape.html

https://news.ycombinator.com/item?id=26273960


Each binary runs on a single operating system, but you can easily cross-compile in Go. This is strictly better than the compiled Python case--either you're compiling Python to native code in which case you have the same problem and none of the easy-cross-compile benefits or else you ship a zip which inevitably contains lots of C dependencies that are also specific to one platform and you still can't easily cross compile.

> so all you've accomplished is going from O(n*2) files to O(n)

You go from N*M*2 where `M` is the number of target platforms to M (no need to use big-O notation as far as I can tell).


Are people actually using APE? It's a really clever hack but I can't imagine it's a good idea to actually distribute executables that rely on ancient COM / shell formats and a weird C library to work.

It's really not that hard to build standalone binaries for Mac, Linux and Windows using GitHub Actions. Especially if you use a reasonably modern language like Go, Dart or Rust.


Past related threads:

Nuitka 0.6.0 released - https://news.ycombinator.com/item?id=18092837 - Sept 2018 (14 comments)

Nuitka: A Python compiler - https://news.ycombinator.com/item?id=17683932 - Aug 2018 (6 comments)

Nuitka: A Python compiler - https://news.ycombinator.com/item?id=16980704 - May 2018 (4 comments)

Nuitka: a Python compiler - https://news.ycombinator.com/item?id=15354613 - Sept 2017 (60 comments)

Nuitka Progress in 2015 – Python Compiler - https://news.ycombinator.com/item?id=10994267 - Jan 2016 (52 comments)

Nuitka: a Python compiler - https://news.ycombinator.com/item?id=8771925 - Dec 2014 (135 comments)

Nuitka — A Python Compiler - https://news.ycombinator.com/item?id=1746340 - Oct 2010 (33 comments)


On their website:

> Future

> In the future Nuitka will be able to use type inferencing based on whole program analysis. It will apply that information in order to perform as many calculations as possible in C, using C native types, without accessing libpython.

They already claim a 300% speed boost, but with that type inferencing it will probably go a lot further... Can't wait for this project to mature.

I guess compiling to WebAssembly should be trivial then?


Speaking of speedup and program analysis, there's also mypyc[0] which relies on type annotations and also claims a performance boost:

> The mypy project has been using mypyc to compile mypy since 2019, giving it a 4x performance boost over regular Python.

Does anyone know how mypyc and Nuitka compare in practice?

[0]: https://github.com/mypyc/mypyc
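
The difference in approach: Nuitka compiles unannotated Python as-is, while mypyc leans on type annotations to unbox values. A hypothetical function in the style mypyc rewards; the same file still runs unchanged under plain CPython:

```python
# With mypyc, the annotations below let the compiler use C-level ints
# instead of boxed PyObjects; under plain CPython they are just hints
# and the function runs with normal interpreter semantics.
def fib(n: int) -> int:
    a: int = 0
    b: int = 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(10))  # 55
```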


Another cool project like this is https://github.com/Pypperoni/ but it takes a bit more of your own set up to get going.

Disclaimer: just a happy user.


This looks pretty cool. A better summary at the top would be nice for new users. I had to scroll down a bit and read a few lines until I realized it translates Python to C and then uses a C compiler to do the final compilation (it is also quite ambiguous whether the executable is native or interpreted until you read that part).

I'd be looking more into this, but does anyone know of any other dynamically typed languages which compile to native code, not JITed or bytecode? (I remember there were a few Lisps out there.)


Steel Bank Common Lisp compiles Lisp to executables.

Julia supports both a REPL and AOT compilation.

There are some Forth compilers for various systems. https://www.thefreecountry.com/compilers/forth.shtml


Humorously, https://cliki.net/Python - "Python is the name of a free high-performance compiler originally developed at Carnegie-Mellon University for CMU Common Lisp, it is now used by SBCL ..."


There are some for Perl, ranging from merely emitting bytecode to actually optimizing (perlcc, RPerl). I think others prefer emitting to Java or making their own JITs.


My one claim to fame is that a few years ago, I suggested the TL;DR wording on the Nuitka Overview page: https://nuitka.net/pages/overview.html

If I could find the message I sent to the Nuitka mailing list (or did I suggest it via email? Can't remember) I'd publish that here.

;)


^ Seriously? Downvotes?

What did I say which was wrong? Honestly thought it was a mildly interesting anecdote!

Sheesh! You people! :)


We have an internal tool that autoformats code (it calls clang for C++, some other tool for C#, and autopep8 for Python). But for Python I did not want to rely on an installed interpreter; I wanted a separate standalone .exe that can run by itself (almost: it still needs python38.dll). So Nuitka helped me there:

I wrote a small app, which is simply:

    autopep8tool.py:
    import autopep8; autopep8.main()
Then had this .bat file to do the .exe for me:

    autopep8tool.make_exe.cmd:
    @echo off
    pushd %~dp0
    call "%VS140COMNTOOLS%\..\..\VC\bin\amd64\vcvars64.bat"
    p4 edit ..\tools\autopep8tool.*
    p4 edit ..\tools\python38.dll
    for /d %%i in (*.dist *.build) do rd %%i /s /q
    echo import autopep8; autopep8.main() > autopep8tool.py
    call nuitka --standalone --assume-yes-for-downloads --unstripped --windows-dependency-tool=pefile autopep8tool.py
    copy autopep8tool.dist\autopep8tool.* ..\tools
    copy autopep8tool.dist\python38.dll ..\tools
    for /d %%i in (*.dist *.build) do rd %%i /s /q
    popd
works great so far, and even though the produced .exe is 10 MB plus a 4 MB python38.dll, it's a great logistical saver (IMHO), as I can roll the tool out to other teams and depots without requiring a proper Python install (which is even trickier on Windows). It's probably not completely sandboxed (hermetic), but works well so far.

calling autopep8tool.exe --help even gives the right help.

So great tool (though I've been told about others, like google's subpar, etc.)


Do you find it increases the maintenance burden? Since now a previously 1-character change requires a massive recompilation, and updates to Python (which happen frequently...) also require recompilation.


Recompilation is the norm for compiled languages so this doesn’t seem like it would be very onerous or have any kind of substantial effect on maintainability.


I've only had to compile it once; recompiling is only needed if we change Python drastically and/or want autopep8 changes (I would think this does not happen often, and if it does, it also means we'll have to reformat our code).

Granted, my case is really not a general one, but Nuitka fit well there (I really wanted to avoid another 'freeze' tool that unpacks Python to a temp folder and calls it from there just to run autopep8).

That being said, I was not able to get Nuitka working with some of the other popular formatting libraries (I'll need to look again; it takes a bit of setup).


You could just set up a file system listener and then trigger recompiles at some arbitrarily decided-upon interval. Doesn't seem like a huge pain otherwise.


If you are looking for something lighter: CPython can execute ZIPs with a Python program inside.

You can also prepend the python shebang before the ZIP and mark it as executable in the filesystem. youtube-dl gets distributed like that.
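
The stdlib's zipapp module automates exactly this shebang-plus-ZIP packaging. A self-contained sketch (directory and file names are illustrative):

```python
import os
import subprocess
import sys
import tempfile
import zipapp

# Build a runnable archive the way youtube-dl is shipped: a directory
# with a __main__.py, zipped by zipapp with a shebang prepended.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "app")
    os.mkdir(src)
    with open(os.path.join(src, "__main__.py"), "w") as f:
        f.write("print('hello from a zipapp')\n")

    pyz = os.path.join(tmp, "app.pyz")
    zipapp.create_archive(src, pyz, interpreter="/usr/bin/env python3")

    # Unlike a PyInstaller bundle, this still needs Python on the host.
    out = subprocess.run([sys.executable, pyz],
                         capture_output=True, text=True)

print(out.stdout.strip())  # hello from a zipapp
```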


The problem with this method is that the user still needs to have Python on their system. Pyinstaller (Py2exe, cx_Freeze) bundle the python interpreter along with your app.


I think it's typical that only one Python runtime is installed on a system. What is the use case for bundling the runtime with the app?


I'm one of the rare idiots who wrote a consumer-facing Python+Qt program. I need to ship the right dependencies along with the right version of Python to my customers.

Using Pyinstaller it works out in the end, but after everything, it would have been better to use a language meant for user-facing applications.


I package Captain's Log [0] with PyInstaller.

I need to, because otherwise I wouldn't be able to package it for Windows users.

The app is also closed source.

I cannot easily package the application for Linux users, because of so many different flavours of Linux, so many different combinations of Python and Qt versions.

Which is why I've given up on a pure Linux version and I'm in the middle of adapting it to work under WINE alongside Elite: Dangerous also running under WINE, instead.

[0] https://captainslog.scarygliders.net/captains-log-2/


Many systems still have both Python 2 and Python 3 installed. Also, in the standard case, the version of Python is controlled by the user, not by your app. Bundling it with the app lets you make sure that it's running Python 3.8, for example, so that you can use Python 3.8 features without having to add a bunch of compatibility shims in case the user is running an older version.
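
Without a bundled interpreter, the usual defence is an explicit version gate at startup; bundling makes this check unnecessary. A common sketch:

```python
import sys

MIN_VERSION = (3, 8)

# Fail fast with a clear message on an old interpreter, instead of dying
# later with a SyntaxError on a 3.8-only feature like the walrus operator.
if sys.version_info < MIN_VERSION:
    sys.exit("This application requires Python %d.%d or newer." % MIN_VERSION)

print("interpreter is new enough: %d.%d" % sys.version_info[:2])
```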


Distribute my cli app + gooey as a stand-alone binary for windows users


Perhaps on a server, but on my workstation I probably have 6 or 7 pythons installed.

edit: including virtualenvs and conda environments, I have 41 python.exe files on my workstation right now.


No there’s often tons of them. I’ve personally seen hundreds.


Roughly 0% of Windows users have Python installed.


Even better: use OS packages. They allow listing what is installed on a host, remove it cleanly, do atomic updates, ship manuals, configuration files, systemd unit files, implement sandboxing and so on.


I've used Nuitka a couple times for some commercial Raspberry Pi projects and it worked perfectly each time.


It is instructive to read GvR's take on Nuitka:

https://news.ycombinator.com/item?id=8772124

GvR and the old boys currently present themselves as CoC proponents, and strategically use the CoC against anyone who criticizes them.

Compare with his current Microsoft sponsored activities. If anyone would use that language against "his" (i.e. Mark Shannon's) project, they'd be out and suffer public defamation in no time.


I don't love the guy or anything. But you're pulling pieces together from different contexts over almost 10 years to tell your story.

The quote was about a presentation in 2013, not the project as it is now. Guido's complaint is mostly that they were equating "compiled" with "performance" and were not rigorous in backing that up. The presenters may have been young. But I'd have made the same complaint. The blog thing was just snarky.

That was a period of time when he has admitted he wasn't doing so well as a human being and this is pretty low on the snark-meter compared to other BDFLs.

The project page as it reads today says very little about performance 'wins'. In fact, it points out losses, in the same vein as the Red Hat issue with Python linked against libpython posted here the other day. Their point doesn't seem to be "make your code go faster" but to offer an easier mechanism of distribution for some cases.

I'm not getting the "his" shot. I haven't seen him take credit for the work, just that headlines do because journalists are lazy. I may have missed something.

But the MS team's activities today are things he would have dismissed years ago. He's learned and changed his mind. If he was completely consistent with himself from 2013, I'd think he wasn't that bright. (I'll never understand those kinds of arguments about politicians)

You're drawing a line from criticism of a 2013 paper, through his current career, to a malicious intent or conspiracy. That's a lot more defamatory than a valid (but a little rude) critique. You're attacking a person's character, not their work output.


contexts, events, whatever. It doesn't matter. If Kay had listened to any of them, the Python ecosystem would be lacking its best packaging tool, and a significant optimizing tool as well.

That is pretty unacceptable.

> In fact, it points out losses,

Because they have cleared away enough chaff to start encountering these problems with the design of the cpython code itself.

There isn't much point in saying X is faster if significant harassers in the python community will make a point of walking out of your talks and later belittling your work online, right?


> He's learned and changed his mind.

I don't think so. Call me cynical, but I suspect that Guido went where the money was (one does not simply go out of retirement). Otherwise there would be an apology to the projects he's been consistently berating for years.

Good thing Kay Hayen has the perseverance and quite mad planning skills. Today we have a compiler that works better with each release and is capable of delivering results today, while van Rossum's work at Microsoft as of now is largely vaporware I don't have very high hopes for.


What does CoC stand for?


Still can't believe that this abbreviation is actually in use without any objections while there are talks about needing to rename the Coq language.


I imagine that most people saying "CoC" out loud are not saying "cock", instead either "Code of conduct", or "cee oh cee".


I also imagine the same people pronounce the other word "cee oh cee kay".

I myself pronounce it so it rhymes with and sounds identically to "cock", for several reasons.


CoC = "Code of Conduct"

I haven't followed Python dev, so can't comment about how the CoC is being used in practice.


> I haven't followed Python dev, so can't comment about how the CoC is being used in practice.

In tough situations, CoC powers are inversely proportional to the height of your position on the totem pole.


hello.py 1-liner on Ubuntu Linux:

$ python -m pip install nuitka

$ python3.8 -m nuitka hello.py

Installed Nuitka with no problems and compiled "hello world", also with no problems.

It created a 4,060,104 byte hello.bin with no symbol table, so no need to run strip to make it smaller, which again worked like a breeze.

(The developers say on GitHub they will work on smarter ways to keep the dependency list small, which can at present escalate for modules like pandas that themselves have >1000 dependencies.)


I remember this being promising but struggling with a dependency that was using entry points to dynamically load plugins, and I couldn't get Nuitka to register that. Seeing this is a good reminder to have another shot at it.


How does this compare with PyInstaller?


PyInstaller doesn't compile your code. Instead, it bundles a copy of the Python interpreter along with your code and sets everything up with an executable header, so you can double-click the file to run your code within the embedded interpreter.


I suppose if my target is version x of OS y then I'll have to compile on that machine, yeah? Sorry, I don't have much experience creating these sorts of packages for distribution, but have recently found the need.


It depends on the language. Could require it or could not. Not sure about here.


Used it on home projects a couple of years ago to compile code on macOS and run it on Windows and on Debian (on a Pi). Worked well. Looking forward to retrying Nuitka once my current project gets to the point where I've finished the firmware and get back to working on the middleware.

Definitely worth a look for anybody who wants to distribute a Python-developed system without installing Python or a project's inevitable many dependencies. It just makes distribution a whole lot more predictable.

And to the reader who asked about GraalPython its on my todo list - have it installed but life, children, and work keep getting in the way.


This looks rather fun, has anyone here used it in anger?


I wrote about this over a year ago over here https://ao.ms/how-to-package-a-python-app-using-nuitka/

I've also used it in a production estate, and it works wonderfully well.

Would recommend!


smashing, thanks for this.


I've used it for some simple, personal usage CLI tools and simple REST services. Works out-of-the-box and really well.

From my experience some libraries are hard to compile (e.g. BeautifulSoup), but not impossible; they need some tweaking.

Sometimes I struggle with embedding data for some reason (especially since I'm spoiled by Go 1.16's `//go:embed`).


How does Nuitka deal with external resources (files) that the code pulls in via `open()`? Can I somehow include those in the executable if I want to?
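
Not an answer from the author, but the portable pattern is to resolve data files relative to the module rather than the working directory; Nuitka's standalone mode can then be told, via its data-file options, to ship those files next to the binary. A sketch (`load_resource`, its `base` parameter, and the file name are hypothetical):

```python
import os
import tempfile

def load_resource(name, base=None):
    # Resolve relative to this module (not the CWD) by default, so the
    # lookup still works when the app is packaged and started elsewhere.
    here = base if base is not None else os.path.dirname(os.path.abspath(__file__))
    with open(os.path.join(here, name), encoding="utf-8") as f:
        return f.read()

# Demo with a temporary directory standing in for the install location.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "greeting.txt"), "w", encoding="utf-8") as f:
        f.write("hello")
    print(load_resource("greeting.txt", base=tmp))  # hello
```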


I used Nuitka many years ago, good to see it on the front page.


Anyone have a sense of how this compares with cython performance wise?


Amazing idea, is there anything like this for Javascript?


What's the advantage of this over docker?

Presumably this just bundles a python runtime and a copy of the dependencies and hides it all inside some sort of huge file?


I have a project that pyinstaller can make into a 12 MB single file. Docker turns that same project into a ~400 MB container. This is not considering the Docker install footprint.


Does this mean python is now self-hosted?


Python is self-hosted under PyPy.


anyone tried GraalVM Python for this?


I looked into this question for Graal Node.js and I think it has the same problem in Python...

Basically, Graal Python and Node.js each provide a custom interpreter for the target language, the main goal of which is interoperability with the Graal polyglot ecosystem. So you can run your Python code under the GraalPython interpreter, it will run fast (JIT-compiled), and it can import libs from other Graal-supported languages.

But as far as outputting executable binaries, Graal only provides that for JVM projects and LLVM languages like C/C++/Rust.

So it's not impossible, but you have to build your own Java wrapper project that loads the Graal Python interpreter class in code and then runs your python lib inside that.

I expect eventually that boilerplate step can be automated as part of the Graal build tools.

If I've got this wrong I welcome any corrections!


I was half-remembering what I found when researching Graal Node.js.

A clearer, less garbled version is in my answer here: https://stackoverflow.com/a/67331258/202168


is nuitka compilable with itself?


has it reached fixpoint?


I love Python so much. It's my favourite language. It's so easy.

I don't even use the dynamic nature of Python - monkey patching etc.

I wish more people wrote algorithms in Python for understandability and then compiled them into a compiled language for performance. I find that Python resembles pseudocode because it is syntactically sparse, unlike C++. In Python I don't need to worry about ownership or memory. It would probably need pointers to be a low-level systems language.

I just wish Python had parallelism.


I used to take your view, but actually there are plenty of languages (e.g. ML-family languages) where you get the best of both worlds - light, clean syntax, but also type safety and decent performance.


I experimented with OCaml and some of my Python code. I found that using Python with Numba enabled me to get much better performance than the same algorithm written in OCaml.

In this case, using Numba for the performance-critical aspect was trivially simple, just the addition of a decorator. And I've found that is generally pretty close to true, at least for the kinds of things where I care about performance.

That's why I'm still in the Python world rather than the OCaml world for my algorithmic work. I really like OCaml as a language and would actually prefer to use it, but Python/Numba seems to give me substantially better performance, as well as all of Python's standard libraries.

And Numba lets you turn off the GIL, so you can get multiple cores going at once.

YMMV, of course. I'm not making a general claim that would be true for everyone. But I think it may be worth mentioning that it could be true for more uses than one might assume.


I'm sympathetic to this point of view. But then there is 25 yrs of stdlib that's tattooed onto people's brains and stackoverflow.

A pure-Python implementation of the Python stdlib, ready to be transpiled, could be that bridge.

It's also possible to support multiple source syntaxes in such an ecosystem: ML-based, Python-based, or even a hybrid. In the end, all I need is a well-documented AST.


One of the main reasons people choose Python is the huge stdlib and vast package ecosystem. This is difficult to ignore if you need to do some very heavy lifting.


Can you share some examples? I'm very interested in a typesafe alternative to python, but low friction is also necessary.


Personally I moved to Scala, which has a poor reputation because of some symbol-heavy libraries, but you can use it to write very clean Python-like code. Version 3 has actually added a more Python-inspired syntax with colons and indentation instead of braces.

Standard ML was where I started. As a language it's great, as an ecosystem it's limited. OCaml seems to be where the action is (and even then its packaging/dependency management isn't great - but then it can't be worse than Python's).

F# has a very good reputation but I haven't used it a lot myself.


They said ML family, so that's usually OCaml and SML.


I use JetBrains PyCharm with type annotations in my code. I run the type-checker/linter ("Inspect Code") every 10 minutes or so, it's muscle-memory at this point. It covers most cases, problem resolved.


F# comes to mind. It uses indentation instead of braces and has a nice ecosystem.


Why do you think python3 is not typesafe?


Because it's not? It has a type system but it's still duck-typed at the end of the day, and no amount of Mypy will change that.


It should still be possible to use the python syntax and a reimplemented stdlib to build a statically typed, typesafe variant.

I think you object to the name Python 3 because the popular interpreter is duck-typed.


You fundamentally cannot build a statically typed variant of Python without cutting portions of the stdlib, most notably `mock.patch`. You could do a good majority of it, probably enough to be completely usable, but at what point does it stop being Python? Go use Nim if you want statically typed Python-ish syntax.


py2many transpiles Python 3 to Nim. Think of Python as a way to interpret Nim code, if you like, with a large installed base of existing libraries.

There is demand for iterative creation of software: first get the logic out with the least friction and then think about types, resource leaks, good software engineering practices etc.



https://github.com/adsharma/py2many

  pip3 install py2many
  py2many --nim=1 foo.py


Type safety and static typing are orthogonal. Both C and C++ are statically typed, yet are not type safe, for instance.


nim-lang.org might interest you!


Note that type safety typically means better tooling than what Python people are used to. If you've only used Python, you have no idea what you're missing out on with respect to refactoring, editor integrations, documentation generation, etc. And then there is the actual documentation for humans, which is guaranteed to be correct and up to date so long as your build is green.


Agree, just that ML-family languages take quite long for an average mortal like me to grok.


(Standard response)

Python does support parallelism, so long as you're calling into a C library to do the work, because they (usually) release the GIL while doing the work. That sounds like a cop out but using C libraries is very common to do CPU intensive work anyway. Examples include numpy (e.g. you can call np.dot from multiple threads in Python and they will genuinely use multiple cores) and all the rest of the scipy stack, and even modules in the standard library (e.g. when using zipfile, even to process an in-memory buffer, the GIL is released).
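A minimal sketch of that zipfile case: compressing independent in-memory buffers from several threads. The heavy lifting happens in zlib, which releases the GIL during compression, so these threads can genuinely overlap on multiple cores:

```python
import io
import threading
import zipfile

def compress(payload, results, idx):
    # zlib (used by ZIP_DEFLATED) releases the GIL while compressing,
    # so several of these calls can run on different cores at once.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("payload.bin", payload)
    results[idx] = buf.getvalue()

payloads = [bytes([i]) * 1_000_000 for i in range(4)]
results = [None] * len(payloads)
threads = [threading.Thread(target=compress, args=(p, results, i))
           for i, p in enumerate(payloads)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The pure-Python glue around each call still serialises on the GIL, but it's a tiny fraction of the work here.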


if you call np.dot on the same underlying ndarray from multiple threads, what are the semantics?

I don't know what they are but I think most people want a single invocation of np.dot in a single thread to use multiple worker threads (as described in https://scipy.github.io/old-wiki/pages/ParallelProgramming) by pushing the work to a thread-aware matrix library.


> if you call np.dot on the same underlying ndarray from multiple threads, what are the semantics?

Usually, this will be safe because np.dot only reads from its arguments, and produces a fresh new array to hold the result.

I just checked the documentation and found there is an `out=` argument to pass a destination array. (In any case, there are other numpy functions that do modify their arguments.) If you pass the same array to that parameter in two different threads, then I don't know what happens. Maybe numpy holds the GIL while writing its output. Maybe it produces C-style undefined behaviour (this would be my guess ... and my preference).

> I think most people want a single invocation of np.dot in a single thread to use multiple worker threads

Maybe? I'm not convinced that's what I want! But it's pretty irrelevant either way to my point; np.dot was just an example. My real point is that lots of CPU-heavy Python functions are implemented in C and release the GIL, so in practice parallelism with Python is often possible.


I've worked in this field awhile.

99% of people want this: np.dot and friends use all the cores on your system by default. This accelerates the common use case: an individual user on a multicore machine who wants a single result ASAP. Then there's an env var that lets you configure the underlying matrix library to use only one core (or some other number), and you use the multithreaded/multiprocess libraries to run many independent calculations (possibly on shared read-only source data); that scheduler then has a tunable set to saturate the cores or IO on the system.

This is what most BLASes do, and it's how data analysts and supercomputer folks can get along.
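Concretely, those knobs are environment variables the BLAS reads at startup. A typical setup for the "many independent workers" mode; which variable actually applies depends on the BLAS your numpy is linked against:

```shell
# Pin the BLAS to one thread per process, then let your own
# scheduler (multiprocessing, a job queue, etc.) own the cores.
export OPENBLAS_NUM_THREADS=1   # OpenBLAS builds
export MKL_NUM_THREADS=1        # Intel MKL builds
export OMP_NUM_THREADS=1        # generic OpenMP fallback
# ...then launch your own multiprocess workload.
```

This is a configuration sketch, not a complete script; the launch command is up to you.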


To reiterate: None of this contradicts my main point that numpy.dot releases the GIL, which it does even if it calls into a BLAS library that uses multiple cores (although of course that would then have its own global lock - potentially even an interprocess lock - to manage concurrent uses). numpy.dot was only meant to be an example in the first place! I could just as easily have talked about some optimisation routine in scipy or something in the standard library (actually I did mention ZipFile.extract).

> I've worked in this field awhile.

This statement undermines your point more than it supports it. There is no "this field" for numpy. Just working in some field that uses it is no qualification for knowing whether users as a whole would like it multithreaded automatically. "Data analysts" and "supercomputer folks" no doubt include many users but certainly not all and possibly not even most (for all we know).

> 99% of people want this: ...

I can believe this might be true. I don't think I ever claimed otherwise. In fact I said "maybe"! I only claimed that I don't want this.

I will make one more claim though: I don't think there's any way that you could know this, even if it's really true.

But yes, I agree and already knew that it's how BLAS libraries work by default, including OpenBLAS which is what the prebuilt numpy wheels use. Even then, as you're probably already aware by the sounds of it, that's only for sufficiently large matrices. If you're just using small matrices and vectors (e.g. in 3D to represent real-world coordinates) then processing will be done directly in the calling thread (because the overhead of using multiple threads would be greater than the benefit).


Hi, would the downvoter mind adding a comment explaining why? What I said is technically correct so it's unclear what the objection is.


No need to downvote a request for explanation; this is within HN rules. Otherwise, you're giving an ambiguous signal so I don't know whether you're unhappy with the current state of Python, or disagree with my technical statement (in which case, please provide a counterargument instead of a vote).


FYI complaining about votes is against HN comment policy.


I'm not complaining about votes. I'm asking for people to clarifying why they are downvoting a technically correct statement.


Requesting explanation of downvotes is among the things the policy against discussion of votes is directed at, and, IMO, it also demonstrates a fundamental failure to understand (or respect) the purpose of downvotes, which is about managing signal-to-noise ratio by, among other things, not polluting threads with meta-level discussion (and invariably debates) of why a particular comment is (or is not) appropriate.


I have 11,659 karma. I'm not bragging, but that means I have a pretty good understanding of and respect for the purpose of downvotes (because I learn from them how to earn more upvotes). Nothing is being polluted here, just a reasonable request for clarification (for example, I would love to learn that I was factually wrong!). I'm guessing the downvote was actually somebody agreeing with me and showing displeasure with the problem, or believing I was technically incorrect, or thinking it gauche to point out the problem in the first place.


"Python does support parallelism, so long as you're calling into a C library to do the work"

Actually, you can use Numba to get the same effect while still writing Python. Numba allows you to simply apply a decorator to critical Python code and it is JITted into C-like performance. I do this in my code, and it has worked great for my uses.


Not really. You can't build large data structures (not arrays, but trees, graphs, etc) with structural sharing between threads. Unless of course you translate all of your data to C too, at which point you are really doing everything in C (data and code).


OK, let's put it like this: In some contexts, but not all, Python supports a useful amount of parallelism.

I have definitely worked on Python projects in the past that were easily parallelisable and worked absolutely fine. I simply created threads with threading.Thread, passed messages around with queue.Queue and used some standard Python modules to process those messages. The messages were independent enough from each other that I didn't have any issues with marshalling access to them, while being large enough that the overhead of multithreading was still outweighed by its benefits. Rewriting the whole thing in C++ (etc.) would have made negligible difference to performance.

I can certainly imagine a program similar to what you describe, where there are lots of messages that operate on a single monolithic data structure, which couldn't easily be made to work with true parallelisation in Python.


> You can't build large data structures (not arrays, but trees, graphs, etc) with structural sharing between threads.

You probably do not want to do that in the first place. The reason that the GIL still exists is that the overhead of fine-grained locking for Python datastructures would negate the speedup from multithreading.


Which is why Python doesn't really support parallelism.

Other languages (like Go) do support parallelism, but they have a more advanced runtime and don't use a GIL.


It's not specific to Python. If you want to do fine-grained "parallelism" on a set of complex shared datastructures, your cores might look busy, but really you're just wasting most of the time locking/unlocking stuff. That's not necessarily going to result in a speedup over a lock-free single-threaded implementation (that you didn't implement to compare against).

In the case of Python, the single-threaded implementation was already there, the implementation that removed the GIL in favor of fine-grained locking failed to deliver the goods.

If you want to actually take advantage of parallelism, you want data structures that are amenable to it, which is basically sections of arrays without a lot of potential read/write conflicts over multiple threads. If you're at that point, you probably want to use C/C++/Rust anyway, you don't want speed up something that's already 100x slower (the interpreter) by parallelizing it. Python offers that with C-extensions like NumPy.


> your cores might look busy, but really you're just wasting most of the time locking/unlocking stuff.

Not if you're reading most of the time and are using a modern concurrent GC.


It doesn't matter if you're reading or writing, if there's a potential conflict you need to lock, or you need some complex and probably buggy "lock-free" data structure. The complexity and memory overhead of a concurrent GC just goes on top of that.


Locking an uncontended lock is cheap (at least compared to a contended one) so actual conflict does matter more than potential conflict.


Please try py2many, which targets this use case. Go, rust, C++ are among the 7 supported backends.

http://github.com/adsharma/py2many


That is Julia for you: https://julialang.org/blog/2012/02/why-we-created-julia/

\s? :D

I understand that you (and I, for that matter) can't stop using Python altogether right now. I just wanted to point out that these two desirable qualities were built into Julia from the get-go.


Julia just isn't a replacement for Python yet. Recently did a deep dive researching various capabilities that would make Julia at least useful as a microservice/RPC target for a more I/O heavy language like Python. gRPC support is really poor. There's some OpenAPI/Swagger tooling, but it's not great. Likewise for ZeroRPC. The best candidate is probably the WebApi library [0] but even then, it does not inspire a ton of confidence for production. From cruising github issues and the Julia forums, nothing in this space (rpc/io/microservice) feels really bulletproof. Combined with no solid native executable support, it's just a nonstarter for me.

My gestalt sensation is the community is still too small, the tooling too unpolished, to be ready for anything production-worthy. If there were a bulletproof 0rpc library, I think it would go a huge way towards growing the community and mindshare through successive approximation. But the other problem is many of the folks on the forum seem totally disinterested in this problem.

[0] https://github.com/JuliaWeb/JuliaWebAPI.jl


You can do low level coding via ctypes, and on Micro/CircuitPython you can even mess with Assembly, just like on some old BASIC dialects.
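For example, the stdlib's ctypes can call straight into libc on a POSIX system. A sketch; it assumes a findable C runtime:

```python
import ctypes
import ctypes.util

# Load the C runtime and call strlen() directly, bypassing
# Python's own string machinery entirely.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))  # -> 5
```

Declaring `argtypes`/`restype` up front is what keeps the foreign call from silently mis-converting arguments.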


> I just wish Python had parallelism.

Would support for parallelism have benefits other than increasing speed? If not, I'd rather see an increase in Python's single-threaded performance - which is what Guido and his colleagues at Microsoft are currently working on.


The problem with interpreting the Python way, using a switch loop in ceval.c, is that the loop carries plenty of overhead: branch prediction fails, and every opcode may be compared before reaching the matching case in the switch.

I wonder if Python will ever take the JVM approach of a TemplateInterpreter that maps each bytecode directly to assembly. Of course, you would have to do that for each platform.

http://openjdk.java.net/groups/hotspot/docs/RuntimeOverview....
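You can see exactly which opcodes that dispatch loop has to chew through with the stdlib's dis module:

```python
import dis

def add(a, b):
    return a + b

# Each line of output is one opcode that CPython's ceval.c
# dispatch loop must decode and branch on at runtime.
dis.dis(add)
```

Even this two-line function costs several dispatches per call; that per-opcode overhead is what a template interpreter or JIT amortises away.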


By parallelism do you mean something different from multiprocessing? [1]

[1]: https://docs.python.org/3/library/multiprocessing.html


I want Thread to run in parallel.

I've tried multiprocessing but I found it buggy for my use case.

I tried to create a worker queue and submit items to worker threads with JoinableQueue but eventually the system deadlocks and no progress is made. I'm not sure what the problem is. It never finishes.

It's a sentence correlator using multiprocessing - it generates correlations of words in your sentences.

https://github.com/samsquire/notebook/blob/master/sentence-c... https://github.com/samsquire/notebook/blob/master/workers.py

Maybe somebody can catch what I'm doing wrong.

edit: it may be because I forgot to call task_done on the JoinableQueue. edit2: yes, it was that! It works!
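For anyone hitting the same hang: join() on a JoinableQueue blocks until every put() has been matched by a task_done() call, so one missing task_done deadlocks the joiner forever. The same contract applies to the thread-based queue.Queue, sketched here:

```python
import queue
import threading

q = queue.Queue()
processed = []

def worker():
    while True:
        item = q.get()
        if item is None:       # sentinel: shut the worker down
            q.task_done()
            break
        processed.append(item * 2)
        q.task_done()          # without this, q.join() never returns

for n in range(5):
    q.put(n)
q.put(None)                    # one sentinel per worker

t = threading.Thread(target=worker)
t.start()
q.join()   # returns only once every put() has a matching task_done()
t.join()
```

The symmetry to check when debugging: count the put() calls and the task_done() calls; they must balance exactly.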


Out of curiosity, is using multiprocessing.Pool’s map/etc.() methods not an option? It’s a higher-level API which would remove the need to manually manage workers/a queue.
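A sketch of that higher-level route; Pool handles worker startup, dispatch, and teardown, so there's no queue to manage by hand:

```python
from multiprocessing import Pool

def square(n):
    # Runs in a worker process; arguments and results travel by pickling,
    # which is why this must be a module-level function.
    return n * n

def parallel_squares(values, workers=4):
    with Pool(processes=workers) as pool:
        return pool.map(square, values)

if __name__ == "__main__":
    print(parallel_squares(range(8)))  # -> [0, 1, 4, 9, 16, 25, 36, 49]
```

The `__main__` guard matters on platforms that spawn rather than fork, since workers re-import the module.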


Pretty sure they're referring to the GIL.[1]

1. https://realpython.com/python-gil/


Yes: many of us really do want a CPython that runs multiple threads of the same interpreter, with memory sharing. That said, after decades of CPython not supporting it, I've rewritten most of my code to use multiprocessing with granular parallelism.


I wish Nuitka were a drop-in solution, like PyPy, but it seems like I've never had a project that is 100% Nuitka-compatible.


Any scientific computing application that uses numpy (that is, the vast majority of them) will not work with Nuitka.


Hey, I did just that last week (actually I import Pandas, which requires numpy).

There's a "plugin" option for numpy, in Nuitka, that makes it work. There are similar plugins for a handful of packages.


For reference, since the page with the plugins' documentation doesn't rank highly in search results, I'll include it here:

https://github.com/Nuitka/Nuitka/blob/develop/Standard-Plugi...

The list of plugins is:

  dill-compat      Required by the dill module
  eventlet         Required by the eventlet package
  gevent           Required by the gevent package
  multiprocessing  Required by Python's multiprocessing module
  numpy            Required for numpy, scipy, pandas, matplotlib, etc.
  pmw-freezer      Required by the Pmw package
  pylint-warnings  Support PyLint / PyDev linting source markers
  qt-plugins       Required by the PyQt and PySide packages
  tensorflow       Required by the tensorflow package
  tk-inter         Required by Python's Tk modules
  torch            Required by the torch / torchvision packages



