Threadripper 3960X Compiles Linux Kernel in Under 30 Seconds (phoronix.com)
190 points by zzeder on Nov 25, 2019 | hide | past | favorite | 54 comments


I was compiling the kernel for my boxes for most of the '90s and '00s, and the funny thing is that a sub-60-second compile time was always possible with the "latest gear" — like a 100 MHz Pentium. Of course the number of modules you want/need to compile has steadily increased, but it's still funny to see someone impressed with a 30-second compile time 25 years later, with a 100x+ increase in CPU power :)


The kernel has gotten bloated. Back in the day, it could easily run in 8 megabytes of RAM, with plenty left over for the userspace and cache. Nowadays, just the kernel code (before loading any modules) is already bigger than that.


I was one of the first people working on embedded Linux, and I once put the kernel and a user-space program on a device with 512 KB of RAM. It was too tight to do anything really useful with, but it just barely fit. It was no problem doing useful stuff on a 2 MB device (this was in 1999) :)

Though in Linux's defence, it has not bloated nearly as much over this period as, say, Windows — by orders of magnitude. And you aren't required to compile everything in. It's just very tedious (and frankly a bit pointless) to trim it down.


Everybody knows the REAL test is a FreeBSD 'make world' anyway :)


I used to think 'make world' on BSD was some big deal until I started to synthesize HDL code to bitfiles or GDS.


What is GDS in this context?


https://en.wikipedia.org/wiki/GDSII

File format commonly used for wafer masks.


Compiling Firefox or Wireshark or Libreoffice all work the machine harder than a make world.


Wow... I still remember recompiling the NetBSD kernel on my Amiga 3000 to tweak the timings on my Retina Z3. Took 14 hours to compile the kernel...

So if processors are now >1500x faster than they were in 1991, why is it that my Amiga 3000 has a faster, more responsive UI?

Why is it that an ssh session takes 8 seconds to establish?

Why is it that browsing a network share is 10x slower than viewing a file list was on a BBS?


Edit your sshd config, add `UseDNS no` to it, and don't forget to restart the daemon.
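For reference, a minimal sketch of that change (standard OpenSSH config path; the restart command depends on your init system):

```
# /etc/ssh/sshd_config
# Skip the reverse-DNS lookup on incoming connections — a common cause
# of multi-second login delays when DNS is slow or misconfigured.
UseDNS no
```

Then restart the daemon, e.g. `sudo systemctl restart sshd` on systemd systems.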

Either way, I get your point. I remember programming in Delphi, compiling, chatting on IRC, and browsing the net over a 56K Winmodem (all the modulation was software-driven) on a Pentium II, all at the same time — and now I can't even have an IDE and Slack open at the same time.


I'm old enough to remember 14.4k, 28.8k and 56k modems — and while the internet held great wonder, waiting for pages to load and files to download was really frustrating.

Oh, and having to hang up so my parents could make a phone call wasn't fun either!

This at least is something that has got orders of magnitude better, and it's very noticeable. At times I still shake my head in wonder at being able to download a GB in under two minutes on a residential FttC connection.


IDE? Slack? "There's your problem right there."


Something's wrong with your DNS setup, I guess. SSH session takes less than one second to establish for me.


Because not everything is CPU bound.


But all the other bounds have improved massively, too:

- CPUs are multiple orders of magnitude faster

- GPUs are multiple orders of magnitude faster

- RAM is multiple orders of magnitude faster and more abundant

- Disks are multiple orders of magnitude faster and bigger

- Network connections are multiple orders of magnitude faster

Why is it that the end result feels the same?

The answer, of course, is that software complexity has kept pace: software, like a gas, expands until it fills its container.


No, the answer is your ssh configuration. You're hanging on timeouts, either doing DNS resolution or trying multiple auth mechanisms.

Use `ssh -vvv` to find where it hangs.
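Concretely, something like this (hostname is a placeholder), watching for a long pause between consecutive debug lines:

```shell
# Maximum client-side verbosity; a multi-second gap in the output —
# typically during hostname resolution or while an auth mechanism
# (e.g. GSSAPI) times out — pinpoints the stall.
ssh -vvv user@example.com
```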


I wasn't the author of the original comment. SSH ain't a problem for me. I was talking more about system responsiveness in general.


You are getting good replies, but the main answer is... nobody knows (and you aren't alone). There are so many little reasons for slower modern computers, but it seems to come down to increased complexity and, I guess, entropy? It's frustrating.


Regarding your SSH session, have you turned off GSSAPI authentication (assuming you aren't using it)?

See https://coderwall.com/p/fukoew/speed-up-ssh-logon-by-disabli... for some details.
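If that turns out to be the culprit, the client-side fix is a one-line config change (a sketch; you can also scope it to specific hosts instead of `Host *`):

```
# ~/.ssh/config
# Skip the GSSAPI/Kerberos exchange, which can add a long timeout
# on networks where no KDC is reachable.
Host *
    GSSAPIAuthentication no
```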


Sounds like you have a slow network and are using protocols optimized for high throughput on LAN connections.


How many -b bits did you specify for ssh-keygen?

And today’s software is more than 1500x the size.


It is cool to see this level of parallelism coming to commodity hardware. 5-ish years ago a friend of mine had access to an experimental box one of the server vendors had made. Never went to production AFAIK. My memory is that it had 128 cores, 256 threads. One of the things we did on it, of course, was compile the kernel. I don't remember the exact numbers, but recall it being in the single digits number of seconds. This was done with just a ton of Xeon sockets.


We need to move on from the Linux Kernel and start using “compile Unreal Engine” as a CPU benchmark.

https://gpuopen.com/threadripper-for-gamedev-ue4/


That's probably a good point: the Linux kernel is written in C, and C++ is a far more complicated language to parse and compile.

Unreal Engine, Firefox, Chrome... large C++ programs easily take an hour to compile on normal hardware.

With that being said: compiling for an hour is a huge hassle for reviewers. I think reviewers prefer benchmarks that complete in under a minute, rather than over an hour.


That's why I also use the LLVM compilation test (C++) and other workloads besides just the Linux kernel... But it seems the Linux kernel results are always what generates the most interest.


I would bet it's just familiarity. Most people that are interested in PCs have heard of Linux, and know that it has a Kernel. Or they even have already compiled kernels themselves. Whereas LLVM is more of a thing that is known to a subset of software developers.

I'm personally happy to see the LLVM tests, since it gives a good impression on what benefits software developer which work on big native code-bases can expect.


I think it's path dependence. The kernel has a large number of compile-time configuration options; back in the 90s, it was common to compile your own kernel, after tuning these options to your own particular hardware. It was also common to get new kernel releases (as a source code .tar.gz, or as a patch to the previous source code .tar.gz) directly from kernel.org, instead of waiting for the next release of the distribution you were using. So the time it took to compile a new version of the kernel for your machine was something many Linux users had experience with, and it was clear when a machine was faster (perhaps as a consequence of having previously compiled and installed a newer kernel!) because it took less time to compile the kernel. To turn that into a benchmark was just a matter of standardizing on a kernel release and a set of configuration options.


> With that being said: compiling for an hour is a huge hassle for reviewers.

Can't they just set it to run then go to lunch? Seems like a non-issue.


There's a lot of other tests that reviewers have to do, depending on their audience. Video editing (Adobe Premiere), 3D modeling (Blender), a few video games across genres (maybe an adventure game like Tomb Raider, an FPS like Overwatch, a strategy game like Total War, and a simulation game like Cities: Skylines), and a few other workloads (audio, programmer, CAD, HPC, etc.).

So there's that, but also the time to write these reviews. And finally, you're trying to out-compete everyone and release the review before everyone else so that your website gets more traffic.


It's a beast, but ... that's not a very impressive measure. Bellard strikes again -

From [0], in 2004: TCCBOOT is a boot loader able to compile and boot a Linux kernel directly from its source code. It is only 138 KB big (uncompressed code) and it can compile and run a typical Linux kernel in less than 15 seconds on a 2.4 GHz Pentium 4.

[0] https://bellard.org/tcc/tccboot.html [2004]


I would prefer gccboot (with gcc -march=native -mtune=native) though, because tccboot cannot optimise the output.


Reminds me when they built a (much older, smaller) kernel on a pSeries (POWER architecture) in under 5 seconds: http://es.tldp.org/Presentaciones/200211hispalinux/blanchard...

Gosh, 18 years ago, now I feel really old :D


If you want to keep these compile times, better not rewrite the kernel in Rust.


I think this is the script they use: https://openbenchmarking.org/innhold/9a7355bdb73d85c9e044d02...

On a vanilla HEAD branch, my `time make -s -j32` took 0m56.226s on a 1950X.
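For anyone who wants a comparable number on their own box, the usual sequence is roughly this (a sketch; the config you pick and the `-j` value obviously change the result):

```shell
# Fetch current mainline, generate a default config, and time a parallel build.
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
make defconfig
time make -s -j"$(nproc)"   # -s silences per-file output; -j matches core count
```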


Was recently googling what the kernel folks were using, out of curiosity. GKH mentioned that he had access to 32-core AWS instances for it... 7 years ago.

Still...exciting times for consumers.


Nice. How long to build Chromium?


I don't really understand why this is newsworthy, so maybe someone could help me understand. It's a big, power hungry desktop CPU with incremental performance gains over the last generation.

Why is this important? Is there some kind of architectural breakthrough these CPUs are using? Are these CPUs recovering performance lost to Spectre/Meltdown mitigation?


This is a big, power-hungry desktop CPU with incremental performance gains on the x86/x64 instruction set from someone other than Intel, forcing Intel to compete on pricing and performance.

Just that alone is newsworthy.


I think that's it: it's sort of an intel vs AMD interest piece. It's newsworthy in that it's a market signal, not a technology signal. The desktop CPU market segment is seeing a lot of transformation as secondary purpose built processors and SBCs become more and more of the market, so this is just sort of a fun puff piece.

Thanks for helping me answer the question.


The 3970X almost halves the LLVM compilation time compared to the 2990WX at the same core count. I would call that more than an incremental performance gain.


Why? Isn't that within a modest error margin for the general arc of these? I'm looking at this graph over time and it seems like a linear projection wouldn't be broken by this.

"The next generation of processors is not quite twice as fast as the current generation" just seems like a pretty normal statement to me.


The last decade has been pretty stagnant in CPU development. That AMD has gotten ahead, and is making significant CPU advancements in each generation for the past two years, is something we haven't seen in 10-15 years.


Yep. My top-of-the-line Threadripper 1950X is already multiple generations behind the bleeding edge; I built that machine last year, and there are already newer CPUs blowing it out of the water.

It reminds me of the 90's when Moore's Law was still a Law-with-a-capital-L and I love it.


Other than benchmarks, do you actually feel this? Dev environments have gone to great lengths to reduce the total amount of compilation required, and for nearly every other application a GPU does most of the work. And if you've got a lot of ML to do, you'd probably save more time with a TPU.

I don't really see the point of faster desktop CPUs unless they come with a substantial reduction in power use. I'm a software engineer, and other than when I'm lazy about my Haskell build process and don't offload it to a farm, it doesn't really seem to matter at all. It doesn't come into play with my CAD work, and it doesn't make my Java GC cycles faster.


You're right that the individual tasks don't feel much faster. What's impressive and more relevant to me is that I can do more of them at the same time. I can have Firefox open on a bunch of tabs and a resource-intensive video game or CAD program running and Spotify and Slack and a bunch of Emacs windows and a bunch of terminal and file manager windows and even a couple VMs all at the same time, and can Alt-Tab freely among them (or even put them on separate monitors). My desktop doesn't break a sweat in the process. My previous desktop (with an early-gen i7) could handle a fraction of that under the same operating system (Slackware). My desktops before that under any operating system handled even less, as do most laptops even today.

And that's with 16 cores. The 3990X is gonna have four times that many. That's four times the number of things my computer can be doing at the same time at the same per-core load.

My work laptop, according to htop, is running 263 "tasks" right now (which I assume to be processes+threads). If AMD can in the next few years pull off another quadrupling like they're trying to do with the 3990X, then I'd be very close to being able to give every process and thread on a computer its own x86 core. That's fucking ludicrous.


Has it though? We've been seeing meteoric improvements in low-power applications and in specialized processing packages like GPUs, TPUs and SBCs.

That desktop CPU performance has slowed down is, to me, more a sign of fundamental challenges with the architecture (e.g., gains at the cost of isolation, as in Spectre and Meltdown), not process challenges.

So I think maybe this is story for folks who have interest in AMD vs. Intel, which is reasonable, but it's not particularly exciting except from a vendor diversity standpoint.


Don't take these gains for granted. We're nearing the physical limit of current technologies: transistors a handful of atoms across. My guess is that in 20 years we will celebrate a 5 percent gain.


I'm not sure that's true for other parts of the market. Certainly single board computers and microcontrollers have been seeing incredible gains, as have TPUs and GPUs.

It is cool that we are reaching the limits of our current process, but I was under the impression that in a post-Spectre/Meltdown world, what we're really waiting for is new architectures that support better speed under safe isolation.


Just to note AMD CPUs were not affected by Meltdown in the first place.


LLVM in under 2 minutes is way more impressive. That's a crazy CPU.



I clicked to see discussion on the topic of the thread but you’re linking to threadripper discussion. Disappointed in lack of linux kernel compile time discussion.


Not trying to be rude, but genuinely curious: what else is there to discuss?

Is this a new record, or just a record for the price point? What was the previous record? Will this have a large effect on how the kernel is developed?

For what it's worth, I understand how hard it is to focus on a project when compile times exceed about thirty seconds.


Was thinking people would talk about compile times on their own hardware, or experiences with Intel vs AMD, etc.

It's just that this title is very specific to compile time, whereas the other thread is very general discussion.



