Threadripper 3960X Compiles Linux Kernel in Under 30 Seconds (phoronix.com)
190 points by zzeder on Nov 25, 2019 | hide | past | favorite | 54 comments


I was compiling the kernel for my boxes for most of the '90s and '00s, and the funny thing is that a sub-60-second compile time was always possible with the "latest gear" — like a 100 MHz Pentium. Of course the number of modules you want/need to compile has steadily increased, but it's still funny to see someone impressed with a 30-second compile time 25 years later, with a 100x+ increase in CPU power :)


The kernel has gotten bloated. Back in the day, it could easily run in 8 megabytes of RAM, with plenty left over for the userspace and cache. Nowadays, just the kernel code (before loading any modules) is already bigger than that.


I was one of the first people working on embedded Linux, and I once put the kernel and a user-space program on a device with 512 KB of RAM. It was too tight to do anything really useful with, but it just barely fit. It was no problem doing useful stuff on a 2 MB device (this was in 1999) :)

Though in Linux's defence, it has not bloated nearly as much over this period as, say, Windows — by orders of magnitude. And you aren't required to compile everything in. It's just very tedious (and frankly a bit pointless) to trim it down.


Everybody knows the REAL test is a FreeBSD 'make world' anyway :)


I used to think 'make world' on BSD was some big deal until I started to synthesize HDL code to bitfiles or GDS.


What is GDS in this context?


https://en.wikipedia.org/wiki/GDSII

File format commonly used for wafer masks.


Compiling Firefox or Wireshark or Libreoffice all work the machine harder than a make world.


Wow... I still remember recompiling the NetBSD kernel on my Amiga 3000 to tweak the timings on my Retina Z3. Took 14 hours to compile the kernel...

So if processors are now >1500x faster than they were in 1991, why is it that my Amiga 3000 has a faster, more responsive UI?

Why is it that an ssh session takes 8 seconds to establish?

Why is it that browsing a network share is 10x slower than viewing a file list was on a BBS?


Edit your sshd config, add `UseDNS no` to it, and don't forget to restart the daemon.
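For reference, a minimal sketch of that change (standard OpenSSH config path; the restart command depends on your init system):

```
# /etc/ssh/sshd_config
# Skip the reverse-DNS lookup on incoming connections — a common cause
# of multi-second login delays when DNS is slow or misconfigured.
UseDNS no
```

Then restart the daemon, e.g. `sudo systemctl restart sshd` on systemd systems.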

Either way, I get your point. I remember programming in Delphi, compiling, chatting on IRC, and browsing the net over a 56K Winmodem (all the modulation was software-driven) on a Pentium II, all at the same time — and now I can't even have an IDE and Slack open at the same time.


I'm old enough to remember 14.4k, 28.8k and 56k modems — and while the internet held great wonder, waiting for pages to load and files to download was really frustrating.

Oh, and having to hang up so my parents could make a phone call wasn't fun either!

This at least is something that has got orders of magnitude better, and it's very noticeable. At times I still shake my head in wonder at being able to download a GB in under two minutes on a residential FttC connection.


IDE? Slack? "There's your problem right there."


Something's wrong with your DNS setup, I guess. SSH session takes less than one second to establish for me.


Because not everything is CPU bound.


But all the other bounds have improved massively, too:

- CPUs are multiple orders of magnitude faster

- GPUs are multiple orders of magnitude faster

- RAM is multiple orders of magnitude faster and more abundant

- Disks are multiple orders of magnitude faster and bigger

- Network connections are multiple orders of magnitude faster

Why is it that the end result feels the same?

The answer, of course, is that software complexity has kept pace: software, like a gas, expands until it fills its container.


No, the answer is your ssh configuration. You're hanging on timeouts, either doing DNS resolution or trying multiple auth mechanisms.

Use `ssh -vvv` to find where it hangs.
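Concretely, something like this (hostname is a placeholder), watching for a long pause between consecutive debug lines:

```shell
# Maximum client-side verbosity; a multi-second gap in the output —
# typically during hostname resolution or while an auth mechanism
# (e.g. GSSAPI) times out — pinpoints the stall.
ssh -vvv user@example.com
```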


I wasn't the author of the original comment. SSH ain't a problem for me. I was talking more about system responsiveness in general.


You are getting good replies, but the main answer is... nobody knows (and you aren't alone). There are so many little reasons for slower modern computers, but it seems to come down to increased complexity and, I guess, entropy? It's frustrating.


Regarding your SSH session, have you turned off GSSAPI authentication (assuming you aren't using it)?

See https://coderwall.com/p/fukoew/speed-up-ssh-logon-by-disabli... for some details.
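If that turns out to be the culprit, the client-side fix is a one-line config change (a sketch; you can also scope it to specific hosts instead of `Host *`):

```
# ~/.ssh/config
# Skip the GSSAPI/Kerberos exchange, which can add a long timeout
# on networks where no KDC is reachable.
Host *
    GSSAPIAuthentication no
```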


Sounds like you have a slow network and are using protocols optimized for high throughput on LAN connections.


How many -b bits did you specify for ssh-keygen?

And today’s software is more than 1500x the size.


It is cool to see this level of parallelism coming to commodity hardware. 5-ish years ago a friend of mine had access to an experimental box one of the server vendors had made. Never went to production AFAIK. My memory is that it had 128 cores, 256 threads. One of the things we did on it, of course, was compile the kernel. I don't remember the exact numbers, but recall it being in the single digits number of seconds. This was done with just a ton of Xeon sockets.


We need to move on from the Linux Kernel and start using “compile Unreal Engine” as a CPU benchmark.

https://gpuopen.com/threadripper-for-gamedev-ue4/


That's probably a good point: the Linux kernel is written in C, and C++ is a far more complicated language to parse and compile.

Unreal Engine, Firefox, Chrome... large C++ programs easily take an hour to compile on normal hardware.

With that being said: compiling for an hour is a huge hassle for reviewers. I think reviewers prefer benchmarks that complete in under a minute, rather than over an hour.


That's why I also use the LLVM compilation test (C++) and other workloads besides just the Linux kernel... But it seems the Linux kernel results are always what generates the most interest.


I would bet it's just familiarity. Most people that are interested in PCs have heard of Linux, and know that it has a Kernel. Or they even have already compiled kernels themselves. Whereas LLVM is more of a thing that is known to a subset of software developers.

I'm personally happy to see the LLVM tests, since it gives a good impression on what benefits software developer which work on big native code-bases can expect.


I think it's path dependence. The kernel has a large number of compile-time configuration options; back in the 90s, it was common to compile your own kernel, after tuning these options to your own particular hardware. It was also common to get new kernel releases (as a source code .tar.gz, or as a patch to the previous source code .tar.gz) directly from kernel.org, instead of waiting for the next release of the distribution you were using. So the time it took to compile a new version of the kernel for your machine was something many Linux users had experience with, and it was clear when a machine was faster (perhaps as a consequence of having previously compiled and installed a newer kernel!) because it took less time to compile the kernel. To turn that into a benchmark was just a matter of standardizing on a kernel release and a set of configuration options.


> With that being said: compiling for an hour is a huge hassle for reviewers.

Can't they just set it to run then go to lunch? Seems like a non-issue.


There's a lot of other tests that reviewers have to do, depending on their audience. Video editing (Adobe Premiere), 3D modeling (Blender), a few video games across genres (maybe an adventure game like Tomb Raider, an FPS like Overwatch, a strategy game like Total War, and a simulation game like Cities: Skylines), and a few other workloads (audio, programmer, CAD, HPC, etc.).

So there's that, but also the time to write these reviews. And finally, you're trying to out-compete everyone and release the review before everyone else so that your website gets more traffic.


It's a beast, but ... that's not a very impressive measure. Bellard strikes again -

From [0], in 2004: TCCBOOT is a boot loader able to compile and boot a Linux kernel directly from its source code. It is only 138 KB big (uncompressed code) and it can compile and run a typical Linux kernel in less than 15 seconds on a 2.4 GHz Pentium 4.

[0] https://bellard.org/tcc/tccboot.html [2004]


I would prefer gccboot (with gcc -march=native -mtune=native) though, because tccboot cannot optimise the output.


Reminds me when they built a (much older, smaller) kernel on a pSeries (POWER architecture) in under 5 seconds: http://es.tldp.org/Presentaciones/200211hispalinux/blanchard...

Gosh, 18 years ago, now I feel really old :D


If you want to keep these compile times, better not rewrite the kernel in Rust.


I think this is the script they use: https://openbenchmarking.org/innhold/9a7355bdb73d85c9e044d02...

On a vanilla HEAD branch, my `time make -s -j32` took 0m56.226s on a 1950X.
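For anyone who wants a comparable number on their own box, the usual sequence is roughly this (a sketch; the config you pick and the `-j` value obviously change the result):

```shell
# Fetch current mainline, generate a default config, and time a parallel build.
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
make defconfig
time make -s -j"$(nproc)"   # -s silences per-file output; -j matches core count
```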


Was recently googling what the kernel folks were using, out of curiosity. GKH mentioned that he had access to 32-core AWS instances for it... 7 years ago.

Still...exciting times for consumers.


Nice. How long to build Chromium?


I don't really understand why this is newsworthy, so maybe someone could help me understand. It's a big, power hungry desktop CPU with incremental performance gains over the last generation.

Why is this important? Is there some kind of architectural breakthrough these CPUs are using? Are these CPUs recovering performance lost to Spectre/Meltdown mitigation?


This is a big, power-hungry desktop CPU with incremental performance gains on the x86/x64 instruction set from someone other than Intel, forcing Intel to compete on pricing and performance.

Just that alone is newsworthy.


I think that's it: it's sort of an intel vs AMD interest piece. It's newsworthy in that it's a market signal, not a technology signal. The desktop CPU market segment is seeing a lot of transformation as secondary purpose built processors and SBCs become more and more of the market, so this is just sort of a fun puff piece.

Thanks for helping me answer the question.


The 3970X almost halves the LLVM compilation time compared to the 2990WX at the same core count. I would call that more than an incremental performance gain.


Why? Isn't that within a modest error margin for the general arc of these? I'm looking at this graph over time and it seems like a linear projection wouldn't be broken by this.

"The next generation of processors is not quite twice as fast as the current generation" just seems like a pretty normal statement to me.


The last decade has been pretty stagnant in CPU development. That AMD has gotten ahead, and is making significant CPU advancements in each generation for the past two years, is something we haven't seen in 10-15 years.


Yep. My top-of-the-line Threadripper 1950X is already multiple generations behind the bleeding edge; I built that machine last year, and there are already newer CPUs blowing it out of the water.

It reminds me of the 90's when Moore's Law was still a Law-with-a-capital-L and I love it.


Other than benchmarks, do you actually feel this? Dev environments have gone to great lengths to reduce the total amount of compilation required, and for nearly every other application a GPU does most of the work. And if you've got a lot of ML to do, you'd probably save more time with a TPU.

I don't really see the point of faster desktop CPUs unless they come with a substantial reduction in power use. I'm a software engineer, and other than when I'm lazy about my Haskell build process and don't offload it to a farm, it doesn't really seem to matter at all. It doesn't come into play with my CAD work, and it doesn't make my Java GC cycles faster.


You're right that the individual tasks don't feel much faster. What's impressive and more relevant to me is that I can do more of them at the same time. I can have Firefox open on a bunch of tabs and a resource-intensive video game or CAD program running and Spotify and Slack and a bunch of Emacs windows and a bunch of terminal and file manager windows and even a couple VMs all at the same time, and can Alt-Tab freely among them (or even put them on separate monitors). My desktop doesn't break a sweat in the process. My previous desktop (with an early-gen i7) could handle a fraction of that under the same operating system (Slackware). My desktops before that under any operating system handled even less, as do most laptops even today.

And that's with 16 cores. The 3990X is gonna have four times that many. That's four times the number of things my computer can be doing at the same time at the same per-core load.

My work laptop, according to htop, is running 263 "tasks" right now (which I assume to be processes+threads). If AMD can in the next few years pull off another quadrupling like they're trying to do with the 3990X, then I'd be very close to being able to give every process and thread on a computer its own x86 core. That's fucking ludicrous.


Has it though? We've been seeing meteoric improvements in low-power applications and in specialized processing packages like GPUs, TPUs and SBCs.

That desktop CPU performance has slowed down is, to me, more a sign of fundamental challenges with the architecture (e.g., gains at the cost of isolation, as in Spectre and Meltdown), not process challenges.

So I think maybe this is story for folks who have interest in AMD vs. Intel, which is reasonable, but it's not particularly exciting except from a vendor diversity standpoint.


Don't take these gains for granted. We're nearing the physical limit of current technologies: transistors a handful of atoms across. My guess is that in 20 years we will celebrate a 5 percent gain.


I'm not sure that's true for other parts of the market. Certainly single board computers and microcontrollers have been seeing incredible gains, as have TPUs and GPUs.

It is cool that we are reaching the limits of our current process, but I was under the impression that in a post-Spectre/Meltdown world, what we're really waiting for is new architectures that support better speed under safe isolation.


Just to note AMD CPUs were not affected by Meltdown in the first place.


LLVM in under 2 minutes is way more impressive. That's a crazy CPU.



I clicked to see discussion on the topic of the thread but you’re linking to threadripper discussion. Disappointed in lack of linux kernel compile time discussion.


Not trying to be rude, but genuinely curious: what else is there to discuss?

Is this a new record, or just a record for the price point? What was the previous record? Will this have a large effect on how the kernel is developed?

For what it's worth, I understand how hard it is to focus on a project when compile times exceed about thirty seconds.


Was thinking people would talk about compile times on their own hardware, or experiences with Intel vs AMD, etc.

It's just that this title is very specific to compile time, whereas the other thread is very general discussion.



