Ryzen on Linux
It ticks all the boxes on paper, but how well is Ryzen supported, and how well does it perform, on Linux? Jonni Bidwell fires up the APC test bench.
AMD being competitive with Intel once again is an exciting prospect. Free market economics says that should mean better performance and prices for everyone, after all.
At least, once retailers have AMD back in stock.
We’ve seen some impressive benchmarks on the Windows side and we’ve also seen a few shortcomings.
More precisely, we’ve seen a cheap chip that does exceptionally well at professional workloads such as video transcoding, but one that falters a little when it comes to single-core performance and serious gaming, at least compared to Intel’s latest Kaby Lake flagship, the 7700K. But how does it work on Linux? And what of AMD’s wider open-source strategy? Armed with a few review samples, a fresh install of Phoronix Test Suite and an insatiable thirst for filling in spreadsheet data, we give Ryzen the APC once-over.
In the official launch announcement, its makers were keen to extol its performance compared to the top-of-the-line previous-gen Broadwell-E chips. These match the Ryzen 7’s 8-core/16-thread makeup, but also cost well above the budget of many a gaming enthusiast — the top-of-the-line i7 6950X retails in Australia. In many ways, the 7700K is the more natural competitor. Sure, it has half as many cores/threads, but multithreading is hard for heterogeneous workloads like gaming, so this won’t be much of a detriment. The 7700K also happens to cost significantly less than the Ryzen 7 1800X that features in our tests, so we shall make careful comparisons between these two bits of silicon, too. Benchmarking is a dark art, and it’s worth keeping in mind that Linux and Windows benchmarks can differ wildly. Also worth remembering is that new hardware has teething issues — over the coming weeks and months, we will very likely hear tell of things that don’t work as well as they should, and of the resulting fixes.
Our first task was to get a working test bed set up. Fortunately, we’d already done this, using the aforesaid top-of-the-line Ryzen 7 1800X CPU, 16GB of RAM and ASUS’s high-end AM4 motherboard, the ROG Crosshair VI Hero. We started with a fresh install of Ubuntu 16.10, which certainly booted and seemed to work.
However, we encountered spurious segfaults during our kernel compilation tests, which was odd, because other tests worked OK and the machine was certainly stable.
These went away when we used the 4.10 kernel from kernel.ubuntu.com, but that caused other problems: the Nvidia driver doesn’t build against this kernel, and we neglected to mention that our machine also had an Nvidia 1080 in it. So rather than mess around with ugly patching and manual installs, we raided Zak Storey’s bountiful cupboard and purloined a Radeon 470X. Since AMD added a lot of Ryzen-specific code to kernel 4.10 (some of it has been backported to 4.9), we figured we should stick with this version. Rather than keep bolting a mainline kernel onto 16.10, though, we opted to use the second beta of Ubuntu MATE 17.04 so that we could enjoy the general refresh of system packages. As an aside, we should mention that having a modern AMD card meant that we could benefit from the new AMDGPU driver model, which allows you to have entirely open-source video drivers. To ensure this card got the support it deserved, we upgraded Mesa to the version 17 stack (13.1 in the old versioning scheme) from the xorg-edgers PPA.
A great many early Linux Ryzen 1800X benchmarks were released on Michael Larabel’s Phoronix site, based on its test suite, and for the most part, these showed an eminently capable processor that certainly gave Intel a run for its money. We started by rerunning a selection of these, and our results more or less tallied with Michael’s. Check out the table to see the exact values, and check out openbenchmarking.org for details of what the tests involve.
As highlighted by Phoronix, the general picture is of a chip that does well at workloads that scale efficiently over multiple cores and threads, but struggles at certain single-core workloads. The most glaring disappointments were the Himeno Poisson pressure solver (which was less than half as performant as on the 7700K), and FFmpeg, which took longer to decode H.264 video than it did on the i5 4670, Intel’s sweet spot two generations ago.
Chess isn’t really taken seriously as a benchmark, but the Stockfish engine actually provides a reasonable workout for a CPU. It analyses game trees, which branch plentifully (there are often lots of moves to choose from) and evolve in a non-uniform manner.
So there’s plenty of opportunity for parallelisation, but a given position may lead to checkmate quickly, or may open many more doors.
Essentially, there are lots of differently shaped and sized workloads, and getting through them all will be a challenge for any CPU.
As it turns out, the ancient game of chess is not some weird Achilles’ heel of the Ryzen architecture, even though the chip performed marginally worse than the i5 4670 in Phoronix’s benchmarks, a result that ours corroborated. The problem lies in the benchmark itself, which doesn’t feed Stockfish suitable parameters for a many-cored processor, and which ought to measure the rate of the test (nodes/s) rather than the overall time taken, since the number of nodes (positions) searched changes with each run. Some details are available in this post: http://bit.ly/2o5aFK4.
Ultimately, the benchmark is just measuring single-core performance.
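To see why rate is the fairer yardstick, here is a minimal Python sketch. The node counts and timings are made-up illustrative figures, not measurements from our test bench, and nodes_per_second is a hypothetical helper of our own naming.

```python
def nodes_per_second(nodes: int, elapsed_ms: float) -> float:
    """Return the search rate in nodes/s for a node count and elapsed time in ms."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return nodes * 1000.0 / elapsed_ms


# Illustrative figures only (not measured): two runs of the same search
# that happened to visit different numbers of positions.
runs = [
    {"nodes": 40_000_000, "elapsed_ms": 8_000},
    {"nodes": 55_000_000, "elapsed_ms": 11_000},
]

for run in runs:
    rate = nodes_per_second(run["nodes"], run["elapsed_ms"])
    seconds = run["elapsed_ms"] / 1000
    print(f"{seconds:.1f}s elapsed, {rate / 1e6:.1f} Mnodes/s")
```

Judged on elapsed time, the second run looks markedly slower. Judged on rate, the two are identical, which is the point of measuring nodes/s.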
In fact, Ryzen is a fine platform for playing chess. We saw it peak at 8.8Mnodes/s using a hash size of 512MB, 12 threads and a depth of 20. That figure is pretty meaningless without context (other Stockfish benchmarks use different settings), so we ran the same test on the FX8350, the one-time top dog of the previous Piledriver architecture, which peaked at 5.5Mnodes/s. It is interesting to see how the rate scales with thread count.
As you can see, throwing more cores at a problem is not always the best way to solve it quicker.
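For anyone wanting to try a similar thread-count sweep at home, the following Python sketch shows one way to go about it. It is not the script we used. It assumes a stockfish binary on your PATH that accepts "bench <hash> <threads> <depth>" and prints a "Nodes/second" summary line, which may vary between Stockfish versions. The names bench_command, parse_nps and sweep are our own inventions.

```python
import re
import subprocess
from typing import Dict, Iterable, List, Optional

# Assumed format of the summary line printed at the end of a bench run.
NPS_PATTERN = re.compile(r"Nodes/second\s*:\s*(\d+)")


def bench_command(hash_mb: int, threads: int, depth: int,
                  binary: str = "stockfish") -> List[str]:
    """Build the argument list for one bench run (assumed CLI syntax)."""
    return [binary, "bench", str(hash_mb), str(threads), str(depth)]


def parse_nps(output: str) -> Optional[int]:
    """Pull the nodes/s figure out of the bench output, or None if absent."""
    match = NPS_PATTERN.search(output)
    return int(match.group(1)) if match else None


def sweep(thread_counts: Iterable[int], hash_mb: int = 512,
          depth: int = 20) -> Dict[int, Optional[int]]:
    """Run the bench once per thread count and collect the nodes/s figures."""
    results = {}
    for threads in thread_counts:
        proc = subprocess.run(
            bench_command(hash_mb, threads, depth),
            capture_output=True, text=True, check=True,
        )
        # The summary may land on stdout or stderr, so search both.
        results[threads] = parse_nps(proc.stdout + proc.stderr)
    return results
```

Calling sweep([1, 2, 4, 8, 12, 16]) returns a dictionary mapping each thread count to the reported rate, ready for pasting into the spreadsheet of your choice.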
Ryzen is a new architecture and it would be foolhardy to just assume that all existing binaries out there will understand the platform and be able to make best use of it. It’s been noted the ALC1220 audio codec found on a number of AM4 boards will not be supported until Kernel 4.11. That said, the required code will be easily backported, so users of fixed release distros won’t have to wait too long after the 4.11 release in order to have functioning audio. Likewise, at time of writing, the chip’s thermal sensors weren’t yet available in lm_sensors, so we don’t have evidence of how hot under the collar the chip gets.
Also there are some other platform components that lspci can’t identify, describing them only as “Non-Essential Instrumentation”.
This we found amusing.
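If you’re unsure where your own machine stands, a quick Python sketch like the one below compares the running kernel against the version thresholds mentioned in this article. The FEATURES table and helper names are our own, and the check is deliberately naive: it knows nothing about distro backports, so a “no” here doesn’t prove that support is missing.

```python
import platform
import re
from typing import Dict, Tuple

# Minimum mainline kernel versions as quoted in this article.
FEATURES = {
    "Ryzen-specific kernel code": (4, 10),
    "ALC1220 audio codec": (4, 11),
}


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '4.10.0-19-generic'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def supported(release: str) -> Dict[str, bool]:
    """Map each feature to whether the given kernel meets its minimum version."""
    version = kernel_version(release)
    return {name: version >= minimum for name, minimum in FEATURES.items()}


if __name__ == "__main__":
    release = platform.release()
    print(f"Running kernel: {release}")
    for name, ok in supported(release).items():
        print(f"  {name}: {'yes' if ok else 'no (or backported)'}")
```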
In the weeks following Ryzen’s release, reports began to circulate that Windows 10’s scheduler was not being kind to Ryzen, and that disabling Simultaneous Multi-Threading (SMT) features actually improved performance. These rumours have since been firmly debunked by AMD, but do provide this tenuous segue into a kernel patch signed off in early February (http://bit.ly/2mRbh53).
Here, we see an AMD employee contributing kernel code to fix SMT scheduling topology. They probably do this on Windows too, but the process has to take place behind closed doors; with Linux, it all happens in the open. Ryzen-specific optimisations first appeared in GCC 6.1, so those who aren’t afraid of compiling their own programs can try the -march=znver1 switch for extra performance.
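As a rough illustration of how a build script might decide whether the Ryzen switch is safe to use, here is a hedged Python sketch. It assumes you have already captured a version string from your compiler (for example, from gcc -dumpversion). The helper names are hypothetical, and falling back to -march=native on older compilers is our own suggestion rather than anything AMD recommends.

```python
import re
from typing import Tuple


def parse_gcc_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '6.3.0' or '7'."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if not match:
        raise ValueError(f"unrecognised GCC version: {text!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def march_flag(gcc_version: Tuple[int, int]) -> str:
    """Pick -march=znver1 when the compiler is new enough, else fall back."""
    if gcc_version >= (6, 1):
        return "-march=znver1"
    # Assumption: older compilers don't know znver1, so let GCC pick
    # whatever it can detect for the host CPU instead.
    return "-march=native"


# Example with a hard-coded version string rather than a live compiler call.
print(march_flag(parse_gcc_version("6.3.0")))
```

Bear in mind that binaries built with a CPU-specific -march value may not run on other processors, so this is one for your own machine rather than for packages you intend to share.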
This code runs on independent ARM processors that initialise the x86 cores, and it potentially has transparent access to anything the system does from then on. But there are other components and keys that would need to be released before a free boot process can be had (see Libreboot’s call to AMD at https://libreboot.org/amd-libre), and if there is any progress here, it will likely be slow and convoluted. Up until 2014, AMD released the source for its AGESA firmware, so perhaps we can hope for a return to at least this kind of partial openness.