Zen 2 Specs Seem to Get Everything Right

May 28, 2019

After the Computex keynote on May 27th, most of the Zen 2 specifications are public knowledge. From what I heard, AMD will give an architectural deep dive in July / August. But judging from the specs, it seems Zen 2 got everything right.

Intel’s 10nm Cannon Lake debacle taught us that not every process shrink is good for raw performance. A shrink in the manufacturing process can certainly help power consumption, but as we go down into 7nm and 5nm territory, it is far from a given that performance will improve with each lithography improvement. In particular, the widgets provided by TSMC would certainly favor simpler units as the process continues to shrink. When we look at IPC improvements, we should credit them to architecture tweaks rather than treat them as natural benefits of the shrink. With that in mind, let’s examine what Zen 2 gives us from the specs.

Floating-point

When comparing Zen with Intel’s more recent cores such as Coffee Lake, people often complained about the lack of AVX-512 support and the neutered AVX2 support. Limited by the 128-bit floating-point width in Zen / Zen+, each 256-bit AVX2 operation was split into two 128-bit halves. With the extra transistors available in 7nm Zen 2, AMD finally added full 256-bit width for AVX2. This is a welcome change for productivity apps such as ffmpeg, Adobe Premiere or DaVinci Resolve.
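
The difference between 128-bit and 256-bit execution can be sketched with some simple arithmetic. This is only a back-of-the-envelope illustration, not AMD-published data: the function names are made up for this post, and the code assumes two FMA pipes per core in both generations and counts one fused multiply-add as two floating-point operations.

```python
import math


def uops_per_vector_op(vector_bits: int, datapath_bits: int) -> int:
    """Internal operations needed to execute one vector instruction."""
    return math.ceil(vector_bits / datapath_bits)


def peak_fp32_flops_per_cycle(datapath_bits: int, fma_pipes: int = 2) -> int:
    """Peak single-precision FLOPs per cycle per core.

    Assumption: `fma_pipes` FMA units per core, each FMA counted as 2 FLOPs.
    """
    lanes = datapath_bits // 32  # 32-bit floats that fit in one datapath
    return lanes * 2 * fma_pipes


AVX2_BITS = 256

# Label -> floating-point execution width in bits (per the discussion above).
designs = {"Zen / Zen+": 128, "Zen 2": 256}

results = {
    name: (uops_per_vector_op(AVX2_BITS, width), peak_fp32_flops_per_cycle(width))
    for name, width in designs.items()
}

for name, (uops, flops) in results.items():
    print(f"{name}: {uops} op(s) per AVX2 instruction, {flops} peak FP32 FLOPs/cycle/core")
```

Under these assumptions, doubling the width halves the number of internal operations per AVX2 instruction and doubles the theoretical peak. Real workloads will see less than that, since they are rarely limited by floating-point throughput alone.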

Cache

It is no coincidence that at 7nm, AMD managed to cram 32MiB of L3 cache per chiplet into Zen 2. Cache is expensive when you could spend those transistors on many other important things (more ALUs, more FPUs etc.). Given the 7nm trade-offs we discussed earlier, it seems far more economical now to use these transistors as memory instead. AMD is not alone. A12 Bionic from Apple doubled its system cache size as well when moving to 7nm. For today’s applications, more cache means fewer memory stalls and better real-world performance.
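
Why a bigger cache translates into fewer stalls can be illustrated with the textbook average-access-time formula. The sketch below is an illustration only: every number in it (hit latency, DRAM penalty, miss rates) is a hypothetical placeholder, not a measurement of Zen 2 or any other chip.

```python
def average_access_time_ns(l3_hit_ns: float, dram_penalty_ns: float, l3_miss_rate: float) -> float:
    """Average latency of an access that reaches L3.

    Every access pays the L3 hit latency; the fraction that misses
    additionally pays the trip to main memory.
    """
    if not 0.0 <= l3_miss_rate <= 1.0:
        raise ValueError("l3_miss_rate must be between 0 and 1")
    return l3_hit_ns + l3_miss_rate * dram_penalty_ns


# Hypothetical figures, chosen only to show the shape of the trade-off.
L3_HIT_NS = 10.0
DRAM_PENALTY_NS = 80.0

# Assumption: the larger L3 lowers the miss rate from 30% to 20%.
smaller_cache = average_access_time_ns(L3_HIT_NS, DRAM_PENALTY_NS, l3_miss_rate=0.30)
larger_cache = average_access_time_ns(L3_HIT_NS, DRAM_PENALTY_NS, l3_miss_rate=0.20)

print(f"smaller L3: {smaller_cache:.1f} ns average")
print(f"larger L3:  {larger_cache:.1f} ns average")
```

Because the DRAM penalty is so much larger than the L3 hit latency, even a modest drop in miss rate moves the average noticeably. How much a bigger cache actually lowers the miss rate depends entirely on the workload.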

Chiplet

The Zen architecture is known for its chiplet design. The chiplet design lets AMD ship the same Zen chip from servers all the way down to Ultrabook laptops. The new Zen 2 design separates IO / Infinity Fabric into another die manufactured at 14nm. This is a cost-saving design, and the rumor on the street is that the new Infinity Fabric can be operated at a different frequency from the main memory. It would still be interesting to benchmark how the separate IO die is going to impact cooperative multi-threaded programs (e.g. complex gaming logic).

Conclusion

We were left with a very positive impression after AMD’s Computex keynote. From a technical standpoint, AMD made all the right decisions in Zen 2. With the right leverage of the 7nm process, it is hard to see how Zen 2 could fail to deliver the performance boost it promised across server, HEDT, desktop and laptop offerings.

Dr Z Today
