Five years ago, AMD was hanging on by a thread. Sales had dropped below $1 billion per quarter. Its client and server CPUs were no longer competitive with Intel’s. Its Opteron server-CPU market share was less than one percent. Its GPU products were a little better but Nvidia had the mindshare.
Then two things happened: Dr. Lisa Su ascended to the CEO position, and the vendor developed the Zen microarchitecture, a clean-sheet, from-scratch x86 processor design.
The result? Epyc’s share of the server-processor market is now estimated at between 10 per cent (Mercury Research) and 16 per cent (Omdia). The AMD Ryzen desktop processor is the CPU of choice for gamers. And in Q3 of 2021, AMD reported sales of $4 billion, more in one quarter than it made in all of fiscal 2015 ($3.9 billion).
Silicon Valley has had some notable comebacks, but few could surpass the turnabout in fortune AMD has experienced.
“They’ve been consistently on a tear of making material market-share gains, and I think their biggest opportunity that they see is in the data centre,” said Daniel Newman, principal analyst with Futurum Research. “And of course the company has benefited from a number of struggles from Intel, and they’ve been able to capitalise on those struggles very successfully.”
Much credit is given to Su. She holds a Ph.D. in electrical engineering from MIT and worked at IBM, Texas Instruments, and Freescale before joining AMD in 2012, so her qualifications are unimpeachable. “Lisa Su has done a tremendous job at the helm of the company, seemingly making all the right moves at the right times,” said Newman.
“She’s one of the best tech CEOs in the industry hands down,” said Jim McGregor, principal analyst at Tirias Research. “However, I’m not just going to give her credit. I’m also going to give [AMD CFO] Devinder Kumar credit because the two of them really saved AMD.”
During the lean times, Kumar did a stellar job of restructuring debt, pushing out repayments as long as possible and retiring debt when he could. He also found other cost savings, such as selling AMD’s Austin, Texas, facility and then leasing it back, which brought in some cash.
The other half of the puzzle was the Zen architecture. AMD’s old designs were at best average compared to Intel’s, but with Zen, AMD took a giant leap in performance while pricing its chips well below equivalent Intel Xeons.
When AMD began designing Zen, it basically started from scratch and looked at not just what it had done in CPU design, but also at what everyone else from Intel to Nvidia to Arm to IBM had done with processor architectures.
McGregor said AMD took a completely new approach to the architecture, to memory, to the pipeline structure, and made additions such as simultaneous multithreading, the technique Intel markets as Hyper-Threading. “I think the best way to describe Zen is they took the best pieces of CPU architecture development that had been done and said, ‘OK … let’s start from scratch and develop the best architecture we possibly can,’” he said.
Historically, AMD has had a lot of firsts over Intel. It was first to 64-bit x86, first to dual core, first to put the memory controller on the CPU die, and first to integrate a GPU with a CPU. With Zen came another first: the multi-chip module (MCM), a design less prone to fabrication flaws.
Chip manufacturing is a difficult process, and the more cores you add, the more transistors per CPU, which means more opportunities for failure or errors. Instead of building one large die, AMD designed chiplets, small dies with eight cores each, and put four of them in a package for 32 cores, connected by a high-speed link. As in the past, Intel was initially dismissive of the design change, but now “even Intel admitted that is required going forward,” said McGregor.
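The yield argument can be made concrete with a rough calculation. The sketch below uses the simple Poisson yield model, in which the chance that a die is defect-free falls exponentially with its area. This is a textbook approximation, not AMD’s or any foundry’s actual model, and the defect density and die areas are illustrative assumptions rather than figures from AMD.

```python
import math


def poisson_yield(defects_per_cm2: float, die_area_mm2: float) -> float:
    """Fraction of dies expected to be defect-free under the Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 = 1 cm^2
    return math.exp(-defects_per_cm2 * area_cm2)


# Illustrative assumptions only, not AMD or foundry data.
DEFECT_DENSITY = 0.2     # defects per cm^2 (hypothetical)
SMALL_DIE_MM2 = 200.0    # area of one eight-core die (hypothetical)
CORES_PER_DIE = 8
DIES_PER_PACKAGE = 4

# A single large die holding the same 32 cores would need about 4x the area.
MONOLITHIC_MM2 = SMALL_DIE_MM2 * DIES_PER_PACKAGE

small_yield = poisson_yield(DEFECT_DENSITY, SMALL_DIE_MM2)
mono_yield = poisson_yield(DEFECT_DENSITY, MONOLITHIC_MM2)
cores = CORES_PER_DIE * DIES_PER_PACKAGE

print(f"{cores}-core part built from {DIES_PER_PACKAGE} small dies")
print(f"defect-free small dies:      {small_yield:.1%}")
print(f"defect-free monolithic dies: {mono_yield:.1%}")
```

Under these assumptions roughly two thirds of the small dies come out clean, against about one in five of the large ones. Because each small die can be tested before it is packaged, one defect scraps a quarter as much silicon as it would on the big die.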
AMD has launched three generations of Epyc server processors based on the Zen architecture. Milan-X, introduced in early November, features up to 64 cores and, in the 64-core design, 768MB of total L3 cache per chip. It uses 3D stacking for the cache, which reduces its sprawl, resulting in less latency. AMD says this high-capacity cache can speed up workloads by as much as 50 per cent over previous-generation chips.
AMD has announced Zen 4, the next-generation Zen, which will debut in 2022 with the introduction of the Genoa family of Epyc processors. Zen 4 will support DDR5 memory, which is considerably faster than DDR4 while drawing the same amount of power, and PCI Express Gen 5, which doubles the transfer rate over Gen 4. Genoa will be built by TSMC using a 5nm process and sport up to 96 cores.
AMD also announced a new line called Bergamo, with a special version of the core architecture called Zen 4c, optimised for cloud-native applications and featuring 128 high-performance cores. Bergamo is on track to ship in the first half of 2023.
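To put the PCI Express doubling in numbers, here is a back-of-the-envelope sketch. The per-lane signalling rates (16 GT/s for Gen 4, 32 GT/s for Gen 5) and the 128b/130b line encoding come from the published PCI Express specifications, not from AMD’s announcement. The 16-lane link width is an assumption chosen for illustration, and the result is a theoretical ceiling that ignores protocol overhead.

```python
# Raw signalling rate per lane in gigatransfers per second.
GT_PER_S = {4: 16.0, 5: 32.0}

# Both generations send 128 payload bits in every 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def lane_gbytes_per_s(gen: int) -> float:
    """Theoretical one-direction bandwidth of a single lane, in GB/s."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8  # 8 bits per byte


def link_gbytes_per_s(gen: int, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth of a link with `lanes` lanes."""
    return lane_gbytes_per_s(gen) * lanes


gen4_x16 = link_gbytes_per_s(4)
gen5_x16 = link_gbytes_per_s(5)

print(f"Gen 4 x16: {gen4_x16:.1f} GB/s")  # about 31.5 GB/s
print(f"Gen 5 x16: {gen5_x16:.1f} GB/s")  # about 63.0 GB/s
print(f"ratio:     {gen5_x16 / gen4_x16:.1f}x")
```

Real-world throughput will be lower than these figures, but the two-to-one ratio between the generations holds.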
AMD in 2021
So AMD is back as a competitor and no longer at risk of bankruptcy, but it still has work to do, said Forrest Norrod, senior vice president and general manager of the data centre division at AMD.
“I think we’re executing well on the data centre strategy that we put in place six, seven years ago,” he said. “I think we’ve hit an inflection point with customer perception and acceptance of AMD, which was our objective. With the second, third generation [of Epyc], we also dealt with the questions around could AMD be a reliable partner for the long term. And I think we’ve checked all those boxes.”
AMD already makes CPUs and GPUs and is lined up to buy FPGA technology when its acquisition of Xilinx goes through, expected by the end of 2021. But it still has gaps: network acceleration of the kind Intel and Nvidia offer; a mobile line; and specialty AI processors. And don’t expect them any time soon, as Norrod said the company will be cautious in how it expands.
“The last thing I want to do is get spread too thin,” he said. “I don’t ever want to try to do eight things [at once] when you’ve got the capability of doing four things in a world-class way. When you stretch yourself too thin, you do eight things poorly. We’re gonna stay focused and disciplined and for the markets that we participate in, we want to always produce world-class products.”
Compared to Intel, AMD is missing two key pieces of silicon: a network accelerator and persistent memory. Intel’s persistent memory, called Optane, has speed close to that of standard DRAM but the storage capabilities of SSDs. It sits between memory and storage and serves as a cache.
It’s hard to say whether Optane is a threat to AMD, according to Newman. “I think Optane is a strong product that certainly provides the users with some benefits,” he said.
“But without knowing some of the specific numbers about Optane wins and the actual revenue numbers for specific Optane products, I just don’t have an idea if it’s actually gaining momentum, or if it’s more of a philosophical advantage,” he said.
Norrod says AMD has no need to compete against Optane. “I think that our memory partners are doing a great job pushing forward on memory technology, and we’re going to support open industry standards for non-volatile memory,” he said.
Network accelerators, sometimes called smartNICs, are a different story. They offload packet processing from the CPU, and in addition to Intel, Nvidia and Marvell also have network-accelerator offerings. Newman says there is a strong need for their ability to maximise the utility of general-purpose CPUs to run optimised workloads, and AMD can’t ignore it. “We’re at a point now where we’re trying to optimise every bit of our computing architectural resources. So yeah, I think that’s a gap that they’re either gonna need to fill in on their own or through partnership,” he said.
But rather than make a network processor, Norrod believes the Xilinx FPGA technology will allow for a different level of processing. “It’s really more about adaptive computing for data-flow oriented workloads,” he said.
“There’s a lot of cryptographic applications, networking applications, security applications, and storage that benefit from [FPGA] as well. And so there’s a lot of interesting areas that we think the Xilinx technology brings to the table.”
Boosting the GPU business
AMD is proving the old adage about building a better mousetrap both true and false at the same time. The Zen-powered Ryzen client CPUs for gaming and Epyc server CPUs have brought in a great deal of new business, but on the GPU side, Nvidia is proving an obstinate foe.
Although AMD’s Radeon cards are on par with Nvidia GPUs, Nvidia outsells Radeon almost two-to-one, according to Jon Peddie Research, which focuses on the graphics market.
In AI and high-performance computing, it’s even more lopsided. In the most recent TOP500 list of supercomputers, just one machine had an AMD co-processor to its CPUs vs. 143 with an Nvidia co-processor. When you say AI, the association is with Nvidia, not AMD.
“We’re cognisant of the fact that we’re up against a great competitor,” said Norrod. “Nvidia has done a masterful job building out the software ecosystem around their GPUs, really focused on AI.”
But don’t expect a head-on confrontation. “Just charging forward and trying to duplicate everything that they’re doing is a fool’s game. You’re attacking an entrenched competitor by trying to copy what they’re doing.”
McGregor said Nvidia’s real success hasn’t been on the silicon side but on the software side, starting with the CUDA programming platform and building from there.
“Nvidia has done an incredible job with their narrative,” he said. “Whether it’s been for gaming or for enterprise conversations, they’ve really built these software frameworks. And I think if AMD wants to play in that space, they have to sort of maybe change the language, get away a little bit from architectures, and be a little bit more focused on software,” he said.
One thing that has hamstrung AMD in gaining traction with HPC and AI is that there was only one alternative to CUDA that could be used on its GPUs: OpenCL, an open standard for GPU programming. It was developed by Apple and donated to a standards organisation called the Khronos Group.
About five years ago, Apple walked away from OpenCL, which left OpenCL languishing. Khronos released OpenCL 3.0 last year after a three-year lag from version 2.2, so AMD can’t really rely on it.
AMD does have a tool called HIP that converts CUDA code to a vendor-neutral C++ dialect, and a lot of customers have been able to take very large CUDA projects and move them over to AMD GPU hardware in a span of days. But just converting the competitor’s code still leaves AMD lacking a platform on which to build.
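Much of that conversion is mechanical, because HIP’s runtime calls deliberately mirror CUDA’s names. The toy translator below is a sketch of the idea only. It is not AMD’s actual HIPIFY tooling, and the function name toy_hipify and the sample snippet are hypothetical. The handful of name pairs in the table are real CUDA and HIP identifiers, but a real port covers far more of the API and needs more than text substitution.

```python
import re

# A small sample of CUDA runtime names and their HIP counterparts.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Try longer names first so cudaMemcpyHostToDevice is not cut short
# by the shorter cudaMemcpy entry.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Rename the CUDA identifiers listed in CUDA_TO_HIP (hypothetical helper)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


# Hypothetical host-side fragment, used only to exercise the translator.
cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

hip_snippet = toy_hipify(cuda_snippet)
print(hip_snippet)
```

The ease of this renaming is also why conversion alone is a limited strategy: the code stays shaped by CUDA’s design, which is the platform gap the article describes.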
Norrod said AMD does plan to work on the software side of things. “One of the things I constantly tell my team is nobody wants to buy a server. Nobody wants to buy a GPU. They want to run an application. So yeah, we are investing tremendously in software,” he said.
AMD had tremendous momentum in 2005-2006, then lost it with the mess that was the launch of its Barcelona processor. It was late, underperformed, and was buggy. Chips should not have bugs. While none of the current top executives were at AMD at the time, they felt its impact.
At the time, Norrod was running Dell’s server business, and AMD’s current CTO Mark Papermaster was heading up IBM’s x86 server business. They are not going to make a similar mistake; AMD plans to grow cautiously.
Norrod says AMD regards making processors for workstations and cars as attractive new markets for the company. “I think those are all fantastic opportunities,” he said. “Those are all ones where you’ve got so much opportunity to grow, but we want to make sure we do a great job in each one of those before we get too far afield.”