Microsoft Releases First In-house Microprocessor Architecture Designed for GenAI Workloads

B. Valle

Summary Bullets:

• The Azure Maia 100 and Cobalt 100 chips are the first two custom silicon chips designed by Microsoft for its cloud infrastructure.

• Microsoft is looking for alternatives to expensive Nvidia chips, following the lead of cloud rivals Amazon and Google, which released their own chips years ago.

In the kerfuffle surrounding Microsoft’s involvement in OpenAI’s boardroom saga last week, some of the most salient news out of Microsoft’s Ignite event got lost in the news cycle. However, the announcement that Microsoft is coming to market with its own proprietary microprocessor technology is a big deal for the industry. The company is the last of the big three US hyperscalers to launch bespoke AI chips. Google was the pioneer, introducing its TPU architecture in 2016. Amazon followed in 2018 with its own AI accelerators, starting with Inferentia and later adding Trainium. Amazon also has the ARM-based Graviton series of CPUs for general-purpose workloads. Google, which released the fifth generation of its TPU chips during Google Cloud Next 2023, is also rumored to be developing ARM-based processors. Google already supports virtual machines powered by ARM-based Ampere Altra chips, but doesn’t have its own proprietary technology like Amazon’s in-house ARM-based CPUs.

The move by Microsoft was a long time coming. Since ChatGPT went viral following its launch in late 2022, the explosion in GenAI has driven demand for high-performance infrastructure and growing success for companies such as Nvidia. As a result, both Amazon and Google have cranked up their in-house silicon production, and Microsoft was under tremendous pressure to follow suit. Despite much effort by Intel and AMD to compete with Nvidia, the GPU maker continues to reign supreme in the global AI stakes as more and more companies rely on it for computing power. Demand for Nvidia’s A100, H100, and forthcoming H200 chips is higher than ever, pushing up prices and constraining supply, and Microsoft is trying to end or at least limit its dependence on the chipmaker. Nvidia recently unveiled the GH200, which has the same GPU as the H100, Nvidia’s current highest-end AI chip, but pairs it with 141 gigabytes of memory, as well as a 72-core ARM central processor.

Microsoft is highly invested in the new wave of GenAI technologies, having made Copilot a cornerstone of its strategy and infused the chatbot into every one of its main applications. Microsoft’s two new processors, the Cobalt 100 ARM server CPU for general-purpose tasks and the Maia 100 AI accelerator, will be deployed in Azure data centers next year, supporting services including Azure OpenAI Service and Copilot. With development of GPT-5 reportedly already under way, Microsoft’s ability to deploy its own chips looks strategic.

The company says that it has focused on efficiency, investing in making its data centers as efficient as possible to maximize turnover per rack. This is good news for the company and also for the environment. Cloud computing providers have seen their profit margins narrow as a result of the inflationary spiral affecting electricity prices globally. In addition, despite their tremendous popularity, GPUs were not created with AI workloads in mind but for graphics processing in video games, and the alternative technologies brought to market by cloud providers worldwide, while still not as powerful, are increasingly sophisticated. This growing competition has to be good news for the industry (if perhaps not so much for Nvidia). The forthcoming AMD Antares Instinct MI300X and MI300A GPU accelerators are expected to add further pressure to the competitive landscape.
