Three is a Crowd: Microsoft Strikes Sweetheart Deal with Mistral while OpenAI Trains GPT-5

Summary Bullets:

• Microsoft has signed a multiyear agreement with startup Mistral to include GenAI models in Microsoft’s Azure AI Studio and Azure Machine Learning model catalog.

• Mistral-Large is a new LLM that introduces native fluency in English, French, Spanish, German, and Italian.

Microsoft has signed a multiyear agreement with GenAI startup Mistral, signaling the hyperscaler’s ambition to branch out from its relationship with trailblazer OpenAI, based in San Francisco, California (US), and explore more diverse opportunities. The alliance makes Mistral the second company to provide commercial large language models (LLMs) on the Microsoft Azure platform. Microsoft also has partnerships with Meta and Hugging Face, whose open-source models are available on Microsoft Azure.

Mistral, based in Paris (France), is a 10-month-old company that launched its latest LLM, Mistral-Large, on February 26, 2024. Although it contains fewer parameters than OpenAI’s GPT-4, the model’s reasoning capabilities and benchmark results make it a powerful contender. Mistral also unveiled a chatbot, Le Chat, powered by the new LLM.

The deal with Microsoft includes a standard computing agreement that allows the startup to use Microsoft Azure’s infrastructure to train and run Mistral’s AI technologies. Such agreements have attracted scrutiny from the FTC in the US, which says the money big companies supposedly invest in smaller AI startups is flowing back to the investing companies. In addition, Mistral’s GenAI models become available to customers of Microsoft’s Azure AI Studio and Azure Machine Learning model catalog as part of the agreement. Microsoft said it will invest EUR15 million ($16.3 million) in Mistral during the term of the deal.

However, Mistral has not been sitting idle; on February 23, 2024, Amazon announced that two of Mistral’s models, Mistral 7B and Mixtral 8x7B, will be made available as part of the Amazon Bedrock services platform. This demonstrates that Mistral is actively pursuing arrangements with other hyperscalers, in the vein of competitors such as Anthropic and Cohere. OpenAI does not have such freedom, being tied exclusively to Microsoft by a gigantic investment agreement, initiated in 2019 and worth around $13 billion.

Mistral-Large introduces several notable features, including native fluency in English, French, Spanish, German, and Italian. It also offers precise instruction-following, allowing developers to design moderation policies. Crucially, the model is proprietary rather than open source, unlike the models previously released by the company, including the ones that will soon be made available via Amazon Bedrock. The capabilities of this commercial release put Mistral squarely in the competitive league of OpenAI’s GPT-4 and Anthropic’s Claude 2.

The Microsoft-Mistral deal lands amid the usual noise surrounding regulatory oversight in AI, predictably attracting the attention of EU antitrust authorities. The European Commission argues that Microsoft is concentrating too much power during the nascent stages of the GenAI market, which is still far from maturity. The argument is not without merit: The deal has the hallmarks of previous (and successful) maneuvering to gain penetration in other technology areas and quash competition.

However, these conversations often miss a more interesting point: the shape, or lack thereof, of regulatory oversight of open-source models. Open-source technologies are more transparent than proprietary ones, allowing companies greater insight into the inner processes of curating the training data. Mistral claims one of its secret ingredients is the method for curating its data, which makes its models more efficient in terms of computing usage requirements (for more, please see Mistral is Not Afraid of OpenAI and has the Open-Source Goods to Prove It, January 25, 2024).

In the meantime, the world patiently awaits the release of GPT-5, which has the potential to quiet the ripples created by these novel techniques in deploying neural networks, techniques that are confounding parameter count and other traditional metrics used to measure model performance. After all, this is not the first time a breakthrough startup has threatened to steal the GenAI crown that belongs (for the time being) to OpenAI. Anthropic was the name on everyone’s lips after the release of Claude 2 just over six months ago, but the notion vanished as fast as you can say ‘ChatGPT Enterprise.’
