Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show


Nvidia will spend $26 billion over the next five years to create open-weight artificial intelligence models, according to a 2025 financial filing. Executives confirmed the previously unreported figure in interviews with WIRED.

This significant investment could help Nvidia move from a chipmaker with an impressive software stack to a true frontier lab capable of competing with OpenAI and DeepSeek. It’s a strategic move that could further cement Nvidia’s place as the world’s leading AI chipmaker, since the models are tailored to the company’s hardware.

Open-weight models are those whose weights, the parameters that determine a model’s behavior, are made public, sometimes along with details of their architecture and training. This allows anyone to download the models and run them on their own machine or in the cloud. In Nvidia’s case, the company also reveals the technical innovations involved in building and training its models, making it easier for startups and researchers to modify and build upon its work.

On Wednesday, Nvidia also released Nemotron 3 Super, its most capable open-weight AI model to date. The new model has 128 billion parameters (a measure of model size and complexity), making it roughly equivalent to the largest version of OpenAI’s GPT-OSS, although the company says it outperforms GPT-OSS and other models in several benchmarks.

Specifically, Nvidia claims that Nemotron 3 Super received a score of 37 on the Artificial Analysis Intelligence Index, which evaluates models on 10 different criteria. GPT-OSS scored 33, though several Chinese models scored higher. Nvidia also says that Nemotron 3 Super was privately tested on PinchBench, a new benchmark that evaluates a model’s ability to control OpenClaw, and ranked first.

Nvidia also detailed a number of technical innovations used to train Nemotron 3, including architectural and training techniques that improve the model’s reasoning capabilities, handling of long contexts, and responsiveness to reinforcement learning.

“Nvidia is taking the development of open models much more seriously,” said Bryan Catanzaro, vice president of applied deep learning research at Nvidia. “And we’re making a lot of progress.”

Open frontier

Meta was the first major AI company to release an open-weight model, Llama, in 2023. However, CEO Mark Zuckerberg recently overhauled the company’s AI efforts and signaled that future models might not be completely open. OpenAI offers an open model, called GPT-OSS, but it is inferior to the company’s best proprietary offerings and poorly suited to modification.

The best American models, from OpenAI, Anthropic, and Google, are accessible only via the cloud or a chat interface. In contrast, the weights of many major Chinese models, from DeepSeek, Alibaba, Moonshot AI, Z.ai, and MiniMax, are published openly and for free. As a result, many startups and researchers around the world currently rely on Chinese models.

“It’s in our interest to help the ecosystem grow,” says Catanzaro, who joined Nvidia in 2011 and helped lead the company’s transition from making graphics cards for gaming to making silicon for AI. Nvidia released the first Nemotron model in November 2023 and has since released a range of specialized models for use in areas such as robotics, climate modeling, and protein folding. Catanzaro adds that Nvidia recently completed pre-training a 550-billion-parameter model. (Pre-training involves feeding huge amounts of data into a model distributed across a large number of specialized chips operating in parallel.)

Kari Briski, vice president of enterprise generative AI software, says Nvidia’s future AI models will help the company improve not only its chips but also the supercomputer-scale data centers it builds. “We’re building it to expand our systems and test not just compute, but also storage and networking, and to kind of build out our hardware architecture roadmap,” she says.
