Nvidia Becomes a Major Model Maker With Nemotron 3

Nvidia has made a fortune supplying chips to companies working on artificial intelligence, but today the chipmaker has taken a step toward becoming a more serious model maker itself by releasing a series of cutting-edge open models, along with data and tools to help engineers use them.
The move, which comes at a time when AI companies like OpenAI, Google and Anthropic are developing their own increasingly capable chips, could provide a hedge against these companies moving away from Nvidia’s technology over time.
Open models are already a crucial part of the AI ecosystem, with many researchers and startups using them to experiment, prototype, and build. Even though OpenAI and Google offer small open models, they don’t update them as frequently as their Chinese rivals do. For this and other reasons, Chinese companies’ open models are currently much more popular, according to data from Hugging Face, a hosting platform for open source projects.
Nvidia’s new Nemotron 3 models are some of the best that can be downloaded, modified, and run on your own hardware, according to benchmark scores shared by the company ahead of release.
“Open innovation is the foundation of AI progress,” CEO Jensen Huang said in a statement ahead of the announcement. “With Nemotron, we are transforming advanced AI into an open platform that provides developers with the transparency and efficiency they need to build agentic systems at scale.”
Nvidia is taking a more transparent approach than many of its U.S. competitors by publishing the data used to train Nemotron, which should help engineers modify the models more easily. The company is also releasing tools to facilitate customization and fine-tuning. The release also includes a new hybrid latent mixture-of-experts model architecture, which Nvidia says is particularly effective for building AI agents that can perform actions on computers or the web. The company is also launching libraries that allow users to train agents using reinforcement learning, which involves giving models simulated rewards and punishments.
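For readers unfamiliar with the term, the general idea behind a mixture-of-experts layer can be sketched in a few lines of Python. This is a generic, textbook-style illustration and not Nvidia's architecture: the function names, the toy experts, and the router scores are all hypothetical, and in a real model the scores would come from a learned router rather than being passed in by hand.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs.

    Assumption: router_scores are supplied directly; a real model would
    compute them from the input token with a learned router.
    """
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical toy experts standing in for expert sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x ** 2]
scores = [0.1, 2.0, -1.0, 1.5]

print(route_top_k(scores, k=2))        # which experts run, and their weights
print(moe_layer(3.0, experts, scores))  # blended output of the chosen experts
```

Because only k of the experts run for any given input, a model built this way can hold many parameters while doing far less computation per token than its total size suggests.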
Nemotron 3 models are available in three sizes: Nano, which has 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model’s parameter count loosely corresponds to how capable it is as well as how unwieldy it is to run. The largest models are so bulky that they must run on racks of expensive hardware.
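A rough back-of-the-envelope calculation shows why size matters so much. The sketch below uses the parameter counts quoted above and assumes each weight is stored in either 16 bits (2 bytes) or 4 bits (0.5 bytes). Those precisions are illustrative assumptions, not details from Nvidia, and the estimate covers the weights alone, ignoring the extra memory needed while the model is running.

```python
# Parameter counts (in billions) as quoted in the article.
NEMOTRON_3_PARAMS_BILLIONS = {"Nano": 30, "Super": 100, "Ultra": 500}

def weight_memory_gb(params_billions, bytes_per_param=2.0):
    """Approximate gigabytes needed just to hold the model weights.

    Assumption: every parameter takes bytes_per_param bytes
    (2.0 for 16-bit storage, 0.5 for 4-bit storage).
    """
    return params_billions * 1e9 * bytes_per_param / 1e9

for name, size in NEMOTRON_3_PARAMS_BILLIONS.items():
    at_16_bit = weight_memory_gb(size, 2.0)
    at_4_bit = weight_memory_gb(size, 0.5)
    print(f"{name}: ~{at_16_bit:.0f} GB at 16-bit, ~{at_4_bit:.0f} GB at 4-bit")
```

Under these assumptions, the weights of the 500-billion-parameter model alone come to roughly 1,000 GB at 16-bit precision, which is consistent with the point that the largest models have to be spread across many accelerators.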
Model Foundations
Kari Ann Briski, vice president of enterprise generative AI software at Nvidia, said open models are important to AI builders for three reasons: builders increasingly need to customize models for particular tasks; it often helps to route queries to different models; and it is easier to get more intelligent responses out of these models after training by having them perform a kind of simulated reasoning. “We believe open source provides the foundation for AI innovation and continues to accelerate the global economy,” Briski said.
Social media giant Meta released the first advanced open models under the Llama name in February 2023. However, as competition has intensified, Meta has signaled that its future releases may not be open source.
The move is part of a broader trend in the AI industry. Over the past year, American companies have moved away from openness, becoming more secretive about their research and more reluctant to inform rivals of their latest technical tricks.


