Nvidia unveils Nemotron 3 Ultra: America’s smartest open-weights AI model, 30% cheaper to run
Nemotron 3 Ultra, the new flagship AI model, features 500-550 billion parameters
Nvidia has recently unveiled a new AI model with unprecedented capabilities at Computex 2026 in Taipei, Taiwan.
Named Nemotron 3 Ultra, the new flagship AI model is packed with 500-550 billion parameters. It is mainly designed for complex planning, reasoning and agentic workflows.
On efficiency, the model delivers 5x faster inference, promising a significant reduction in cost per inference for enterprises. Nemotron 3 Ultra uses NVFP4 training techniques and a latent mixture-of-experts (MoE) architecture, which improves efficiency by activating only the relevant parts of the network for each task.
Billed as a small but faster open model built for long-running agents, the newly unveiled model tops the US open-weights rankings, outperforming rivals like Gemma 4 31B.
On cost, it is 30 percent cheaper to run on complicated agentic tasks. That matters because tech companies are currently facing an unprecedented surge in AI-driven costs.
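To illustrate the general idea of activating only part of the network, here is a minimal sketch of top-k expert routing, the common pattern behind mixture-of-experts layers. It is not Nvidia's latent MoE implementation, whose details the announcement does not describe; the function names, the toy experts and the router scores are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Weights are renormalized over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts; unselected experts are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical router scores for one input token.
router_logits = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(router_logits, k=2)
output = moe_forward(1.0, experts, router_logits, k=2)
```

In this toy case only two of the four experts run for the input, which is where the compute saving comes from: total parameter count can grow while the work done per token stays roughly tied to k.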
It serves as the new top-tier model, joining the mid-range Super and the lightweight Nano variants. The Super model was launched in March 2026 with 120 billion parameters.
Artificial Analysis partnered with Nvidia to assess the model’s capabilities, including intelligence and speed. On intelligence, Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. That makes it America’s smartest open-weights model to date, ahead of Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33). But it still trails the Chinese open-weights frontier model Kimi K2.6, which scores 54.
It is also served at over 300 tokens per second. Peer models of similar size, such as those from DeepSeek and Moonshot, are generally served at 50-100 tokens per second in the market.
The model is expected to give Nvidia a significant edge in a competitive industry landscape. For instance, Nvidia will be able to offer its tools to companies that are looking to strengthen their position through AI integration.
Moreover, the combination of strong performance, efficiency and significantly lower costs is likely to attract major enterprises looking for cost-effective agentic AI deployments.
During the conference, Nvidia CEO Jensen Huang highlighted the model’s openness for developers building everything from search tools to molecular simulations, paired with new agent deployment tools, as Nemotron 4 looms on the horizon.
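A quick back-of-the-envelope calculation shows what that throughput gap means in practice. The sketch below uses the serving speeds quoted above; the 30,000-token workload is a hypothetical figure chosen for illustration, and 300 tokens per second is treated as a floor since the reported speed is "over 300".

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to stream num_tokens at a steady serving speed."""
    return num_tokens / tokens_per_second


# Hypothetical output length for one long-running agentic task.
OUTPUT_TOKENS = 30_000

ultra_seconds = generation_seconds(OUTPUT_TOKENS, 300)      # reported floor for Nemotron 3 Ultra
peer_fast_seconds = generation_seconds(OUTPUT_TOKENS, 100)  # upper end of the peer range
peer_slow_seconds = generation_seconds(OUTPUT_TOKENS, 50)   # lower end of the peer range

# Speedup relative to each end of the peer range.
speedup_min = peer_fast_seconds / ultra_seconds
speedup_max = peer_slow_seconds / ultra_seconds
```

Under these assumptions the same output takes 100 seconds instead of 300 to 600 seconds, a speedup of roughly 3x to 6x over peers served at 50-100 tokens per second.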
