Nvidia unveils Nemotron 3 Ultra: America’s smartest open-weights AI model, 30% cheaper to run
Nemotron 3 Ultra, the new flagship AI model, features 500-550 billion parameters
Nvidia has unveiled a new flagship AI model at Computex 2026 in Taipei, Taiwan.
Named Nemotron 3 Ultra, the model has 500-550 billion parameters and is designed mainly for complex planning, reasoning and agentic workflows.
On efficiency, the model delivers 5x faster inference, which promises a significant reduction in cost per inference for enterprises. Nemotron 3 Ultra uses NVFP4 training techniques and a latent mixture-of-experts (MoE) architecture, which improves efficiency by activating only the relevant parts of the network for each task.
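To illustrate the general idea behind mixture-of-experts routing, here is a minimal Python sketch. It is not Nvidia's implementation, and it does not model the "latent" variant or NVFP4. The toy experts, the router scores and the function names are all hypothetical, chosen only to show how a router can pick a few experts and leave the rest idle.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    selected = route_top_k(router_scores, k)
    output = sum(w * experts[i](x) for i, w in selected)
    return output, [i for i, _ in selected]

# Hypothetical toy experts: each one simply scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical router scores for a single input.
router_scores = [0.1, 2.0, -1.0, 1.5]

output, used = moe_forward(10.0, experts, router_scores, k=2)
print(used)  # indices of the experts that actually ran
```

In this toy setup only two of the four experts are evaluated for the input, which is the source of the compute savings that MoE designs aim for. Real models route inside each layer and learn the router scores during training.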
Built as a faster open model for long-running agents, Nemotron 3 Ultra tops the US open-weights rankings, outperforming rivals such as Gemma 4 31B.
On cost, it is 30 percent cheaper to run on complex agentic tasks. That matters because tech companies are facing an unprecedented surge in AI-driven costs.
It serves as the new top-tier model, joining the mid-range Super and the lightweight Nano variants. The Super model was launched in March 2026 with 120 billion parameters.
Artificial Analysis partnered with Nvidia to assess the model's capabilities, including intelligence and speed. On intelligence, Nemotron 3 Ultra scores 48 on the Artificial Analysis Intelligence Index. That makes it America's smartest open-weights model to date, ahead of Gemma 4 31B (39), Nemotron 3 Super (36) and gpt-oss-120b (33). It still trails the Chinese open-weights frontier model Kimi K2.6, which scores 54.
It is also served at more than 300 tokens per second. Peer models of similar size, such as those from DeepSeek and Moonshot, are generally served at 50-100 tokens per second in the market.
The model is expected to give Nvidia a significant edge in a competitive industry. For instance, Nvidia will deploy its tools for companies that are looking to strengthen their position through AI integration.
Moreover, the combination of strong performance, efficiency and significantly lower costs is likely to attract major enterprises looking for cost-effective agentic AI deployments.
During the conference, Nvidia CEO Jensen Huang highlighted the model's openness for developers building everything from search tools to molecular simulations. It is paired with new agent deployment tools, and Nemotron 4 looms on the horizon.
