Nvidia推出Nemotron 3 Super,這是一款專為Agent應用設計的開源混合型大語言模型
Nvidia推出Nemotron 3 Super,這是一款專為Agent應用設計的開源混合型大語言模型。該模型擁有1200億參數(其中120億為活躍參數),採用Mamba-Transformer混合架構搭配混合專家模組,具備原生百萬token的上下文窗口。
效能提升:相比前一代,Nemotron 3 Super在吞吐量上提升5倍,準確度增長2倍。其潛在型混合專家機制能在相同運算成本下調用4倍的專家數量。多token預測功能則能大幅縮短長序列生成時間,實現內建推測解碼。
架構設計採用混合Mamba-Transformer骨幹網絡,各層各司其職:
- Mamba層提供線性時間複雜度處理長序列
- Transformer注意力層負責精確檢索
- 混合專家層在保持低延遲的同時擴展有效參數量
原生NVFP4預訓練格式針對Nvidia Blackwell最佳化,相比FP8推論速度提升4倍,同時降低記憶需求。
基準測試:在PinchBench基準測試中,Nemotron 3 Super得分85.6%,成為同級最佳開源模型。模型完全開放,包含權重、資料集和訓練配方,使用者可在本地基礎設施上自訂部署。目前已在主流推論平台上線,以NVIDIA NIM形式提供,並可透過API、OpenRouter或nvidia.com取用。
Introducing NVIDIA Nemotron 3 Super 🎉
— NVIDIA AI Developer (@NVIDIAAIDev) March 11, 2026
Open 120B-parameter (12B active) hybrid Mamba-Transformer MoE model
Native 1M-token context
Built for compute-efficient, high-accuracy multi-agent applications
Plus, fully open weights, datasets and recipes for easy customization and… pic.twitter.com/kMFI23noFc
This latest addition to the Nemotron family isn't just a bigger Nano.
— NVIDIA AI Developer (@NVIDIAAIDev) March 11, 2026
✅ Up to 5x higher throughput and 2x accuracy than the previous version
✅ Latent MoE that calls 4x as many expert specialists for the same inference cost⁰
✅ Multi-token prediction that dramatically reduces… pic.twitter.com/18KgqdN0H4
🦞These innovations come together to create a model that is well suited for long-running autonomous agents.
— NVIDIA AI Developer (@NVIDIAAIDev) March 11, 2026
On PinchBench—a benchmark for evaluating LLMs as @OpenClaw coding agents—Nemotron 3 Super scores 85.6% across the full test suite, making it the best open model in its… pic.twitter.com/01R0oImsJb
“NVIDIA Nemotron 3 Super: The new leader in open, efficient intelligence”https://t.co/JN3iEX3A35
— NVIDIA AI Developer (@NVIDIAAIDev) March 11, 2026
Ready to get started?
— NVIDIA AI Developer (@NVIDIAAIDev) March 11, 2026
Nemotron 3 Super supports deployment across environments, from workstations to the cloud, and can be accessed through API, OpenRouter, or https://t.co/fC1rz1G9c4.
It is now live and available on major inference platforms, packaged as NVIDIA NIM:
📥…
