CES 2025, Las Vegas — NVIDIA announced NVIDIA Cosmos, a platform comprising generative world foundation models (WFMs), advanced tokenizers, guardrails, and an accelerated video processing pipeline built to advance the development of physical AIs like autonomous vehicles (AVs) and robots.
“The AV data factory flywheel consists of fleet data collection, accurate 4D reconstruction and AI to generate scenes and traffic variations for training and closed-loop evaluation,” said Sanja Fidler, vice president of AI research at NVIDIA. “Using the NVIDIA Omniverse platform, as well as Cosmos and supporting AI models, developers can generate synthetic driving scenarios to amplify training data by orders of magnitude.”
NVIDIA Cosmos
NVIDIA Cosmos is a new part of the equation in AV development. Currently, that development is made possible by three distinct computers:
- NVIDIA DGX systems for training the AI-based stack in the data center
- NVIDIA Omniverse (running on NVIDIA OVX systems) for simulation and synthetic data generation
- NVIDIA DRIVE AGX in-vehicle computer to process real-time sensor data for safety
With Cosmos added to the three-computer solution, developers gain a data flywheel that can turn thousands of human-driven miles into billions of virtually driven miles to help amplify training data quality.
Cosmos world foundation models are a suite of open diffusion and autoregressive transformer models for physics-aware video generation. The models have been trained on 9,000 trillion tokens from 20 million hours of real-world human interaction, environment, industrial, robotics, and driving data.
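To get a feel for that scale, the two quoted figures can be turned into a rough token density. This is only a back-of-the-envelope sketch: it assumes the 20 million figure counts hours of video, and the function names are illustrative rather than part of any NVIDIA tooling.

```python
# Back-of-the-envelope scale check using the two figures quoted in the article.
TOTAL_TOKENS = 9_000 * 10**12  # "9,000 trillion" training tokens
VIDEO_HOURS = 20 * 10**6       # assumption: the "20 million" figure counts hours of video


def tokens_per_hour(total_tokens: int, hours: int) -> float:
    """Average number of training tokens per hour of source data."""
    return total_tokens / hours


def tokens_per_second(total_tokens: int, hours: int) -> float:
    """Average number of training tokens per second of source data."""
    return tokens_per_hour(total_tokens, hours) / 3600


if __name__ == "__main__":
    print(f"{tokens_per_hour(TOTAL_TOKENS, VIDEO_HOURS):.2e} tokens per hour")
    print(f"{tokens_per_second(TOTAL_TOKENS, VIDEO_HOURS):,.0f} tokens per second")
```

Under that assumption, the corpus averages roughly 450 million tokens per hour of data, or about 125,000 tokens per second.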
The models come in three categories:
- Nano – for models optimized for real-time, low-latency inference and edge deployment
- Super – for highly performant baseline models
- Ultra – for maximum quality and fidelity, best used for distilling custom models
Developers can use Cosmos’ open models for text-to-world and video-to-world generation. Versions of the diffusion and autoregressive models are available with between 4 and 14 billion parameters each. Also available are a 12-billion-parameter upsampling model for refining text prompts, a 7-billion-parameter video decoder optimized for augmented reality, and guardrail models to ensure responsible, safe use.
“Developing physical AI models has traditionally been resource-intensive and costly for developers, requiring acquisition of real-world datasets and filtering, curating and preparing data for training,” said Norm Marks, vice president of automotive at NVIDIA. “Cosmos accelerates this process with generative AI, enabling smarter, faster and more precise AI model development for autonomous vehicles and robotics.”
Transportation leaders are using NVIDIA Cosmos to build physical AI for AVs, including:
- Waabi, a company pioneering generative AI for the physical world, will use Cosmos for the search and curation of video data for AV software development and simulation
- Wayve, developing AI foundation models for autonomous driving, is evaluating Cosmos as a tool to search for edge and corner case driving scenarios used for safety and validation
- Foretellix, an AV toolchain provider, will use Cosmos alongside NVIDIA Omniverse Sensor RTX APIs to evaluate and generate high-fidelity testing scenarios and training data at scale
Moreover, Uber is partnering with NVIDIA to accelerate autonomous mobility. Rich datasets from Uber along with the features of NVIDIA Cosmos and NVIDIA DGX Cloud will help AV partners build stronger AI models even more efficiently.
Availability
NVIDIA Cosmos WFMs are available under an open model license on Hugging Face and the NVIDIA NGC catalog. Cosmos will soon be available as fully optimized NVIDIA NIM microservices.
Ram found his love and appreciation for writing in 2015, having started in the gaming and esports sphere at GG Network. He later shifted his focus to the world of tech, which also began his journey of learning more about it. That said, he still has the mentality of "as long as it works" for his personal gadgets.