At the GPU Technology Conference (GTC) this week, NVIDIA made a slew of announcements highlighting how the company is making it easier than ever for developers to build and deploy generative AI applications at scale. New offerings include powerful computing platforms optimized for AI workloads, cloud services to access NVIDIA infrastructure and software, and microservices and APIs to simplify development.
“Generative artificial intelligence is the defining technology of our time. Blackwell is the driver of this new industrial revolution,” said Jensen Huang, founder and CEO of NVIDIA. “Working with the world’s most dynamic companies, we will realize the promise of AI for every industry.”
Blackwell GPU architecture powers next-generation AI computing
Headlining the announcements was the new Blackwell GPU architecture, NVIDIA’s next-generation platform for accelerated computing and generative AI. Blackwell introduces several innovations to enable trillion-parameter AI models, including a unified 208-billion-transistor GPU, a second-generation Transformer Engine, and fifth-generation NVIDIA NVLink for high-speed GPU-to-GPU interconnects.
The Blackwell architecture delivers 2.5x the FP8 training performance of the previous-generation NVIDIA Hopper GPUs. For inference and content generation, Blackwell delivers up to 30x faster performance on large language models. This leap in performance will enable developers to create and run more sophisticated AI models than ever before.
“Blackwell offers huge leaps in performance and will accelerate our ability to deliver cutting-edge models,” said Sam Altman, CEO of OpenAI. “We’re excited to continue working with NVIDIA to advance AI computing.”
DGX supercomputer provides exaflop AI performance
To showcase Blackwell’s capabilities, NVIDIA announced its new DGX supercomputer powered by Blackwell GPUs. A single rack of the new DGX provides an exaflop of AI performance, equivalent to the world’s top 5 supercomputers. With 576 Blackwell GPUs connected as a single system via NVLink, NVIDIA is touting it as an “artificial intelligence factory” for generative AI.
NVIDIA NIM microservices simplify deployment
To make Blackwell’s power accessible, NVIDIA announced dozens of NVIDIA NIM (NVIDIA Inference Microservices) offerings. Built on the NVIDIA CUDA platform, these cloud-native microservices provide optimized inference with industry-standard APIs for more than two dozen popular AI models from NVIDIA and its partners.
NIM microservices come prepackaged with all necessary dependencies, such as CUDA, cuDNN, and TensorRT, eliminating configuration hassles. They deliver fast AI inference in containers thanks to optimized NVIDIA software such as Triton Inference Server.
Developers can easily deploy these microservices on any NVIDIA-accelerated computing platform, from cloud instances to local servers to edge devices. Major cloud providers such as AWS, Azure, and Google Cloud will offer NIM microservices, as will NVIDIA DGX Cloud and NVIDIA-Certified Systems from server vendors.
“Created with our partner ecosystem, these containerized AI microservices are the foundation for businesses in every industry to become AI businesses,” Huang explained. “Established business platforms are sitting on a gold mine of data that can be transformed into generative AI co-pilots.”
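Because NIM endpoints expose industry-standard APIs, calling one looks the same whether the container runs in the cloud, on a local server, or at the edge. The sketch below assumes a hypothetical local deployment and follows the OpenAI-style chat-completion request shape that NIM LLM microservices advertise; the URL and model name are placeholders, not documented values.

```python
import json
import urllib.request

# Hypothetical local NIM endpoint (placeholder URL for illustration).
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Construct an OpenAI-style chat-completion payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_nim(payload: dict, url: str = NIM_URL) -> dict:
    """POST the payload to the microservice and return the parsed JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Build a request; sending it requires a running NIM container.
payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize GTC in one line.")
```

Because the request shape is the standard one, existing client code can often be pointed at a NIM container simply by swapping the base URL.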
Omniverse and CUDA-X microservices accelerate development
In addition to compute platforms and deployment services, NVIDIA announced new SDKs and APIs to accelerate AI development across industries. Omniverse Cloud APIs enable developers to integrate core Omniverse technologies into existing design and simulation applications. These APIs provide physically accurate 3D simulation and visualization capabilities for digital twins.
Industry software giants such as Ansys, Autodesk, Bentley, and Siemens are integrating Omniverse Cloud APIs into their product design and engineering platforms. Omniverse enables users of these tools to seamlessly collaborate on 3D models and apply generative AI to computer-aided engineering workflows.
The company also previewed research efforts applying AI to next-generation wireless networks. “The future convergence of 6G and AI promises a transformative technology landscape,” said Charlie Zang, senior vice president at Samsung Research America. “This will bring seamless connectivity and intelligent systems that will redefine our interactions with the digital world.”
CUDA-X microservices provide end-to-end building blocks for data preparation, training, and deployment for common AI workflows. These include NVIDIA Riva for adaptive speech AI, cuOpt for routing optimization, Earth-2 APIs for global climate simulations, and NeMo Retriever services for knowledge retrieval and language understanding.
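To make the cuOpt entry concrete, the toy below sketches the kind of routing problem that service targets: visiting a set of stops from a depot at minimum travel cost. This is a plain-Python nearest-neighbor heuristic for illustration only; cuOpt’s actual API takes cost matrices through its own service interface and searches far more thoroughly on GPUs.

```python
import math

def nearest_neighbor_route(depot, stops):
    """Greedy tour: from the depot, always visit the closest unvisited stop,
    then return to the depot. Returns the route and its total length."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    route, current, remaining = [depot], depot, list(stops)
    total = 0.0
    while remaining:
        nxt = min(remaining, key=lambda s: dist(current, s))
        total += dist(current, nxt)
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    total += dist(current, depot)  # close the loop back to the depot
    route.append(depot)
    return route, total

route, cost = nearest_neighbor_route((0, 0), [(2, 0), (1, 1), (0, 2)])
```

Greedy heuristics like this get trapped in local optima as fleets and constraints grow, which is why routing at scale is treated as a GPU-accelerated optimization problem.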
SAP partnership brings generative artificial intelligence to enterprises
NVIDIA is bringing generative AI to key industries such as healthcare and life sciences through targeted microservice packages and partnerships. A highlighted collaboration is with business-software leader SAP: the two companies are working to integrate generative AI into SAP’s portfolio of business applications and its SAP AI Core platform.
Using NVIDIA AI Foundation models and NeMo customization tools, SAP will build generative AI assistants embedded across its product lines. These include an AI copilot for its enterprise resource planning suite and AI-enhanced capabilities in its SAP SuccessFactors HR software and SAP Signavio business process intelligence solutions.
“Strategic technology partnerships, like the one between SAP and NVIDIA, are at the heart of our technology investment strategy that maximizes the potential and opportunities of artificial intelligence for business,” said SAP CEO Christian Klein. “NVIDIA’s expertise in delivering AI capabilities at scale will help SAP accelerate the pace of transformation and better serve our cloud customers.”
NVIDIA AI powers next-generation robotics and quantum computing
In robotics, NVIDIA introduced Project GR00T, a foundation model for teaching humanoid robots general-purpose skills. It is paired with the new Jetson Thor robot computer and upgrades to the Isaac robotics platform to create what Huang called “artificial general robotics.”
GR00T aims to enable robots to understand natural language and imitate human actions simply by observing examples. The model takes multimodal inputs, including video, audio, and sensor data, to learn tasks; it can then emit motor-control signals to reproduce those skills in the physical world, honed in a robotics simulator built by NVIDIA.
Finally, for quantum computing, NVIDIA debuted Quantum Cloud, a cloud service based on the open source CUDA-Q platform that enables researchers to develop quantum algorithms and applications. It features powerful new capabilities developed with the quantum ecosystem, including a generative model for quantum machine learning and integrations with QC Ware and Classiq software.
“Quantum computing represents the next revolutionary frontier of computing and will require the world’s most brilliant minds to bring this future one step closer,” said Tim Costa, director of HPC and quantum computing at NVIDIA. “NVIDIA Quantum Cloud is breaking barriers to exploring this transformative technology.”
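CUDA-Q programs express quantum algorithms as kernels that the platform compiles for GPU simulators or quantum hardware. As a plain-Python stand-in for what such a kernel computes, the sketch below simulates the classic two-qubit Bell-state circuit (a Hadamard gate on qubit 0, then a CNOT) on a four-amplitude statevector; real CUDA-Q code would use its own kernel-definition API rather than this hand-rolled simulation.

```python
import math

def bell_state():
    """Simulate H on qubit 0 followed by CNOT(0 -> 1) on a statevector
    over the basis |00>, |01>, |10>, |11>, starting from |00>."""
    state = [1.0, 0.0, 0.0, 0.0]
    # Hadamard on qubit 0 (the left qubit) mixes the amplitude pairs
    # (|0 q1>, |1 q1>) for each value of qubit 1.
    h = 1 / math.sqrt(2)
    for q1 in (0, 1):
        a, b = state[q1], state[2 + q1]
        state[q1], state[2 + q1] = h * (a + b), h * (a - b)
    # CNOT with qubit 0 as control flips qubit 1: swap |10> and |11>.
    state[2], state[3] = state[3], state[2]
    return state

amps = bell_state()
# The entangled result puts equal amplitude on |00> and |11>.
```

Simulating n qubits requires tracking 2^n amplitudes, which is exactly why quantum researchers lean on GPU-accelerated simulation platforms while hardware matures.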
A comprehensive platform simplifies generative AI development
From chips to cloud services to AI microservices, NVIDIA’s GTC announcements show how the company is providing developers with an end-to-end platform to simplify and accelerate the building of cutting-edge generative AI applications across industries. With these new tools, developers can focus on implementing transformative AI innovations faster than ever before.