Nvidia speeds AI, climate modeling
- 24 March, 2022 11:15
Jensen Huang (Nvidia)
It’s been years since developers found that Nvidia’s main product, the GPU, was useful not just for rendering video games but also for high-performance computing of the kind used in 3D modelling, weather forecasting, or the training of AI models—and it’s on enterprise applications such as those that CEO Jensen Huang will focus his attention at the company’s GTC 2022 conference.
Nvidia is hoping to make it easier for CIOs to build digital twins and machine learning models, to secure enterprise computing, and even to speed the adoption of quantum computing, with a range of new hardware and software.
Digital twins are numerical models that reflect changes in real-world objects. They are useful in design, manufacturing, and service creation, and they vary in their level of detail.
For some applications, a simple database may suffice to record a product’s service history (when it was made, who it shipped to, what modifications have been applied), while others require a full-on 3D model incorporating real-time sensor data that can be used, for example, to provide advance warning of component failure or of rain. It’s at the high end of that range that Nvidia plays.
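At the simple end of that range, a digital twin can be little more than a structured record of a product's service history. The Python sketch below is purely illustrative: the class and field names (`ServiceRecord`, `serial_number`, and so on) and the sample values are assumptions made for this example, not anything taken from Nvidia or its products.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ServiceRecord:
    """Hypothetical minimal 'digital twin': one product's service history."""
    serial_number: str
    manufactured_on: date
    shipped_to: str
    modifications: List[str] = field(default_factory=list)

    def apply_modification(self, description: str) -> None:
        # Log a change made to the physical product so the record stays in sync.
        self.modifications.append(description)


# Example with made-up values.
twin = ServiceRecord("SN-0001", date(2021, 6, 1), "Example Customer Ltd")
twin.apply_modification("firmware update")
print(twin.modifications)
```

A high-end twin would replace this static record with a 3D model fed by live sensor data, which is where the heavier computing requirements come from.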
At GTC 2022, the company announced new tools for building digital twins for scientific and engineering applications. Two groups of researchers are already using Nvidia’s Modulus AI framework for developing physics machine learning models and its Omniverse 3D virtual world simulation platform to forecast the weather with greater confidence and speed, and to optimise the design of wind farms.
Engineers at Siemens Gamesa Renewable Energy are using the Modulus-Omniverse combination to model the placement of wind turbines in relation to one another to maximise power generation and reduce the effects of the turbulence generated by a turbine on its neighbours.
While the Siemens Gamesa model looks at the effects of wind on a zone a few kilometres across, the ambitions of researchers working on FourCastNet are much greater.
FourCastNet (named for the Fourier neural operators used in its calculations) is a weather forecasting tool trained on 10 terabytes of data. It emulates and predicts extreme weather events such as hurricanes or atmospheric rivers like those that brought flooding to the Pacific Northwest and to Sydney, Australia, in early March. Nvidia claims it can do so up to 45,000 times faster than traditional numerical prediction models.
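The "Fourier" in the name refers to layers that move data into the frequency domain, reweight a limited number of frequency modes, and transform back. The sketch below shows that general idea in one dimension using only the Python standard library. It is not FourCastNet's code: the function names, the naive transform, and the hand-picked weights are all assumptions for illustration, and a real model would learn its weights and run an optimised FFT on a GPU.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); fine for a tiny demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_layer(signal, weights):
    """Scale the lowest len(weights) frequency modes and zero out the rest.

    In a trained model the weights would be learned; here they are given by hand.
    """
    spectrum = dft(signal)
    n = len(spectrum)
    if len(weights) > n // 2 + 1:
        raise ValueError("too many modes for a signal of this length")
    filtered = [0j] * n
    for k, w in enumerate(weights):
        filtered[k] = spectrum[k] * w
        if 0 < k < n - k:
            # Mirror onto the matching negative frequency so the output stays real.
            filtered[n - k] = spectrum[n - k] * w.conjugate()
    return [value.real for value in inverse_dft(filtered)]


# Keeping only mode 0 leaves just the mean of the signal.
smoothed = spectral_layer([1.0, 2.0, 3.0, 4.0], [1.0])
print(smoothed)
```

Because the layer works on whole frequency modes rather than individual grid points, one pass can capture patterns that span the entire input, which suits global fields such as weather data.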
The system is a first step towards delivering a still more ambitious project that Nvidia calls Earth-2. Nvidia announced in November 2021 that it plans to build a supercomputer using its own chips and use it to create a digital twin of the Earth at 1-metre resolution in its Omniverse software to model the effects of climate change.
To help other enterprises build and maintain their own digital twins, later this year Nvidia will offer OVX computing systems running its Omniverse software on racks loaded with its GPUs, storage, and high-speed switch fabric.
Nvidia is also introducing Omniverse Cloud to allow creators, designers, and developers to collaborate on 3D designs without needing access to dedicated high-performance computing power of their own. This gives CIOs a way to temporarily expand their use of the technology without major capital investment.
And it’s teaming up with robotics makers and data providers to increase the number of Omniverse connectors developers can use to help their digital twins better reflect and interact with the real world.
It’s already working with retailers Kroger and Lowe’s, which are using Omniverse to simulate their stores and the logistics chains that supply them.
Machine learning models can be computationally intensive to run, but are even more so to train, as the process requires a system that can crunch through complex calculations on large volumes of data. At GTC 2022, Nvidia is introducing a new GPU architecture, Hopper, designed to speed up such tasks, and showing off the first chip based on it, the H100.
Nvidia said the chip will make it possible to run large language models and recommender systems, increasingly common in enterprise applications, in real time, and includes new instructions that can speed up route optimisation and genomics applications.
The ability to segment the GPU into multiple instances—much like virtual machines in a CPU—will also make it useful for running several smaller applications, on premises or in the cloud.
Compared to scientific modelling, training AI models requires less mathematical precision, but greater data throughput, and the H100’s design allows applications to trade one off against the other. The result, says Nvidia, is that systems built with the H100 will be able to train models nine times faster than those using its predecessor, the A100.
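The trade-off between precision and throughput can be seen with ordinary floating-point formats. This sketch uses Python's standard `struct` module to store the same value at 64-bit, 32-bit, and 16-bit precision. The helper name `round_trip` is made up for this example, and the formats are generic IEEE 754 types rather than anything specific to the H100.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
for name, fmt in [("float64", "<d"), ("float32", "<f"), ("float16", "<e")]:
    size = struct.calcsize(fmt)  # bytes needed per number
    error = abs(round_trip(value, fmt) - value)
    print(f"{name}: {size} bytes, rounding error {error:.2e}")
```

Halving the bytes per number lets the same memory bandwidth move twice as many values, at the cost of a larger rounding error. AI training can often tolerate that error, while scientific modelling frequently cannot.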
Nvidia said its new H100 chips will also enable it to extend confidential computing capabilities to the GPU, a feature hitherto only available on CPUs. Confidential computing enables enterprises to safely process health or financial data in the secure enclave of a specially designed processor, decrypting it on arrival and encrypting the results before they are sent to storage.
The option to securely process such data on a GPU, even in a public cloud or a colocation facility, could enable enterprises to speed up the development and use of machine learning models without scaling up capital spending.
Quantum to come
Quantum computing promises, or perhaps threatens, to sweep away large swathes of today’s high-performance computing market with quantum processors that exploit subatomic phenomena to solve hitherto intractable optimisation problems.
When that day comes, Nvidia’s sales to the supercomputing market may take a hit, but in the meantime its chips and software are playing a role in the simulation of quantum computing systems.
Researchers at the intersection of quantum and classical computing have created a low-level machine language called the Quantum Intermediate Representation. Nvidia has developed a compiler for this language, nvq++, that will first be used by researchers at Oak Ridge National Laboratory, and an SDK for accelerating quantum workflows, cuQuantum, which is available as a container optimised to run on its A100 GPU.
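Simulating a quantum computer on classical hardware means tracking one complex amplitude for every possible state, so memory doubles with each added qubit. The pure-Python sketch below prepares a two-qubit entangled state to show the bookkeeping involved. It does not use cuQuantum or nvq++, and the function names are assumptions for illustration only.

```python
import math


def apply_hadamard(state, target):
    """Apply a Hadamard gate to qubit `target` of a state vector."""
    s = 1 / math.sqrt(2)
    new = list(state)
    for i in range(len(state)):
        if not (i >> target) & 1:      # index i has the target bit set to 0
            j = i | (1 << target)      # partner index with the target bit set to 1
            new[i] = s * (state[i] + state[j])
            new[j] = s * (state[i] - state[j])
    return new


def apply_cnot(state, control, target):
    """Flip qubit `target` wherever qubit `control` is 1."""
    new = list(state)
    for i in range(len(state)):
        if (i >> control) & 1 and not (i >> target) & 1:
            j = i | (1 << target)
            new[i], new[j] = state[j], state[i]
    return new


n_qubits = 2
state = [0j] * (2 ** n_qubits)
state[0] = 1 + 0j                      # start with both qubits in state 0
state = apply_hadamard(state, 0)
state = apply_cnot(state, 0, 1)
probabilities = [abs(amplitude) ** 2 for amplitude in state]
print(probabilities)
```

The two qubits end up measured as 00 or 11 with equal probability and never as 01 or 10. A state vector for n qubits holds 2^n amplitudes, so at 16 bytes per double-precision complex amplitude, 30 qubits already need about 16 GiB, which is why this kind of simulation is a natural fit for GPU memory and bandwidth.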
These tools could be useful to CIOs unsure what advantages quantum computing will offer their businesses—or to those already sure and wanting to help their developers build a quantum skillset at a time when real quantum computers are still laboratory curiosities.