Nvidia is extending its solution footprint far beyond artificial intelligence (AI) and gaming, venturing across the computing ecosystem into mobility and the next-generation cloud data centre.
Demonstrating that it is putting substantial research and development (R&D) dollars behind this vision, Nvidia used this month's virtual GPU Technology Conference (GTC), at which chief executive Jensen Huang announced the rollout of the vendor's new BlueField 'data processing unit' (DPU) chip architecture.
Accelerating diverse workloads through programmable CPU offload
Strategically, the BlueField DPU builds on two of Nvidia's boldest recent bets: the new hardware runs on Arm's CPU architecture, and it incorporates the high-speed interconnect technology that Nvidia gained through its recent acquisition of Mellanox.
Marking the company's evolution beyond a GPU-centric product architecture, Nvidia's new DPU is a high-performance, multicore system on chip (SoC). BlueField DPUs incorporate software-programmable data-processing engines that can accelerate a wide range of AI, networking, virtualisation, security, storage and other enterprise workloads.
As the foundation of server-based intelligent network interface controllers, DPUs offload workloads from CPUs while efficiently parsing, processing, and transferring high volumes of data at line speeds.
In addition to their CPU-offload acceleration benefits, Nvidia’s DPUs can strengthen data centre security because the Arm cores embedded within them provide an added level of isolation between security services and CPU-executed applications.
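The division of labour described above can be sketched in miniature. The class and method names below are hypothetical illustrations of the offload pattern, not any real DPU API: the "DPU" absorbs the per-packet parsing and validation work, so the host loop only ever touches clean application payloads.

```python
import struct

class HypotheticalDpu:
    """Stand-in for a DPU's packet-processing engine (illustrative only)."""

    # Toy frame header: source port, destination port, payload length
    HEADER = struct.Struct("!HHI")

    def parse(self, frame: bytes):
        """Parse and validate one frame, returning (dst_port, payload)."""
        src, dst, length = self.HEADER.unpack_from(frame)
        payload = frame[self.HEADER.size:self.HEADER.size + length]
        if len(payload) != length:   # malformed frames dropped "in hardware"
            return None
        return dst, payload

def host_application(frames, dpu):
    """The host CPU sees only validated payloads; parsing ran on the DPU."""
    delivered = {}
    for frame in frames:
        result = dpu.parse(frame)    # the offloaded step
        if result is not None:
            dst, payload = result
            delivered.setdefault(dst, []).append(payload)
    return delivered

frames = [
    HypotheticalDpu.HEADER.pack(1000, 80, 5) + b"hello",
    HypotheticalDpu.HEADER.pack(1000, 80, 9) + b"short",  # bad length: dropped
]
print(host_application(frames, HypotheticalDpu()))  # → {80: [b'hello']}
```

In a real deployment the parsing, validation and steering would run in BlueField's hardware engines at line rate, which is where the claimed savings in host CPU cycles come from.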
At this latest GTC, Nvidia announced the first versions of the new DPU SoC family, starting with the Nvidia BlueField-2.
Due to be included in new systems from Nvidia server hardware partners in 2021, this architecture features all capabilities of the Nvidia Mellanox ConnectX-6 Dx SmartNIC. It incorporates programmable Arm cores, supports data transfer rates of 200Gbps, and provides hardware offloads to accelerate key data centre tasks.
Furthermore, it speeds up security, networking and storage tasks, including isolation, root of trust, key management, RDMA and RDMA over Converged Ethernet (RoCE), GPUDirect, elastic block storage, and data compression.
The offering also includes a controller for managing high-performance back-end non-volatile memory express (NVMe) storage, all-flash arrays and hyperconverged systems. A single BlueField-2 DPU can offload data centre workloads that would otherwise consume as many as 125 CPU cores, thereby freeing up cycles to process other enterprise applications.
Meanwhile, Nvidia BlueField-2X is under development and due to become available in 2021, adding an Nvidia Ampere architecture GPU to BlueField-2 for in-network computing with CUDA and Nvidia AI.
The solution includes all the key features of BlueField-2 and leverages Nvidia's third-generation Tensor Cores for real-time AI-driven security analytics. It can identify abnormal traffic that could indicate theft of confidential data, perform encrypted-traffic analytics at line rate, and introspect hosts to identify malicious activity and automatically trigger security features and automated responses.
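To make the idea of flagging abnormal traffic concrete, here is a deliberately simple sketch: a z-score test over per-interval byte counts. Nvidia's actual analytics run AI models on Tensor Cores at line rate; this toy statistical version merely illustrates the concept of anomaly flagging.

```python
from statistics import mean, stdev

def abnormal_intervals(byte_counts, threshold=3.0):
    """Return indices of intervals whose byte count deviates from the
    mean by more than `threshold` standard deviations."""
    mu, sigma = mean(byte_counts), stdev(byte_counts)
    if sigma == 0:
        return []  # perfectly uniform traffic: nothing to flag
    return [i for i, b in enumerate(byte_counts)
            if abs(b - mu) / sigma > threshold]

# One exfiltration-like burst hidden in otherwise steady traffic
traffic = [1_000, 1_100, 950, 1_050, 980, 60_000, 1_020]  # bytes/interval
print(abnormal_intervals(traffic, threshold=2.0))  # → [5]
```

A production system would of course use learned models over many features rather than a single univariate threshold, but the workflow (observe traffic, score it, trigger an automated response on outliers) is the same.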
Nvidia also announced that it will launch next-generation BlueField-3 and BlueField-3X DPUs in 2022, followed by BlueField-4 in 2023. In that latter generation, Nvidia will integrate the GPU and Arm cores at the silicon level. The company promised that BlueField-4 will boost the DPU's processing speed 1,000 times beyond BlueField-2X and 600 times beyond BlueField-3X.
Building a robust ecosystem around the DPU accelerator architecture
As it evolves its hardware platform into a DPU-centric architecture in support of new enterprise applications, Nvidia is also making sure that it fully integrates its BlueField/DOCA accelerators into the Arm partner ecosystem.
Signalling that strategy at GTC, the vendor announced that it will help Arm partners go to market with full-stack solution platforms that consist of GPU-enabled as well as DPU-enabled networking, storage and security technologies.
It has engaged Arm partners to create full-stack solutions for high-performance computing, cloud, edge and PC opportunities. Also, it is porting its AI and RTX engines to Arm, so that they address a much larger market than the x86 platforms on which Nvidia has traditionally run.
Partners are essential to Nvidia's plans to support a wider range of enterprise application workloads than just AI on its new DPU product family. Integral to Nvidia's land-and-expand strategy is DOCA (data-centre-infrastructure-on-a-chip architecture), a new software development kit for the DPU.
Currently available only to early-access partners, the DOCA SDK enables developers to program BlueField-accelerated data centre infrastructure services, offloading CPU workloads to BlueField DPUs.
Consequently, this new offering builds out Nvidia’s enterprise developer tools, complementing the CUDA programming model that enables development of GPU-accelerated applications. In addition, the SDK is fully integrated into the Nvidia NGC catalog of containerised software, thereby encouraging third-party application providers to develop, certify, and distribute DPU-accelerated applications.
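The programming model this implies can be sketched as follows. All of the names below are invented for illustration and are not DOCA's actual API; the point is the pattern the SDK enables: the same service logic dispatches to DPU hardware when it is present and falls back to the host CPU otherwise.

```python
import zlib

def compress(data: bytes) -> bytes:
    """An example offloadable service: data compression."""
    return zlib.compress(data)

class OffloadRuntime:
    """Hypothetical runtime that routes a service to a DPU if one exists."""

    def __init__(self, dpu_present: bool):
        self.dpu_present = dpu_present

    def run(self, service, data):
        if self.dpu_present:
            # Real DOCA code would dispatch this to a BlueField hardware
            # engine; here we just record the execution target.
            return "dpu", service(data)
        return "cpu", service(data)

target, out = OffloadRuntime(dpu_present=True).run(compress, b"x" * 1024)
print(target, len(out) < 1024)
```

The design intent echoed here is CUDA's: application code targets an abstract accelerator, and the runtime decides where the work actually lands.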
Several leading software vendors - such as VMware, Red Hat, Canonical and Check Point Software Technologies - announced plans at GTC to integrate their wares with the new DPU/DOCA acceleration architecture in the coming year.
In addition, Nvidia announced that several leading server manufacturers, including Asus, Atos, Dell Technologies, Fujitsu, Gigabyte, H3C, Inspur, Lenovo, Quanta/QCT and Supermicro, plan to integrate the DPU into their respective products in the same timeframe.
There was no specific Arm tie-in to Huang's announcement that Microsoft is adopting Nvidia AI on Azure to bring GPU-accelerated smart experiences to cloud-based Microsoft Office. Even so, it would not be surprising if, in coming years, more of the mobile experience in Office and other apps were accelerated by leveraging DPU-offload technology in the cloud.
Enabling Nvidia solutions to lessen their dependency on GPU-centric functionality
Nvidia's product teams are wasting no time incorporating the DPUs' CPU-offload acceleration into their solutions. Most notably, Huang announced that the Nvidia EGX AI edge-server platform is evolving to combine the Nvidia Ampere architecture GPU and BlueField-2 DPU on a single PCIe card.
Although there was no specific BlueField DPU tie-in to Nvidia Jetson, the company’s Arm-based SoC for AI robotics, one should expect that the DOCA SDK will advance to support development of these applications, which are a hot growth field for Nvidia’s core platforms.
It's also a safe bet that the company will use its new hardware and SDK to accelerate its Omniverse platform for collaborative 3-D content production, its Jarvis platform for conversational AI, and its new Maxine platform for cloud-native, AI-accelerated video streaming.
Nvidia’s new BlueField DPU architecture and DOCA SDK provide a strategic platform for broadening its reach into enterprise, service provider, and consumer opportunities of all types.
By enabling hardware-accelerated CPU-offload of diverse workloads, the DPU architecture provides Nvidia with a clear path for converging the new DOCA programming models with its CUDA AI development framework and NGC catalog of containerised cloud solutions.
This will enable the company to provide both its own product teams and solution partners with the hardware and software platforms needed to accelerate a full range of application and infrastructure workloads from cloud to edge.
As it awaits the eventual approval of its proposed acquisition of Arm, Nvidia will need to prove this new architecture to its existing partner ecosystem.
If DPU technology falls short of Nvidia's aggressive performance promises, that deficiency could sour relations with Arm's vast array of licensees, all of whom rely heavily on Arm's CPU architecture and stand to benefit from more seamless integration with Nvidia's market-leading AI technology.
Clearly Nvidia cannot afford to lose momentum in the cloud-to-edge microprocessor wars just when it has begun to pull away from arch-rival and CPU-powerhouse Intel.