NVIDIA A100 TENSOR CORE GPU
Unprecedented Acceleration at Every Scale
Our GPU instances are designed to meet the growing demand for high-performance computing (HPC), artificial intelligence (AI), machine learning (ML), and other GPU-intensive workloads. Equipped with cutting-edge GPU hardware from Intel, AMD, and NVIDIA, our datacenter offers unparalleled processing power, scalability, and flexibility. Here’s an overview of the capabilities, features, and benefits that set our GPU datacenter apart.
The Most Powerful Compute Platform for Every Workload
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale to power the world’s highest-performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. As the engine of the NVIDIA data center platform, A100 provides up to 20X higher performance over the prior NVIDIA Volta™ generation. A100 can efficiently scale up or be partitioned into seven isolated GPU instances with Multi-Instance GPU (MIG), providing a unified platform that enables elastic data centers to dynamically adjust to shifting workload demands.
NVIDIA A100 Tensor Core technology supports a broad range of math precisions, providing a single accelerator for every workload. The latest generation A100 80GB doubles GPU memory and debuts the world’s fastest memory bandwidth at 2 terabytes per second (TB/s), speeding time to solution for the largest models and most massive datasets.
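To illustrate how an application might tap these math precisions, here is a minimal sketch of one mixed-precision training step using PyTorch automatic mixed precision; the model, batch size, and optimizer settings are illustrative placeholders, and a CUDA-capable PyTorch build is assumed.

    # Minimal sketch: one mixed-precision training step with PyTorch AMP on an A100-class GPU.
    # Model, data, and hyperparameters are illustrative placeholders.
    import torch
    import torch.nn as nn

    device = torch.device("cuda")
    model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler()          # scales the loss to keep FP16 gradients in range

    inputs = torch.randn(256, 1024, device=device)
    targets = torch.randint(0, 10, (256,), device=device)

    with torch.cuda.amp.autocast():               # matmuls run in reduced precision on Tensor Cores
        loss = nn.functional.cross_entropy(model(inputs), targets)

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()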
A100 is part of the complete NVIDIA data center solution that incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA NGC™ catalog. Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to deliver real-world results and deploy solutions into production at scale.
Incredible Performance Across Workloads
Groundbreaking Innovations
NVIDIA AMPERE ARCHITECTURE
Whether using MIG to partition an A100 GPU into smaller instances or NVLink to connect multiple GPUs to speed large-scale workloads, A100 can readily handle different-sized acceleration needs, from the smallest job to the biggest multi-node workload. A100’s versatility means IT managers can maximize the utility of every GPU in their data center, around the clock.
THIRD-GENERATION TENSOR CORES
NVIDIA A100 delivers 312 teraFLOPS (TFLOPS) of deep learning performance. That’s 20X the Tensor floating-point operations per second (FLOPS) for deep learning training and 20X the Tensor tera operations per second (TOPS) for deep learning inference compared to NVIDIA Volta GPUs.
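In frameworks such as PyTorch, TF32 Tensor Core math for FP32 workloads can be opted into with two flags; a minimal sketch, assuming a CUDA-capable PyTorch build and arbitrary matrix sizes:

    # Minimal sketch: opting in to TF32 Tensor Core math for matmuls and cuDNN convolutions.
    import torch

    torch.backends.cuda.matmul.allow_tf32 = True    # use TF32 for float32 matrix multiplies
    torch.backends.cudnn.allow_tf32 = True          # use TF32 inside cuDNN convolutions

    a = torch.randn(8192, 8192, device="cuda")
    b = torch.randn(8192, 8192, device="cuda")
    c = a @ b                                       # float32 matmul executed on Tensor Cores in TF32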
HIGH-BANDWIDTH MEMORY (HBM2E)
With up to 80 gigabytes of HBM2e, A100 delivers the world’s fastest GPU memory bandwidth of over 2 TB/s, as well as a dynamic random-access memory (DRAM) utilization efficiency of 95%. A100 delivers 1.7X higher memory bandwidth over the previous generation.
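One rough way to observe delivered bandwidth is to time a large device-to-device copy and divide the bytes moved by the elapsed time; the sketch below is a naive PyTorch microbenchmark with an arbitrary tensor size, not an official measurement tool.

    # Naive sketch: estimate achieved GPU memory bandwidth from a large device-to-device copy.
    import torch

    n = 1 << 28                                    # 2^28 float32 elements, roughly 1 GiB per buffer
    src = torch.randn(n, device="cuda")
    dst = torch.empty_like(src)

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    dst.copy_(src)                                 # reads one buffer and writes the other
    end.record()
    torch.cuda.synchronize()

    seconds = start.elapsed_time(end) / 1000.0     # elapsed_time returns milliseconds
    bytes_moved = 2 * src.numel() * src.element_size()
    print(f"~{bytes_moved / seconds / 1e9:.0f} GB/s achieved")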
NEXT-GENERATION NVLINK
NVIDIA NVLink in A100 delivers 2X higher throughput compared to the previous generation. When combined with NVIDIA NVSwitch™, up to 16 A100 GPUs can be interconnected at up to 600 gigabytes per second (GB/sec), unleashing the highest application performance possible on a single server. NVLink is available in A100 SXM GPUs via HGX A100 server boards and in PCIe GPUs via an NVLink Bridge for up to 2 GPUs.
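GPU-to-GPU traffic over NVLink and NVSwitch is typically driven through NCCL collectives; the sketch below assumes a single-node PyTorch environment launched with torchrun and simply all-reduces one tensor across the local GPUs.

    # Minimal sketch: NCCL all-reduce across GPUs (NCCL routes traffic over NVLink/NVSwitch
    # where available). Run with: torchrun --nproc_per_node=<num_gpus> this_script.py
    import os
    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="nccl")                 # one process per GPU, set up by torchrun
    local_rank = int(os.environ["LOCAL_RANK"])               # LOCAL_RANK is provided by torchrun
    torch.cuda.set_device(local_rank)

    x = torch.ones(1 << 20, device="cuda") * dist.get_rank() # each rank contributes its own values
    dist.all_reduce(x, op=dist.ReduceOp.SUM)                  # summed across all GPUs in one collective

    if dist.get_rank() == 0:
        print("sum over ranks:", x[0].item())
    dist.destroy_process_group()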
MULTI-INSTANCE GPU (MIG)
An A100 GPU can be partitioned into as many as seven GPU instances, fully isolated at the hardware level with their own high-bandwidth memory, cache, and compute cores. MIG gives developers access to breakthrough acceleration for all their applications, and IT administrators can offer right-sized GPU acceleration for every job, optimizing utilization and expanding access to every user and application.
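Once an administrator has created MIG instances (for example with nvidia-smi), a job can be confined to a single instance by exposing only that instance's UUID to the process; in this sketch the UUID is a hypothetical placeholder to be replaced with a value reported by nvidia-smi -L.

    # Minimal sketch: pinning a process to one MIG instance via CUDA_VISIBLE_DEVICES.
    # The MIG UUID below is a hypothetical placeholder; list real ones with: nvidia-smi -L
    import os
    os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

    import torch                                    # import after setting the environment variable
    print(torch.cuda.device_count())                # the process now sees exactly one GPU instance
    print(torch.cuda.get_device_name(0))            # reports the A100 the MIG slice belongs to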
STRUCTURAL SPARSITY
AI networks have millions to billions of parameters. Not all of these parameters are needed for accurate predictions, and some can be converted to zeros, making the models “sparse” without compromising accuracy. Tensor Cores in A100 can provide up to 2X higher performance for sparse models. While the sparsity feature more readily benefits AI inference, it can also improve the performance of model training.
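To make the 2:4 pattern behind this feature concrete, the sketch below zeroes the two smallest-magnitude weights in every group of four; it is a conceptual pruning step in plain PyTorch, not NVIDIA's sparsity tooling, and real workflows typically fine-tune afterwards to preserve accuracy.

    # Conceptual sketch: prune a weight matrix to the 2:4 pattern (2 nonzeros per group of 4)
    # that A100 sparse Tensor Cores accelerate.
    import torch

    def prune_2_to_4(weight: torch.Tensor) -> torch.Tensor:
        groups = weight.reshape(-1, 4)                       # view the weights in groups of 4
        idx = groups.abs().argsort(dim=1)[:, :2]             # two smallest-magnitude entries per group
        mask = torch.ones_like(groups)
        mask.scatter_(1, idx, 0.0)                           # zero them out
        return (groups * mask).reshape(weight.shape)

    w = torch.randn(128, 256)
    w_sparse = prune_2_to_4(w)
    # About half of the weights are now zero, two in every contiguous group of four.
    print("zero fraction:", (w_sparse == 0).float().mean().item())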
The NVIDIA A100 Tensor Core GPU is the flagship product of the NVIDIA data center platform for deep learning, HPC, and data analytics. The platform accelerates over 2,000 applications, including every major deep learning framework. A100 is available everywhere, from desktops to servers to cloud services, delivering both dramatic performance gains and cost-saving opportunities.
Optimized Software and Services For Enterprise
Every Deep Learning Framework
- MXNet
- PyTorch
- Apache Spark
- TensorFlow
2,000+ GPU-ACCELERATED APPLICATIONS
- Altair nanoFluidX
- DS SIMULIA Abaqus
- OpenFOAM
- Altair ultraFluidX
- GAUSSIAN
- VASP
- AMBER
- GROMACS
- WRF
- ANSYS Fluent
- NAMD
Ready to start your digital journey?
We provide tailored infrastructure solutions that are readily available and can be provisioned on demand.