AMD Instinct MI210 With Single Aldebaran ‘CDNA 2’ GPU Die Features 104 Compute Units & 64 GB HBM2E Memory, 40% Faster Than MI100

Hassan Mujtaba
AMD Instinct MI210 To Feature A Single Aldebaran 'CDNA 2' GPU Compute Die With 6656 Cores & 64 GB HBM2E Memory

AMD has more Instinct MI200 series cards on the way for the HPC segment based on its brand new Aldebaran CDNA 2 GPU architecture. The latest card that's being talked about is the Instinct MI210 which features a single graphics compute die.

With the Instinct MI250X and MI250, AMD brought MCM technology to the data center and HPC segment. Based on its new CDNA 2 architecture, the new Aldebaran GPU offers immense power aimed at HPC and Data Center workloads. But there are more MI200 series cards on the horizon and the MI210 is one of them.

The details come from George Markomanolis, lead HPC scientist at CSC and an engineer working on the upcoming LUMI supercomputer, who got remote access to an AMD Instinct MI210. George has shared that the Instinct MI210 features a single GCD, which makes it a completely new SKU rather than a package carrying both GCD dies. The single GCD has 104 of the Aldebaran die's 128 CUs enabled. Even the higher-end MI250X has just 110 CUs enabled per die, which works out to 7040 stream processors per die (14,080 across both dies). The MI210 houses 6656 stream processors.
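The core counts above follow from simple arithmetic. The sketch below is only an illustration: it assumes 64 stream processors per compute unit, which is the ratio implied by the figures quoted here (6656 / 104 and 7040 / 110), and the function name is hypothetical.

```python
# Assumption: 64 stream processors per compute unit (CU), the ratio implied
# by the article's own figures (6656 / 104 = 64 and 7040 / 110 = 64).
STREAM_PROCESSORS_PER_CU = 64


def stream_processors(enabled_cus: int, dies: int = 1) -> int:
    """Return the stream processor count for a given number of enabled CUs per die."""
    return enabled_cus * STREAM_PROCESSORS_PER_CU * dies


if __name__ == "__main__":
    print("MI210 (1 GCD, 104 CUs):", stream_processors(104))
    print("MI250X per die (110 CUs):", stream_processors(110))
    print("MI250X total (2 GCDs):", stream_processors(110, dies=2))
    print("Full Aldebaran die (128 CUs):", stream_processors(128))
```

On these assumptions the MI210 has about 81% of a full Aldebaran die enabled (104 of 128 CUs).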

In addition to the core count, the AMD Instinct MI210 also carries 64 GB of HBM2e memory, which is half the capacity of the Instinct MI250X but twice that of the Instinct MI100, which was the flagship until the MI250 series replaced it just a few months ago. We don't have exact FLOPs figures for this card but, assuming it is clocked around the same 1700 MHz as the Instinct MI250 accelerators, we are looking at around 22-23 TFLOPs of FP64 and 44-46 TFLOPs of FP32 compute. This should give some heated competition to the NVIDIA A100, which isn't expected to get an update until GTC next year.
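The compute estimate can be reproduced with a back-of-the-envelope calculation. This is a rough sketch, not AMD's official method: it assumes the speculated 1700 MHz clock, 2 FP64 operations per stream processor per clock (one fused multiply-add), and twice that rate for FP32, which is the factor needed to match the 44-46 TFLOPs range quoted above. The function and parameter names are hypothetical.

```python
def peak_tflops(stream_processors: int, clock_mhz: float,
                flops_per_sp_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPs.

    Assumption: every stream processor retires `flops_per_sp_per_clock`
    floating-point operations each cycle (2 = one fused multiply-add).
    """
    flops = stream_processors * flops_per_sp_per_clock * clock_mhz * 1e6
    return flops / 1e12


if __name__ == "__main__":
    cores = 6656        # MI210 stream processors reported in the article
    clock_mhz = 1700    # assumed, same as the MI250 series

    fp64 = peak_tflops(cores, clock_mhz, flops_per_sp_per_clock=2)
    # Assumption: FP32 runs at twice the FP64 rate (4 ops per SP per clock).
    fp32 = peak_tflops(cores, clock_mhz, flops_per_sp_per_clock=4)

    print(f"FP64: {fp64:.1f} TFLOPs")
    print(f"FP32: {fp32:.1f} TFLOPs")
```

Under these assumptions the results land at roughly 22.6 and 45.3 TFLOPs, inside the ranges estimated above. A different shipping clock would scale both numbers linearly.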

George has also shared that the AMD Instinct MI210 is around 40% faster than the Instinct MI100 in BabelStream with HIP. Given the cut-down specifications, we can expect the TDP to fall around 300-350W. And since this is a single-GCD accelerator, we also expect a 4096-bit bus interface at 3.2 Gbps pin speeds for a total of roughly 1.6 TB/s of bandwidth. The MI210 accelerator should launch in both OAM and PCIe form factors and will start shipping to priority HPC customers and partners soon.
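The bandwidth figure comes from multiplying the bus width by the per-pin data rate and converting bits to bytes. The sketch below illustrates that arithmetic using the expected (not confirmed) 4096-bit bus and 3.2 Gbps pin speed; the function name is hypothetical.

```python
def memory_bandwidth_gbs(bus_width_bits: int, pin_speed_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * pin_speed_gbps / 8


if __name__ == "__main__":
    # Expected MI210 configuration: 4096-bit bus at 3.2 Gbps per pin.
    single_gcd = memory_bandwidth_gbs(4096, 3.2)
    # Dual-GCD MI250/MI250X configuration: 8192-bit bus at 3.2 Gbps per pin.
    dual_gcd = memory_bandwidth_gbs(8192, 3.2)

    print(f"Single GCD: {single_gcd / 1000:.1f} TB/s")
    print(f"Dual GCD:   {dual_gcd / 1000:.1f} TB/s")
```

The single-GCD result is about 1638 GB/s, which rounds to the 1.6 TB/s quoted above, and doubling the bus width gives the roughly 3.2 TB/s listed for the MI250 series.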

AMD Radeon Instinct Accelerators

| Accelerator Name | AMD Instinct MI400 | AMD Instinct MI350X | AMD Instinct MI300X | AMD Instinct MI300A | AMD Instinct MI250X | AMD Instinct MI250 | AMD Instinct MI210 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPU Architecture | Zen 5 (Exascale APU) | N/A | N/A | Zen 4 (Exascale APU) | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| GPU Architecture | CDNA 4 | CDNA 3+? | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | 4nm | 4nm | 5nm+6nm | 5nm+6nm | 6nm | 6nm | 6nm | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Chiplets | TBD | TBD | 8 (MCM) | 8 (MCM) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | TBD | TBD | 19,456 | 14,592 | 14,080 | 13,312 | 6656 | 7680 | 4096 | 3840 | 4096 | 4096 | 2304 |
| GPU Clock Speed | TBD | TBD | 2100 MHz | 2100 MHz | 1700 MHz | 1700 MHz | 1700 MHz | 1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| INT8 Compute | TBD | TBD | 2614 TOPS | 1961 TOPS | 383 TOPS | 362 TOPS | 181 TOPS | 92.3 TOPS | N/A | N/A | N/A | N/A | N/A |
| FP16 Compute | TBD | TBD | 1.3 PFLOPs | 980.6 TFLOPs | 383 TFLOPs | 362 TFLOPs | 181 TFLOPs | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBD | TBD | 163.4 TFLOPs | 122.6 TFLOPs | 95.7 TFLOPs | 90.5 TFLOPs | 45.3 TFLOPs | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBD | TBD | 81.7 TFLOPs | 61.3 TFLOPs | 47.9 TFLOPs | 45.3 TFLOPs | 22.6 TFLOPs | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBD | HBM3e | 192 GB HBM3 | 128 GB HBM3 | 128 GB HBM2e | 128 GB HBM2e | 64 GB HBM2e | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Infinity Cache | TBD | TBD | 256 MB | 256 MB | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Memory Clock | TBD | TBD | 5.2 Gbps | 5.2 Gbps | 3.2 Gbps | 3.2 Gbps | 3.2 Gbps | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBD | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBD | TBD | 5.3 TB/s | 5.3 TB/s | 3.2 TB/s | 3.2 TB/s | 1.6 TB/s | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBD | TBD | OAM | APU SH5 Socket | OAM | OAM | Dual Slot Card | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBD | TBD | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | TBD | 750W | 760W | 560W | 500W | 300W | 300W | 300W | 300W | 300W | 175W | 150W |

News Source: Tom's Hardware
