NVIDIA Details Ada Lovelace GPU Block Diagram, Streaming Multi-Processor, DLSS 3 & GeForce RTX 40 Founders Edition Cooler

Hassan Mujtaba

During its press tech talk, NVIDIA discussed several technologies behind the upcoming GeForce RTX 40 graphics cards based on the Ada Lovelace GPUs. The highlights included the Ada Lovelace GPU itself, the latest DLSS 3 technology, and the coolers featured on the brand new Founders Edition models.

NVIDIA Further Details Ada Lovelace GPUs, DLSS 3, GeForce RTX 40 Graphics Cards & More

NVIDIA will be launching its first GeForce RTX 40 series graphics card, the RTX 4090, on the 12th of October, followed by the RTX 4080 series in November. There's a lot to talk about, so let's get started.

NVIDIA's AD102 'Ada Lovelace' GPU - The Next-Gen Powerhouse

At the heart of the NVIDIA GeForce RTX 4090 graphics card lies the Ada Lovelace AD102 GPU. The GPU measures 608.4 mm² and will utilize the TSMC 4N process node, which is an optimized version of TSMC's 5nm (N5) node designed for the green team. The GPU features an insane 76.3 billion transistors.
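For a sense of scale, the two figures above can be turned into a rough transistor density. This is only a back-of-the-envelope sketch using the quoted die size and transistor count; the function name is illustrative and not something NVIDIA publishes.

```python
# Figures quoted in the article for the full AD102 die.
TRANSISTORS = 76.3e9      # 76.3 billion transistors
DIE_AREA_MM2 = 608.4      # die size in square millimetres


def density_mtr_per_mm2(transistors, area_mm2):
    """Return transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


print(round(density_mtr_per_mm2(TRANSISTORS, DIE_AREA_MM2), 1))  # 125.4
```

That works out to roughly 125 million transistors per square millimetre on the 4N node.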

The NVIDIA Ada Lovelace AD102 GPU features up to 12 GPCs (Graphics Processing Clusters). That is 5 more GPCs than the Ampere GA102 GPU. Each GPC will consist of 6 TPCs with 2 SMs each, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What's changed is the FP32 & INT32 core configuration. Each SM will include 64 FP32 units, but the combined FP32+INT32 unit count goes up to 128. This is because the 64 FP32 units are separate from the 64 INT32 units rather than sharing the same hardware.

So in total, each sub-core will consist of 16 FP32 plus 16 INT32 units for a total of 32 units. Each SM will have a total of 64 FP32 units plus 64 INT32 units for a total of 128 units. And since there are a total of 144 SM units (12 per GPC), we are looking at a total of 18,432 cores. Each SM will also include two Warp Schedulers (32 threads/CLK) for 64 warps per SM & their own L0 i-cache. This is a 33% increase in warps/threads vs the GA102 GPU. The register file holds 16,384 32-bit entries. Each SM also carries its own 128 KB of L1 data cache and shared memory, so that's 18 MB of L1 cache across the full die.
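The totals above follow directly from the per-block counts. The short sketch below simply multiplies out the figures quoted in the article for the full AD102 die; the constant names are illustrative, and nothing here comes from an NVIDIA tool or API.

```python
# Per-block counts for the full AD102 die, as quoted in the article.
GPCS = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
SUB_CORES_PER_SM = 4
FP32_PER_SUB_CORE = 16
INT32_PER_SUB_CORE = 16
L1_KB_PER_SM = 128

# Multiply the hierarchy out from GPC down to individual units.
sms = GPCS * TPCS_PER_GPC * SMS_PER_TPC
units_per_sm = SUB_CORES_PER_SM * (FP32_PER_SUB_CORE + INT32_PER_SUB_CORE)
total_cores = sms * units_per_sm
total_l1_mb = sms * L1_KB_PER_SM / 1024

print(f"SMs: {sms}")                      # SMs: 144
print(f"Units per SM: {units_per_sm}")    # Units per SM: 128
print(f"Total cores: {total_cores}")      # Total cores: 18432
print(f"Total L1: {total_l1_mb} MB")      # Total L1: 18.0 MB
```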

Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The L2 cache will be increased to 96 MB as mentioned in the leaks. This is a 16x increase over the Ampere GPU that hosts just 6 MB of L2 cache. The cache will be shared across the GPU. The GPU will also feature up to 192 ROPs for the full-die.

There are also going to be the latest 4th Generation Tensor and 3rd Generation RT (Raytracing) cores infused on the Ada Lovelace GPUs which will help boost DLSS & Raytracing performance to the next level. Overall, the Ada Lovelace AD102 GPU will offer:

  • 71% More GPCs (Versus Ampere)
  • 71% More Cores (Versus Ampere)
  • 50% More L1 Cache (Versus Ampere)
  • 16x More L2 Cache (Versus Ampere)
  • 71% More ROPs (Versus Ampere)
  • 4th Gen Tensor & 3rd Gen RT Cores
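Most of the percentages in this list can be checked with a few lines of arithmetic. In the sketch below, the AD102 figures and the 6 MB GA102 L2 cache come from the article, and the 7 GA102 GPCs follow from the "5 more" statement. The GA102 core count (10,752) and ROP count (112) are not given in the article and are filled in here as assumptions for the full GA102 die.

```python
def uplift_percent(new, old):
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100


# Full-die AD102 figures from the article.
ad102 = {"GPCs": 12, "cores": 18432, "ROPs": 192, "L2_MB": 96}

# GA102: GPCs and L2 follow from the article; cores and ROPs are
# assumptions for the full GA102 die, not stated in the article.
ga102 = {"GPCs": 7, "cores": 10752, "ROPs": 112, "L2_MB": 6}

for key in ("GPCs", "cores", "ROPs"):
    print(f"{key}: +{uplift_percent(ad102[key], ga102[key]):.0f}%")

l2_ratio = ad102["L2_MB"] / ga102["L2_MB"]
print(f"L2 cache: {l2_ratio:.0f}x")
```

Under those assumptions, GPCs, cores, and ROPs all scale by the same 12/7 ratio, which is where the repeated 71% figure comes from.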

The full die has not been featured on any GPU so far, not even the L40 which has 2 SMs disabled. It is likely that as yields progress, we will eventually see a gaming and workstation product using the full-fat AD102. Till then, the RTX 4090 is the top gaming graphics card while the RTX 6000 Ada is the top workstation solution.

NVIDIA AD102 'Ada Lovelace' Gaming GPU Block Diagram:

NVIDIA AD102 'Ada Lovelace' Gaming GPU 'SM' Block Diagram:

NVIDIA Founders Edition Designed To Utilize Up To 600W of Power For Higher Overclocking

As for its brand new Founders Edition cards, the GeForce RTX 4090 24 GB and RTX 4080 16 GB, NVIDIA has produced a compact PCB similar to the ones we saw on the previous generation. A PCB like this helps improve airflow and cooling performance.

NVIDIA says that they have further optimized the Dual Axial Flow Through system, increasing fan sizes and fin volume by 10%, offering 20% higher air-flow, and upgrading to a 23-phase power supply (20+3 Phase for RTX 4090). Memory temperatures are reduced, and the new, substantially more powerful Ada GPUs are kept cool in ventilated cases, giving gamers excellent overclocking headroom. NVIDIA went through a rigorous testing procedure and is said to have evaluated as many as 50 fan designs before finalizing the one we are getting on the new cards. The cooler is used to dissipate heat from the heatsink assembly that comprises a vapor chamber, a big jump from the previous design too.

The NVIDIA GeForce RTX 4080 also uses the same cooler as the RTX 4090 Founders Edition and since it has a lower TDP, it should deliver even better thermal performance.

Each GeForce RTX 40 Series Founders Edition graphics card reduces cable clutter by leveraging the new standard GPU power input of next-gen ATX 3.0 power supplies, the PCIe Gen-5 16-pin Connector. This enables you to power GeForce RTX 40 Series graphics cards with just a single cable, improving the aesthetics of your build. If you are using a previous-gen power supply, an adapter cable is included in the box, allowing you to plug in three 8-pin power connectors, with an optional fourth connector for more overclocking headroom. ATX 3.0 power supplies will be available in October from ASUS, Cooler Master, FSP, Gigabyte, iBuyPower, MSI, and ThermalTake, with more models to come.

One advantage of the new 16-pin connector is that while the Founders Edition cards are designed for 450W and 320W, respectively, they can use the extra headroom provided through the new connector for extreme overclocking, with the RTX 4090 going up to the full 600W mark. The new power delivery also gives the RTX 40 series a 10x improvement in response time for power transient management compared to the previous generation.
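The power figures above can be put into a quick budget calculation. The sketch below assumes each legacy 8-pin PCIe connector feeding the adapter is rated for 150W, which is the usual PCIe rating but is not stated in the article; the function names are purely illustrative.

```python
EIGHT_PIN_WATTS = 150        # assumption: standard 8-pin PCIe rating, not from the article
CONNECTOR_LIMIT_WATTS = 600  # 16-pin connector ceiling quoted in the article
RTX_4090_TBP_WATTS = 450     # Founders Edition design power quoted in the article


def adapter_budget(num_8pin):
    """Power available through the adapter, capped at the 16-pin limit."""
    return min(num_8pin * EIGHT_PIN_WATTS, CONNECTOR_LIMIT_WATTS)


def overclock_headroom(tbp_watts, budget_watts):
    """Watts left above the card's design power within a given budget."""
    return budget_watts - tbp_watts


for plugs in (3, 4):
    budget = adapter_budget(plugs)
    spare = overclock_headroom(RTX_4090_TBP_WATTS, budget)
    print(f"{plugs} x 8-pin: {budget}W budget, {spare}W above the RTX 4090's 450W")
```

Under that assumption, three 8-pin plugs cover the RTX 4090's 450W design power exactly, while the optional fourth plug is what opens up the remaining 150W for overclocking.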

The new cards also feature DP 1.4a (4K 12-bit HDR @ 240Hz) and HDMI 2.1 (4K 120Hz HDR / 8K 60Hz HDR). All cards are compliant with the PCIe Gen 4 interface on existing motherboards and also fully support Resizable BAR.

NVIDIA GeForce RTX 4090 Founders Edition PCB:

Next-Gen Micron GDDR6X Dies Run 10C Cooler Thanks To New Process Node

NVIDIA has also leveraged Micron's latest GDDR6X memory chips for its GeForce RTX 40 graphics cards. These run 10°C cooler and are more power efficient, and since they are all 16Gb DRAM dies, they can be placed on one side of the PCB, where they are cooled better than dual-sided memory.

NVIDIA DLSS 3: Compatibility, Feature Set, Gaming Performance & More

Now, let's dive into the technological advancements that allow these incredible achievements. To begin with, NVIDIA engineers started with DLSS Super Resolution and added something called Optical Multi Frame Generation based on Ada's Optical Flow Accelerator. This accelerator analyzes two sequential frames from a particular game, capturing pixel details such as particles, reflections, lighting, and shadows.

On top of that, NVIDIA DLSS 3 also takes into account conventional game engine information such as motion vectors. The DLSS Frame Generation AI convolutional autoencoder network will then decide how to use each of the four inputs (current and prior frames, optical flow field, and motion vectors) to recreate intermediate frames in the best possible way.

NVIDIA DLSS 3 is said to reconstruct 3/4 of the first frame with DLSS Super Resolution and the full second frame with the help of the aforementioned DLSS Frame Generation. Overall, NVIDIA DLSS 3 reconstructs 7/8 of the two total frames displayed, which explains the massive performance uplift.
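The 7/8 figure is just the average of the two per-frame shares quoted above. Here is a minimal sketch of that arithmetic using exact fractions; the function name is illustrative and this is not how DLSS itself is implemented.

```python
from fractions import Fraction


def reconstructed_share(per_frame_shares):
    """Average share of pixels that are reconstructed across the given frames."""
    return sum(per_frame_shares, Fraction(0)) / len(per_frame_shares)


# Frame 1: 3/4 reconstructed by DLSS Super Resolution.
# Frame 2: fully generated by DLSS Frame Generation.
share = reconstructed_share([Fraction(3, 4), Fraction(1)])

print(share)      # 7/8
print(1 - share)  # 1/8 of the displayed pixels are rendered conventionally
```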

Additionally, the new version of the Deep Learning Super Sampling image reconstruction technique also includes the latency-lowering NVIDIA Reflex technology.

As for DLSS GPU support, full DLSS 3 with DLSS Frame Generation will be available across all RTX 40 series GPUs. For the older RTX 20 & RTX 30 series, the technology will be available as the DLSS Super Resolution suite (also on RTX 40). Lastly, NVIDIA Reflex will be supported by GeForce 900 series and above.

Cyberpunk 2077 has been shown running NVIDIA DLSS 3, the brand new Ray Tracing Overdrive, and NVIDIA Reflex with up to 4x improved performance and up to 2x reduced latency. That's not all, as NVIDIA is even promising benefits for CPU-bound games, which generally didn't run much faster with DLSS 2.0. For example, the notoriously CPU-heavy Microsoft Flight Simulator gets up to 2x improved performance with the new DLSS. Overall, NVIDIA said that over 35 games and apps have already pledged support for NVIDIA DLSS 3, including the following:

  • A Plague Tale: Requiem
  • Atomic Heart
  • Black Myth: Wukong
  • Bright Memory: Infinite
  • Chernobylite
  • Conqueror's Blade
  • Cyberpunk 2077
  • Dakar Rally
  • Deliver Us Mars
  • Destroy All Humans! 2 - Reprobed
  • Dying Light 2 Stay Human
  • F1 22
  • F.I.S.T.: Forged In Shadow Torch
  • Frostbite Engine
  • HITMAN 3
  • Hogwarts Legacy
  • ICARUS
  • Jurassic World Evolution 2
  • Justice
  • Loopmancer
  • Marauders
  • Microsoft Flight Simulator
  • Midnight Ghost Hunt
  • Mount & Blade II: Bannerlord
  • Naraka: Bladepoint
  • NVIDIA Omniverse
  • NVIDIA Racer RTX
  • PERISH
  • Portal with RTX
  • Ripout
  • S.T.A.L.K.E.R. 2: Heart of Chornobyl
  • Scathe
  • Sword and Fairy 7
  • SYNCED
  • The Lord of the Rings: Gollum
  • The Witcher 3: Wild Hunt
  • THRONE AND LIBERTY
  • Tower of Fantasy
  • Unity
  • Unreal Engine 4 & 5
  • Warhammer 40,000: Darktide

The NVIDIA GeForce RTX 4080 16 GB and RTX 4080 12 GB graphics cards will be launching in November and be priced at $1199 US and $899 US, respectively.

NVIDIA GeForce RTX 40 Series Official Specs:

| Graphics Card Name | NVIDIA GeForce RTX 4090 | NVIDIA GeForce RTX 4090 D | NVIDIA GeForce RTX 4080 | NVIDIA GeForce RTX 4070 Ti | NVIDIA GeForce RTX 4070 | NVIDIA GeForce RTX 4060 Ti | NVIDIA GeForce RTX 4060 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Name | Ada Lovelace AD102-300 | Ada Lovelace AD102-250 | Ada Lovelace AD103-300 | Ada Lovelace AD104-400 | Ada Lovelace AD104-250 | Ada Lovelace AD106-350 | Ada Lovelace AD107-400 |
| Process Node | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N |
| Die Size | 608 mm² | 608 mm² | 378.6 mm² | 294.5 mm² | 294.5 mm² | 190.0 mm² | 146.0 mm² |
| Transistors | 76 Billion | 76 Billion | 45.9 Billion | 35.8 Billion | 35.8 Billion | 22.9 Billion | TBD |
| CUDA Cores | 16384 | 14592 | 9728 | 7680 | 5888 | 4352 | 3072 |
| TMUs / ROPs | 512 / 176 | TBD | 320 / 112 | 240 / 80 | 184 / 64 | 136 / 48 | TBD |
| Tensor / RT Cores | 512 / 128 | 456 / 128 | 304 / 76 | 240 / 60 | 184 / 46 | 136 / 34 | TBD |
| L2 Cache | 72 MB | 72 MB | 64 MB | 48 MB | 36 MB | 32 MB | 24 MB |
| Base Clock | 2230 MHz | 2280 MHz | 2210 MHz | 2310 MHz | 1920 MHz | 2310 MHz | 1830 MHz |
| Boost Clock | 2520 MHz | 2520 MHz | 2510 MHz | 2610 MHz | 2475 MHz | 2535 MHz | 2460 MHz |
| FP32 Compute | 83 TFLOPs | TBD | 49 TFLOPs | 40 TFLOPs | 29 TFLOPs | 22 TFLOPs | 15 TFLOPs |
| RT TFLOPs | 191 TFLOPs | TBD | 113 TFLOPs | 82 TFLOPs | 67 TFLOPs | 51 TFLOPs | 35 TFLOPs |
| Tensor-TOPs | 1321 TOPs | TBD | 780 TOPs | 641 TOPs | 466 TOPs | 353 TOPs | 242 TOPs |
| Memory Capacity | 24 GB GDDR6X | 24 GB GDDR6X | 16 GB GDDR6X | 12 GB GDDR6X | 12 GB GDDR6X | 8-16 GB GDDR6 | 8 GB GDDR6 |
| Memory Bus | 384-bit | 384-bit | 256-bit | 192-bit | 192-bit | 128-bit | 128-bit |
| Memory Speed | 21.0 Gbps | 21.0 Gbps | 23.0 Gbps | 21.0 Gbps | 21.0 Gbps | 18.0 Gbps | 17.0 Gbps |
| Bandwidth | 1008 GB/s | 1008 GB/s | 736 GB/s | 504 GB/s | 504 GB/s | 288 GB/s (554 GB/s Effective) | 272 GB/s (453 GB/s Effective) |
| TBP | 450W | 425W | 320W | 285W | 200W | 160-165W | 115W |
| Price (MSRP / FE) | $1599 US / 1949 EU | 12,999 RMB (China-Only) | $1199 US / 1469 EU | $799 US | $599 US | $399-$499 US | $299 US |
| Price (Current) | $1599 US / 1859 EU | 12,999 RMB (China-Only) | $1199 US / 1399 EU | $799 US | $599 US | $399-$499 US | $299 US |
| Launch (Availability) | 12th October 2022 | 28th December 2023 | 16th November 2022 | 5th January 2023 | 13th April 2023 | 24th May / 18th July 2023 | 29th June 2023 |
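
Two rows of the spec table can be derived from the others. The sketch below recomputes memory bandwidth from bus width and memory speed, and FP32 compute from CUDA cores and boost clock. The FP32 formula assumes 2 floating-point operations per core per clock (one fused multiply-add), which is the usual convention but is not spelled out in the article; the function names are illustrative.

```python
def bandwidth_gb_s(bus_width_bits, speed_gbps):
    """Raw memory bandwidth: bus width (bits) x per-pin speed (Gbps) / 8 bits per byte."""
    return bus_width_bits * speed_gbps / 8


def fp32_tflops(cuda_cores, boost_mhz):
    """Peak FP32 throughput, assuming 2 FLOPs per core per clock (one FMA)."""
    return 2 * cuda_cores * boost_mhz * 1e6 / 1e12


# (name, CUDA cores, boost MHz, bus width in bits, memory speed in Gbps), from the table.
cards = [
    ("RTX 4090", 16384, 2520, 384, 21.0),
    ("RTX 4080", 9728, 2510, 256, 23.0),
    ("RTX 4070 Ti", 7680, 2610, 192, 21.0),
    ("RTX 4060", 3072, 2460, 128, 17.0),
]

for name, cores, boost, bus, speed in cards:
    print(f"{name}: {bandwidth_gb_s(bus, speed):.0f} GB/s, "
          f"{fp32_tflops(cores, boost):.0f} TFLOPs")
```

For these four cards the results land on the table's raw bandwidth and FP32 figures (for example, 1008 GB/s and 83 TFLOPs for the RTX 4090). The "Effective" bandwidth values for the RTX 4060 series are a separate figure that this raw calculation does not reproduce.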