NVIDIA GeForce RTX 4090 Is The First Gaming Graphics Card To Deliver 100 TFLOPs of Compute Performance

Hassan Mujtaba • Oct 11, 2022 01:29 PM EDT


NVIDIA's GeForce RTX 4090 is the first gaming graphics card to achieve over 100 TFLOPs of compute performance. You can also read our full review of the card here.

Breaking The 100 TFLOPs Barrier! NVIDIA GeForce RTX 4090 Becomes The Fastest Gaming Graphics Card For Compute & Fastest Gaming Graphics Card, Period!

Breaking the 100 TFLOPs barrier is no easy feat. Before today, NVIDIA's fastest gaming graphics card, the GeForce RTX 3090 Ti, delivered only 40 TFLOPs of compute horsepower. The GeForce RTX 4090 gets close to the 100 TFLOPs barrier, but not at its official specifications: NVIDIA states that the GeForce RTX 4090 Founders Edition offers 83 TFLOPs at default settings, which leaves the card 17 TFLOPs shy of the 100 TFLOPs mark.
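For context, the 83 TFLOPs figure lines up with the usual rule of thumb for peak FP32 throughput: CUDA cores x 2 FLOPs per clock (one fused multiply-add) x clock speed. The sketch below is an illustration under that assumption, not NVIDIA's official method. It takes the core count and boost clock from the spec table later in this article, and the function names are hypothetical. It also estimates the clock the card would need to sustain to reach 100 TFLOPs.

```python
CUDA_CORES = 16384            # RTX 4090, from the spec table
BOOST_CLOCK_MHZ = 2520        # rated boost clock, from the spec table
FLOPS_PER_CORE_PER_CLOCK = 2  # assumption: one FMA (2 FLOPs) per core per clock


def peak_fp32_tflops(cuda_cores, clock_mhz):
    """Theoretical peak FP32 throughput in TFLOPs (hypothetical helper)."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6 / 1e12


def clock_mhz_for_tflops(cuda_cores, target_tflops):
    """Clock in MHz needed to hit a TFLOPs target under the same assumption."""
    return target_tflops * 1e12 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e6


stock_tflops = peak_fp32_tflops(CUDA_CORES, BOOST_CLOCK_MHZ)
required_mhz = clock_mhz_for_tflops(CUDA_CORES, 100)

print(f"Stock peak FP32: {stock_tflops:.1f} TFLOPs")
print(f"Clock needed for 100 TFLOPs: {required_mhz:.0f} MHz")
```

Under these assumptions the stock card works out to roughly 82.6 TFLOPs, which rounds to the quoted 83, and the 100 TFLOPs mark needs a sustained clock a little above 3050 MHz.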

So we decided to test how far we could push the NVIDIA GeForce RTX 4090 Founders Edition with some overclocking. To get to 100 TFLOPs, we first pushed the power limit and temperature limit sliders all the way to the max and raised the core and memory clocks by +275 MHz and +1100 MHz, respectively. This wasn't enough, as the card was being limited by its power design. That is when we got our hands on MSI's latest Afterburner, which allowed us to raise the core voltage. At +100%, we saw some performance regression, so we stuck with +55%, which gave us good results.

With the overclock applied to our NVIDIA GeForce RTX 4090 graphics card, we saw a maximum GPU core clock of 3150 MHz on the AD102 Ada GPU, a maximum power draw of 547W, and temperatures that peaked at 69C. All of this was done on air; no exotic liquid cooling, chillers or LN2 were used.

And behold, we saw the magical number of not 100 but almost 101 TFLOPs right in front of our eyes. To put things into perspective, this is a 22% compute boost over the stock RTX 4090 and a 2.5x compute performance boost over the RTX 3090 Ti. The AD102 GPU also ripped apart the data-center-focused Hopper H100 GPU by offering over 50% better FP32 performance. Ada Lovelace is truly a game changer, and we can definitely see it becoming a popular compute and AI graphics card when the Quadro variants of the chip launch as the RTX 6000 ADA and L60.
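The percentage and multiplier quoted above are plain ratios of the TFLOPs figures in this article. A quick sketch of that arithmetic, taking "almost 101" as 101 TFLOPs (a rounding assumption) and using variable names chosen only for illustration:

```python
oc_tflops = 101.0           # overclocked RTX 4090, "almost 101" rounded up
stock_tflops = 83.0         # RTX 4090 Founders Edition at default settings
rtx_3090_ti_tflops = 40.0   # RTX 3090 Ti, as quoted in the article

# Relative gain of the overclock over stock, in percent
gain_over_stock_pct = (oc_tflops / stock_tflops - 1) * 100

# How many times faster than the RTX 3090 Ti
ratio_vs_3090_ti = oc_tflops / rtx_3090_ti_tflops

print(f"Gain over stock RTX 4090: {gain_over_stock_pct:.1f}%")
print(f"Ratio vs RTX 3090 Ti: {ratio_vs_3090_ti:.2f}x")
```

The gain over stock comes out just under 22%, and the ratio against the RTX 3090 Ti is a little over 2.5x, in line with the rounded figures in the text.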

FP32 Compute Horsepower Comparisons (Higher is Better)

[Chart: compute power of the RTX 4090 OC, RTX 4090 Stock, RTX 3090 Ti, RX 6900 XTX, Xbox Series X and PlayStation 5]

NVIDIA GeForce RTX 4090 'Official' Specifications - $1599 US Pricing

The NVIDIA GeForce RTX 4090 will use 128 of the 144 SMs on the full AD102 GPU for a total of 16,384 CUDA cores. The GPU will come packed with 72 MB of L2 cache and a total of 176 ROPs, which is simply insane.
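The core count follows from the SM count if we assume 128 FP32 CUDA cores per SM on Ada Lovelace. That per-SM figure is not stated in this article, so treat it as an assumption. A minimal sketch:

```python
CORES_PER_SM = 128   # assumption: FP32 CUDA cores per Ada Lovelace SM
ENABLED_SMS = 128    # SMs enabled on the RTX 4090
FULL_DIE_SMS = 144   # SMs on the full AD102 die

enabled_cores = ENABLED_SMS * CORES_PER_SM
full_die_cores = FULL_DIE_SMS * CORES_PER_SM
enabled_fraction_pct = ENABLED_SMS / FULL_DIE_SMS * 100

print(f"RTX 4090 CUDA cores: {enabled_cores}")
print(f"Full die CUDA cores: {full_die_cores}")
print(f"Share of the die enabled: {enabled_fraction_pct:.1f}%")
```

With that assumption, 128 SMs give the 16,384 cores quoted above, and roughly 89% of the die is enabled.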

As for memory specs, the GeForce RTX 4090 will feature 24 GB of GDDR6X memory clocked at 21 Gbps across a 384-bit bus interface. This will provide up to 1 TB/s of bandwidth, the same as the existing RTX 3090 Ti graphics card. As far as power consumption is concerned, the TBP is rated at 450W. The card will be powered by a single 16-pin connector which delivers up to 600W of power. Custom models will offer higher TBP targets.
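The bandwidth figure can be reproduced from the per-pin data rate and the bus width: each of the 384 data pins moves 21 Gbit/s, and dividing by 8 converts bits to bytes. A short sketch (the function name is hypothetical):

```python
def memory_bandwidth_gbs(speed_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s from per-pin speed (Gbps) and bus width."""
    return speed_gbps * bus_width_bits / 8  # 8 bits per byte


rtx_4090_bandwidth = memory_bandwidth_gbs(21.0, 384)
print(f"RTX 4090 memory bandwidth: {rtx_4090_bandwidth:.0f} GB/s")
```

This gives 1008 GB/s, the value listed in the spec table and the "1 TB/s" rounded figure in the text.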

The NVIDIA GeForce RTX 4090 GPU officially hits retail shelves tomorrow when NVIDIA and custom card partners' designs become available to the public. You can check out our review here.

NVIDIA GeForce RTX 40 Series Official Specs:

Graphics Card Name	NVIDIA GeForce RTX 4090	NVIDIA GeForce RTX 4090 D	NVIDIA GeForce RTX 4080	NVIDIA GeForce RTX 4070 Ti	NVIDIA GeForce RTX 4070	NVIDIA GeForce RTX 4060 Ti	NVIDIA GeForce RTX 4060
GPU Name	Ada Lovelace AD102-300	Ada Lovelace AD102-250	Ada Lovelace AD103-300	Ada Lovelace AD104-400	Ada Lovelace AD104-250	Ada Lovelace AD106-350	Ada Lovelace AD107-400
Process Node	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N
Die Size	608mm2	608mm2	378.6mm2	294.5mm2	294.5mm2	190.0mm2	146.0mm2
Transistors	76 Billion	76 Billion	45.9 Billion	35.8 Billion	35.8 Billion	22.9 Billion	TBD
CUDA Cores	16384	14592	9728	7680	5888	4352	3072
TMUs / ROPs	512 / 176	TBD	320 / 112	240 / 80	184 / 64	136 / 48	TBD
Tensor / RT Cores	512 / 128	456 / 128	304 / 76	240 / 60	184 / 46	136 / 34	TBD
L2 Cache	72 MB	72 MB	64 MB	48 MB	36 MB	32 MB	24 MB
Base Clock	2230 MHz	2280 MHz	2210 MHz	2310 MHz	1920 MHz	2310 MHz	1830 MHz
Boost Clock	2520 MHz	2520 MHz	2510 MHz	2610 MHz	2475 MHz	2535 MHz	2460 MHz
FP32 Compute	83 TFLOPs	TBD	49 TFLOPs	40 TFLOPs	29 TFLOPs	22 TFLOPs	15 TFLOPs
RT TFLOPs	191 TFLOPs	TBD	113 TFLOPs	82 TFLOPs	67 TFLOPs	51 TFLOPs	35 TFLOPs
Tensor-TOPs	1321 TOPs	TBD	780 TOPs	641 TOPs	466 TOPs	353 TOPs	242 TOPs
Memory Capacity	24 GB GDDR6X	24 GB GDDR6X	16 GB GDDR6X	12 GB GDDR6X	12 GB GDDR6X	8-16 GB GDDR6	8 GB GDDR6
Memory Bus	384-bit	384-bit	256-bit	192-bit	192-bit	128-bit	128-bit
Memory Speed	21.0 Gbps	21.0 Gbps	23.0 Gbps	21.0 Gbps	21.0 Gbps	18.0 Gbps	17.0 Gbps
Bandwidth	1008 GB/s	1008 GB/s	736 GB/s	504 GB/s	504 GB/s	288 GB/s (554 GB/s Effective)	272 GB/s (453 GB/s Effective)
TBP	450W	425W	320W	285W	200W	160-165W	115W
Price (MSRP / FE)	$1599 US / 1949 EU	12,999 RMB (China-Only)	$1199 US / 1469 EU	$799 US	$599 US	$399-$499 US	$299 US
Price (Current)	$1599 US / 1859 EU	12,999 RMB (China-Only)	$1199 US / 1399 EU	$799 US	$599 US	$399-$499 US	$299 US
Launch (Availability)	12th October 2022	28th December 2023	16th November 2022	5th January 2023	13th April 2023	24th May / 18th July 2023	29th June 2023
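
As a sanity check on the table, the FP32 Compute row can be recomputed from the CUDA Cores and Boost Clock rows, assuming 2 FLOPs (one fused multiply-add) per core per clock. This is a sketch of that cross-check, not NVIDIA's official method. The RTX 4090 D is skipped because its FP32 figure is listed as TBD.

```python
# (name, CUDA cores, boost clock in MHz, quoted FP32 TFLOPs), all from the table
cards = [
    ("RTX 4090", 16384, 2520, 83),
    ("RTX 4080", 9728, 2510, 49),
    ("RTX 4070 Ti", 7680, 2610, 40),
    ("RTX 4070", 5888, 2475, 29),
    ("RTX 4060 Ti", 4352, 2535, 22),
    ("RTX 4060", 3072, 2460, 15),
]

mismatches = []
for name, cores, boost_mhz, quoted in cards:
    # Assumption: peak FP32 = cores * 2 FLOPs per clock * clock
    computed = cores * 2 * boost_mhz * 1e6 / 1e12
    print(f"{name}: computed {computed:.2f} TFLOPs, quoted {quoted} TFLOPs")
    if round(computed) != quoted:
        mismatches.append(name)

print("Mismatches:", mismatches)
```

Every quoted value matches the computed one after rounding to the nearest whole TFLOP.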
