Forging ahead with a chiplet design philosophy tied to an all-new RDNA 3 GPU architecture, AMD is taking a calculated risk by introducing two premium graphics cards to its 2022 roster. Radeon RX 7900 XTX 24GB and RX 7900 XT 20GB are the cards in question, priced at $999 (£999.99) and $899 (£899.99), respectively.
AMD Radeon RX 7900 XT
£899 / $899
Pros
- Great raster performance
- Sensible pricing
- 20GB memory
- DisplayPort 2.1
Cons
- RT not at Nvidia levels
- No frame generation (yet)
Before we delve deeper into benchmarks posted by the Radeon RX 7900 XT in this review, it is important to understand and appreciate the motivations that have led AMD down this path.
Sit back, grab a cup of coffee, and let us take you through the wonderful world of chiplets, RDNA 3 architecture, and performance potential among the best graphics cards.
RDNA 3 – Faster And Smarter
Understanding the impetus behind AMD’s newest GPU architecture first requires an appreciation of two factors affecting the production of cutting-edge silicon. As we move down to ever-smaller nodes, the first is a lack of scaling in two important metrics: memory and analogue I/O, both of which are liberally used in GPU design.
The point here is it’s not worth having these two facets on the latest, most expensive nodes if there is no immediate benefit to doing so, because you’re paying over the odds for no discernible gain. On the other hand, logic – the building block of the compute cores – appears to still scale well, meaning it’s worth investing in smaller process nodes for the attendant gains.
The second point is related to the first insofar as moving between the latest nodes is becoming significantly more expensive. Only use, say, 5nm production if you absolutely have to, as while it makes powerful chips physically feasible, there comes a point when they’re not financially viable. For example, building a monster, monolithic 700mm² die purely on 5nm is an exercise in inflating end-user cost to arguably unpalatable levels.
Such thinking coalesces into AMD’s fundamental fabrication strategy for high-end graphics moving forwards. Knowing GPUs are essentially huge calculators processing instructions in parallel, why not break down the constituent parts of the design into small blocks, which naturally yield better, and lasso them together with high-speed smarts? It’s a strategy that has worked wonders in the CPU space with chiplet-based Ryzen processors since 2017. But if something was easy, it would have been implemented by now.
GPU Chiplets Provide Unique Challenges
Though brilliant on paper, there are myriad reasons why a chiplet-based GPU is a difficult proposition, exemplified by the graphic below.
Jumping from a monolithic CPU design to a chiplet-based approach requires invention of high-speed, low-latency interconnects between what are termed Core Complex Dies (CCDs) and a central hub. That link is AMD-developed Infinity Fabric, which connects said chiplets with 100s of wires between dies.
Infinity Fabric is a fantastic feat of engineering, but AMD cannot take the same approach with a chiplet-based GPU, and the reason has to do with the massive amounts of data traversing a GPU at any moment. Having upwards of 5,000 cores means the bandwidth required is an order of magnitude higher, and a standard Infinity Fabric approach simply cannot carry that amount of information in a design that has to sell at a price that still turns a profit.
Putting this problem into intelligible context, AMD estimates a multi-chip GPU needs over 10x the bandwidth offered by the latest Epyc server processors. What to do and how to do it, eh? This is where smart engineering comes to the fore.
AMD’s approach is to flip the problem on its head, and by that we mean build a multi-chip GPU where the bandwidth-hungry memory partitions are manifested as smaller chiplets while the big computational engines are kept in a larger central die. If you recall, Ryzen processors do the reverse by moving cores to chiplets and memory to the IOD.
RDNA 3 is therefore built by having multiple Memory Cache Dies (MCDs), fabricated on a 6nm process, circling a larger, 5nm-based Graphics Compute Die (GCD). Each MCD carries L3 cache and GDDR6 memory links to the GCD, and accumulating the contents of each MCD provides a good inkling of a card’s memory-side potential.
The GCD meanwhile, and just like Highlander there is only one, is home to familiar Compute Units (CUs) resident with shader cores, ray tracing cores, texture units, and the like.
Linking MCDs to the GCD with the requisite bandwidth is the real triumph for RDNA 3. AMD calls this special sauce Infinity Fabric Fanout Technology (IFFT), illustrated in the slide above. Going by AMD’s presentations, each wire is far, far smaller than its CPU counterpart. If we crunch the numbers, GPU-optimised IFFT is over 50x as dense and provides 10x higher bandwidth than Infinity Fabric On Package (IFOP) used for CPUs. Take a moment for that to sink in.
We needed a new package technology and a new link technology. And that’s what we have implemented.
Sam Naffziger, AMD Senior Vice President and Product Technology Architect
Without IFFT, make no mistake, there is no chiplet-based RDNA 3, as scaling incumbent CPU IFOP technology would result in huge amounts of power diverted solely for the interconnects, let alone any die-space ramifications. Nevertheless, building links between chiplets isn’t a free energy lunch. IFFT takes crucial space and power to perform its high-speed data shuttles at 9.2Gb/s. We estimate adding relevant links, as opposed to a purely monolithic design, increases die area by close to 15 per cent while adding precious watts to overall consumption.
As it is, each TSMC-fabricated 6nm MCD weighs in at around 37mm² and houses some 2bn transistors. That’s a combined 222mm² and 12bn transistors right there for Navi 31 silicon. CU-housing GCD is a significant 300mm², meaning the combined size of the Navi multi-chip GPU is 522mm².
An interesting question arises. Is it better to have a 5nm monolith chip at 450mm² – the size we estimate Navi 31 would be on a single piece of silicon; there would be no space- and power-taking IFFT links – or a multi-chip 5nm/6nm at 522mm²? Not an easy one to answer, and one we’re sure AMD’s bean counters wrestled with.
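The die-area arithmetic in this section can be checked with a few lines of Python. This is a rough sketch that uses only the approximate figures quoted above; the constant and function names are our own, and the 450mm² monolithic size is the estimate given in the text rather than a measured value.

```python
# Approximate Navi 31 figures as quoted in the text (names are illustrative).
MCD_COUNT = 6
MCD_AREA_MM2 = 37              # area of one 6nm Memory Cache Die
MCD_TRANSISTORS_BN = 2         # transistors per MCD, in billions
GCD_AREA_MM2 = 300             # area of the 5nm GCD
MONOLITHIC_ESTIMATE_MM2 = 450  # estimated single-die size, per the text


def chiplet_totals(mcd_count=MCD_COUNT):
    """Sum MCD area and transistors, then add the GCD for total silicon area."""
    mcd_area = mcd_count * MCD_AREA_MM2
    return {
        "mcd_area_mm2": mcd_area,
        "mcd_transistors_bn": mcd_count * MCD_TRANSISTORS_BN,
        "total_area_mm2": mcd_area + GCD_AREA_MM2,
    }


def area_overhead_percent(chiplet_mm2, monolithic_mm2=MONOLITHIC_ESTIMATE_MM2):
    """Extra silicon area of the chiplet layout versus the monolithic estimate."""
    return (chiplet_mm2 / monolithic_mm2 - 1) * 100


totals = chiplet_totals()
print(totals)
print(f"{area_overhead_percent(totals['total_area_mm2']):.0f}% more silicon")
```

On these numbers the multi-chip layout uses roughly 16 per cent more silicon than the hypothetical monolith, which sits close to the 15 per cent overhead estimated earlier.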
Another problem facing an IFFT solution is, on paper, one of latency. AMD acknowledges running Infinity Links will always create more latency than conducting transfers within one monolithic die. This issue is shown by the Navi 31 bars marginally higher than Navi 21. However, the combination of a 43 per cent higher IF clock and higher game clock ameliorates the issue. In fact, AMD contends Navi 31 has lower latency… but how much lower could it have been on a monolithic design?
We come away from the chiplet discussion with the thinking that AMD has spent a lot of time and resource in mitigating obvious issues emanating from going down a multi-chip route on a GPU. The pain is worth the long-term benefits, according to AMD, but it seems like a lot of effort.
The chiplet approach ought to pay dividends further down the line when we have, say, 12 MCDs surrounding one or two GCDs. Breaking down the parts into manageable chunks, with presumably better yields than a monolithic monster, is where AMD is aiming. Whether it’s chiplets or tiles, we can foresee Nvidia adopting a similar approach in years to come.
RDNA 3 – Efficiency At The Core
Feast your eyes on that, will you? Having marked off the first part of our discussion, as it pertains to a chiplet design and the whys and wherefores of going down that road, the second is RDNA 3 architecture.
It’s hard to miss AMD’s claims of RDNA 3 ‘architected to exceed 3GHz.’ This certainly isn’t the case in the first two cards based on this design, namely Radeon RX 7900 XTX and XT, suggesting AMD hasn’t achieved lofty frequency aspirations in the first retail salvo. A misjudgement of the architecture or relatively poor frequency yields from foundry partner TSMC? Probably a bit of both.
Nevertheless, there is positive news. Compared to Navi 21, which you’ll know as high-end Radeon RX 6000 Series, AMD claims a 54 per cent performance-per-watt increase. That’s nothing to be sniffed at, of course, and the rest of our discussion focusses on how AMD has achieved this number.
Performance naturally goes up for Navi 31 as it fits in more Compute Units than its immediate predecessor; moving from 80 CUs to 96 CUs gives a 20 per cent boost right off the bat. Shoving in more CUs is entirely expected when moving between architectures and processes. Low-hanging fruit and all that.
Yet the devil is very much in the details. Take another look at the two CU blocks, or pairs. AMD refers to it as a single CU, of which, if you recall, there are 96 in the Big Boy design. But it’s not one CU, is it, as the second is a mirror of the first.
This has important ramifications for performance, and why comparing generation-on-generation is fast bordering on pointless. AMD has effectively doubled each CU’s ALU capability – initialisms, ahoy – meaning there’s double the throughput for floating-point tasks on a shader-to-shader basis.
Breaking it down, each RDNA 3 64-ALU block, or CU, has twice as many FP32 pathways as its RDNA 2 equivalent. Here’s where terminology becomes difficult when referring to ALUs, FP32 units and INT32 units, but suffice to say, FP operations, which are a mainstay of gaming, are much improved.
On the integer side, AMD has all-new AI Matrix Accelerators which include bfloat16 and WMMA64 Dot4 support, primarily to help with convolution, which is one of the fundamental operations of deep neural networks with demanding matrix computation. In other words, it is AMD’s answer to Nvidia’s Tensor Cores. Little more is known of the technology other than basic specs and a purported 2.7x matrix speed increase compared to RDNA 2.
Cache Me In
Having a more robust, faster core and memory subsystem helps, but most iterative GPU architectures tend to devote increasing amounts of space for larger caches. Just see what Nvidia did with RTX 40 Series.
Going through the information, it’s easy to see where AMD has doubled sizes compared to RDNA 2. Certainly not the massive increases instigated by Nvidia, yet it stands to reason that caches need to be upgraded to deal with more innate processing power.
The one outlier from this approach is AMD Infinity Cache. Those familiar with how AMD builds recent GPUs will know Radeons have a slab of goodly cache residing between on-chip L2 and external GDDR6. Call it an L3 if you will. RDNA 2’s Infinity Cache topped out at 128MB on Radeon RX 6950 XT. AMD drops it to 96MB for RDNA 3. If larger caches are better and there’s been plenty of increase on L0, L1, and L2, what gives? Good question.
It’s not the size, it’s how you use it, seems to be AMD’s response. Rather than scale this apparent L3 to 128MB+, which takes up valuable real-estate space, AMD has widened the conduit between L2 and Infinity Cache by 2.25x. That’s no small potatoes. The aforementioned RX 6950 XT moves 1,024 bytes per clock and has an SoC operating at 1.93GHz, achieving total bandwidth of 2TB/s. Navi 31’s best SoC operates at 2.3GHz. Do the math and we reach 5.3TB/s at this juncture alone, without taking the much wider, faster GDDR6 into account.
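For anyone who wants to do the math, here is a minimal Python sketch of the Infinity Cache bandwidth sum. It assumes the 2.25x increase applies directly to the 1,024 bytes-per-clock width quoted for RX 6950 XT; the function name is ours.

```python
def cache_bandwidth_tbs(bytes_per_clock, clock_ghz):
    """Peak bandwidth in TB/s: bytes per clock times clocks per second."""
    return bytes_per_clock * clock_ghz * 1e9 / 1e12


# RX 6950 XT (Navi 21): 1,024 bytes per clock at 1.93GHz
navi21_tbs = cache_bandwidth_tbs(1024, 1.93)

# Navi 31: assumed 2.25x wider link (2,304 bytes per clock) at 2.3GHz
navi31_tbs = cache_bandwidth_tbs(1024 * 2.25, 2.3)

print(f"Navi 21: {navi21_tbs:.2f} TB/s")
print(f"Navi 31: {navi31_tbs:.2f} TB/s")
```

The results land at just under 2TB/s and roughly 5.3TB/s respectively, matching the rounded figures above.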
In that regard, Navi 31 (RX 7900 XTX) has a maximum 384-bit memory bus feeding GDDR6 memory operating at 20Gbps, whereas Navi 21 (RX 6950 XT) is only 256 bits wide and 18Gbps. 576GB/s cannot compete with 960GB/s on today’s champ.
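Those external memory figures follow from bus width and per-pin data rate. A short Python sketch, using a helper name of our own choosing, shows the working:

```python
def gddr6_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s.

    Each pin transfers data_rate_gbps gigabits per second; multiply by the
    bus width and divide by eight to convert bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


rx_7900_xtx_gbs = gddr6_bandwidth_gbs(384, 20)
rx_6950_xt_gbs = gddr6_bandwidth_gbs(256, 18)

print(f"RX 7900 XTX: {rx_7900_xtx_gbs:.0f} GB/s")
print(f"RX 6950 XT:  {rx_6950_xt_gbs:.0f} GB/s")
```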
Ray Tracing Overhaul
AMD introduced dedicated hardware ray tracing units in RDNA 2. Yet while rasterisation performance was roughly analogous to price-comparable Nvidia RTX 30 Series, RT lagged behind considerably. The gap has become a chasm with the introduction of RTX 40 Series, so if you want best-in-class lighting, Team Green is the way to go.
Looking to arrest this gulf, AMD has invigorated the RT units within RDNA 3. As RT hardware is tied to CUs on a one-to-one basis, there is more ray tracing potential through sheer numbers – 96 Ray Accelerators vs. 80.
AMD’s focus on RDNA 3 isn’t an entirely new RT unit – that would be too costly to implement at this stage – but one of improving what it has to work with. Part of this rests with efficiency, insofar as not doing work that’s of no use. RDNA 3 introduces an early subtree culling to remove unnecessary calculations by skipping parts of the acceleration structure during traversal.
There are further improvements, too, from having more rays in flight to natural uplifts caused by larger caches, CUs, and so forth, but the bottom line is that combined efforts are no silver bullet; we’re not going to see Nvidia RTX-like numbers anytime soon. In fact, though AMD contends an 80 per cent improvement for RDNA 3 over RDNA 2, we’d put the best-case RT scenario as one matching premium RTX 30 Series cards. RTX 40 Series will remain in a different league.
Bits And Bobs
Head back up and look across to the advancements in the pixel pipe. Complementing the increased top-end muscle, AMD ups ROP count by 50 per cent, from 128 to 192.
We touched on bandwidth hikes earlier, yet it’s prudent to recap. AMD’s Infinity Cache is 165 per cent faster and external memory affords 67 per cent more bandwidth than the previous generation. You gotta feed the CU beast.
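The percentages in this recap can be reproduced from figures quoted elsewhere in the review: 128 to 192 ROPs, roughly 2TB/s to 5.3TB/s for Infinity Cache, and 576GB/s to 960GB/s for GDDR6. The sketch below is illustrative only, and the helper name is ours.

```python
def percent_uplift(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1) * 100


uplifts = {
    "ROPs": percent_uplift(128, 192),
    "Infinity Cache bandwidth (TB/s)": percent_uplift(2.0, 5.3),
    "GDDR6 bandwidth (GB/s)": percent_uplift(576, 960),
}

for name, value in uplifts.items():
    print(f"{name}: +{value:.0f}%")
```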
On the display side of things, AMD keeps to HDMI 2.1a of the previous generation and adds in DisplayPort 2.1, which is a feature missing on rival Nvidia RTX 40 Series cards.
A Dual Media Engine brings hardware-accelerated support for AV1 encode and decode up to 8K60, among other niceties, and outputs are such that a single card can drive four 4K144 displays.
Enter Radeon RX 7900 XTX and XT
Radeon | RX 7900 XTX | RX 7900 XT | RX 6950 XT | RX 6800 XT |
---|---|---|---|---|
Launch date | Dec 2022 | Dec 2022 | May 2022 | Nov 2020 |
Codename | Navi 31 | Navi 31 | Navi 21 | Navi 21 |
Architecture | RDNA 3 | RDNA 3 | RDNA 2 | RDNA 2 |
Process (nm) | 5/6 | 5/6 | 7 | 7 |
Transistors (bn) | 57.7 | 57.7 | 26.8 | 26.8 |
Die size (mm²) | 522 | 522 | 520 | 520 |
Compute Units | 96 of 96 | 84 of 96 | 80 of 80 | 72 of 80 |
ALUs | 6,144 | 5,376 | 5,120 | 4,608 |
Boost clock (MHz) | 2,500 | 2,400 | 2,310 | 2,250 |
Peak FP32 TFLOPS | 61.44 | 51.61 | 23.65 | 20.74 |
RT cores | 96 | 84 | 80 | 72 |
AI cores | 192 | 168 | – | – |
ROPs | 192 | 192 | 128 | 128 |
Infinity Cache (MB) | 96 | 80 | 128 | 128 |
Memory size (GB) | 24 | 20 | 16 | 16 |
Memory type | GDDR6 | GDDR6 | GDDR6 | GDDR6 |
Memory bus (bits) | 384 | 320 | 256 | 256 |
Memory clock (Gbps) | 20 | 20 | 18 | 16 |
Bandwidth (GB/s) | 960 | 800 | 576 | 512 |
Power (watts) | 355 | 315 | 335 | 300 |
Launch MSRP ($) | 999 | 899 | 1,099 | 649 |
All of our discussion has rightfully centred on the overarching RDNA 3 architecture and consequent Navi 31 GPU. Both terms describe the available hardware tools from which AMD constructs retail cards. Those are Radeon RX 7900 XTX and Radeon RX 7900 XT, hewn from Navi 31, albeit differently.
Shifting gears to RX 7900 XT, it uses 57.7bn transistors spread over the five (out of six) MCDs and GCD. As each MCD contains 64-bit access to 4GB of GDDR6 memory, deactivating one reduces the framebuffer size to 20GB and bus to 320 bits.
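Because bus width and capacity scale with the number of active MCDs, the relationship is easy to express in code. The Python sketch below takes the 64-bit and 4GB per-MCD figures from the paragraph above; the function name and the six-MCD ceiling check are our own additions.

```python
BUS_BITS_PER_MCD = 64   # memory interface width per MCD, per the text
MEMORY_GB_PER_MCD = 4   # GDDR6 attached to each MCD, per the text
MAX_MCDS = 6            # full Navi 31 configuration


def memory_config(active_mcds):
    """Return bus width and framebuffer size for a given number of active MCDs."""
    if not 1 <= active_mcds <= MAX_MCDS:
        raise ValueError("active_mcds must be between 1 and 6")
    return {
        "bus_width_bits": active_mcds * BUS_BITS_PER_MCD,
        "memory_gb": active_mcds * MEMORY_GB_PER_MCD,
    }


print("RX 7900 XTX:", memory_config(6))
print("RX 7900 XT: ", memory_config(5))
```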
Reductions in the MCD quantity are met with cuts to the GCD. Full-on Navi 31 is home to 96 Compute Units, which is what we see on Radeon RX 7900 XTX. The XT variant loses 12 CUs, dropping to 84 overall, and that means shaders also fall from a maximum 6,144 to 5,376. Get the feeling AMD wants this card to be up to 15 per cent slower? The Infinity Cache is also trimmed, down to 80MB from 96MB.
Peak boost speed of 2,400MHz translates to 51.61TFLOPS throughput, which is still more than double that of Radeon RX 6950 XT. Go stick that in your pipe and smoke it, Navi 21!
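The TFLOPS figures in the table can be derived from CU count and boost clock. The Python sketch below assumes 64 ALUs per CU, one fused multiply-add (counted as two operations) per ALU per clock, and a further doubling on RDNA 3 to reflect its doubled FP32 pathways. Those per-clock factors are our assumption, chosen because they are consistent with the table, and the function name is ours.

```python
ALUS_PER_CU = 64  # each CU is described in the review as a 64-ALU block


def peak_fp32_tflops(compute_units, boost_mhz, dual_issue):
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes 2 FLOPs (one fused multiply-add) per ALU per clock, doubled
    when the architecture can dual-issue FP32 (assumed here for RDNA 3).
    """
    alus = compute_units * ALUS_PER_CU
    flops_per_alu_per_clock = 4 if dual_issue else 2
    return alus * flops_per_alu_per_clock * boost_mhz * 1e6 / 1e12


# (compute units, boost clock in MHz, dual-issue FP32?)
cards = {
    "RX 7900 XTX": (96, 2500, True),
    "RX 7900 XT": (84, 2400, True),
    "RX 6950 XT": (80, 2310, False),
    "RX 6800 XT": (72, 2250, False),
}

for name, (cus, mhz, dual) in cards.items():
    print(f"{name}: {peak_fp32_tflops(cus, mhz, dual):.2f} TFLOPS")
```

These are paper numbers only; as the benchmarks show, doubled peak throughput does not translate into doubled frame rates.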
If we’re being critical, Radeon RX 7900 XT 20GB feels overpriced compared to its more powerful sibling. The ideal price is $799, leaving plenty of room for partners to build overclocked models occupying the larger gap between the pair. AMD doesn’t want to sell any Navi 31 silicon that cheap, understandably enough, but it’s forcing add-in board partners down a financial cul-de-sac.
The Card
Radeon RX 7900 XTX is covered over here, but this review’s interest is the lesser, but still very capable Radeon RX 7900 XT.
The MBA reference card follows general design cues laid down by its bigger brother, extending to a stealthy black aesthetic and Batman-style curves and accents. It’s a good-looking thing, all right, and partners are going to have a difficult time in making their designs more petite or more beautiful.
Speaking of the former first, AMD manages to shoehorn the 315W TDP second-rung Radeon RX 7000 Series GPU into a card measuring 276mm long, 113mm tall, and 50mm thick. Matching XTX’s 2.5-slot footprint, XT remains smaller in other dimensions, enabling easy fitment into space-restrained chassis.
All in all, Radeon RX 7900 XT’s volume extends to 1,394cm³, compared with 1,800cm³ for RX 7900 XTX and a whopping 2,540cm³ for Nvidia’s rival GeForce RTX 4080 Founders Edition. Mentioned tongue-in-cheek, AMD may well have a performance-per-card-volume graph in upcoming briefings.
Weighing in at 1,520g, the card is about 290g lighter than XTX, with most of the weight loss coming from the use of a smaller heatsink. AMD also pares back power delivery, which drops from XTX’s 20 phases to 17 here.
Build quality, we must add, is exemplary. There are no rattles, squeaks, or use of substandard materials anywhere. This is a premium card which befits the top-tier performance AMD is aiming for.
The full-coverage rear heatsink is a nice touch. Not only does it look good, but extra card rigidity is formed by having an all-round cooling solution. Being 10mm less tall than the XTX model means the rear heatsink, whilst looking almost identical, is bespoke for this model.
And this relative lack of height compared to its bigger brother manifests in other ways, too, as the three fans are 5mm smaller and there’s no space for – or AMD chose not to include – the two lighting strips above and below the central spinner.
RX 7900 XT shifts the Radeon branding to a full-on horizontal orientation – XTX’s is angled – but otherwise the two pixel munchers look eerily similar.
The dual eight-pin power connectors can’t have escaped your attention, right? This 315W TDP card doesn’t go down the Nvidia single-connector route, and all the travails that have beset it, and trusts in the usual arrangement of today’s high-end GPUs. Nothing wrong with that.
AMD’s ‘Radiance’ Display Engine feeds the four I/O ports. There are two DisplayPort 2.1 outputs – compatible monitors are coming in early 2023, we’re informed – alongside HDMI 2.1a and a USB-C port that also supports DisplayPort 2.1. AMD makes a big deal out of this feature as rival GeForce RTX 40 Series, for some strange reason, persists with DisplayPort 1.4a.
How about some benchmarks, people?
Performance
Our 5950X Test PCs
Club386 carefully chooses each component in a test bench to best suit the review at hand. When you view our benchmarks, you’re not just getting an opinion, but the results of rigorous testing carried out using hardware we trust.
Club386 test platform components:
CPU: AMD Ryzen 9 5950X
Motherboard: Asus ROG X570 Crosshair VIII Formula
Cooler: Corsair Hydro Series H150i Pro RGB
Memory: 32GB G.Skill Trident Z Neo DDR4
Storage: 2TB Corsair MP600 SSD
PSU: be quiet! Straight Power 11 Platinum 1300W
Chassis: Fractal Design Define 7 Clear TG
Our trusty test platforms have been working overtime these past few months, and though the PCIe slot is starting to look worse for wear, the AM4 rigs haven’t skipped a beat.
Knowing the XT vintage is missing 12 CUs, has lower peak clocks and less memory bandwidth to play with when compared to full-fat XTX, it does rather well in our first test.
Comfortably faster than a last-gen RTX 3090 Ti – which, if you recall, cost $1,999 – AMD’s $899 offering posts good value from the off.
It’s good to see a second-rung AMD card provide better ray tracing performance than the best model of the last generation, to the tune of 52 per cent. Problem is, rival Nvidia GPUs are ever so strong in this important area. A good step forward for AMD, but there is a long way to go.
Hmm. Something’s not quite right with the results from this test. Investigation hat on.
Assassin’s Creed Valhalla
It is right to look at RX 7900 XT as a capable 4K GPU. Beating RTX 3090 Ti and RX 6950 XT with comparative ease, the signs continue to look good.
Cyberpunk 2077
We stick the ray tracing wrench right into Cyberpunk 2077 by benchmarking the game with RT Ultra settings. Smashing AMD’s revised RT cores with massive load exposes that performance in this area remains an Achilles heel.
Let’s be pragmatic. AMD had very little chance of matching Nvidia RTX 40 Series’ performance; any gain over Navi 21 is looked upon favourably.
Far Cry 6
Mixing rasterisation with a modicum of RT, second-tier RX 7900 XT continues to best all cards from the last generation. That’s an achievement in itself.
Final Fantasy XIV: Endwalker
The way in which a game engine tickles the architecture predicates performance. RDNA 3 still does well here.
Forza Horizon 5
We’re not going to complain at 4K100 performance with all the bells and whistles turned on. Radeon RX 7900 XT, you’ve got some game.
Marvel’s Guardians of the Galaxy
Radeon cards have a hard time maintaining good minimum framerates in this RT-heavy title. An easy Nvidia win here.
Tom Clancy’s Rainbow Six Extraction
Another case of RX 7900 XT being better than any card from the previous generation, and managing 4K120 to boot.
Power, Temps and Noise
A 315W TGP leads to system-wide power consumption in line with an RTX 4080. No complaints here.
RX 7900 XT’s cooler isn’t as massive as XTX’s. Nevertheless, under-load temps remain excellent.
As we mentioned in the XTX review, AMD could relax the temperature target and aim for a quieter-running card, though noise levels didn’t prove distracting during real-world gaming.
Comparisons
This graph is interesting as it takes in relative performance across our games. Our calculations suggest that even with three ray tracing titles thrown into the mix, which tend to diminish performance more on Radeons than GeForces, the RX 7900 XT offers performance that’s better than any previous-gen card.
Sure, it would be nice to have it closer to RTX 4080, but we’re comparing an $899 card to a $1,199 card in that instance.
Has AMD priced this model just about right? Our value chart, which divides average framerate by each GPU’s dollar MSRP, reckons so.
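The value metric is simple enough to sketch in Python. The function names below are ours, and no benchmark results are plugged in; instead, the second helper uses only the two MSRPs to work out how much faster the dearer card must be to score the same frames per dollar.

```python
def value_score(avg_fps, msrp_usd):
    """Average framerate divided by dollar MSRP (higher is better)."""
    return avg_fps / msrp_usd


def breakeven_speedup_percent(cheaper_msrp, dearer_msrp):
    """How much faster (in per cent) the dearer card must be to match
    the cheaper card's value score."""
    return (dearer_msrp / cheaper_msrp - 1) * 100


# RX 7900 XT at $899 versus RTX 4080 at $1,199
needed = breakeven_speedup_percent(899, 1199)
print(f"RTX 4080 needs to be {needed:.1f}% faster to match on value")
```

In other words, on launch pricing alone the GeForce would have to lead by about a third across the test suite before the two cards draw level on this chart.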
It’s important to note RX 7900 XT 20GB beats out last-gen RX 6950 XT 16GB in each of the three important graphs displayed above. That’s impressive given the overheads imposed by opting for a chiplet architecture.
Conclusion
AMD has taken on significant risk with Radeon RX 7900 Series GPUs. Shifting to a chiplet-based architecture is the largest of them all, as building the necessary glue to hold constituent parts together – multiple MCDs to a central GCD – is no small engineering beer.
Looking toward the future where exorbitant silicon costs and mediocre yields on large monolithic dies instigate problems in building cost-sensitive GPUs, AMD’s taken some of the pain now… and succeeded.
Radeon RX 7900 XT 20GB is a better all-round GPU than anything from the previous generation. That’s a genuinely impressive feat for an $899 graphics card endowed with brand-new chiplet technology, so kudos to the engineering effort.
Innate rasterisation performance is strong enough to offer a great 4K60 experience in titles where quality levels are turned up to 11, without even touching the sides of framerate-boosting technology such as FSR.
All that said, AMD still remains comparatively weak in ray tracing and doesn’t have an immediate answer to Nvidia’s impressive frame-generation technology, which are two factors that can elevate a games-playing experience from good to great.
And that’s the rub. No premium graphics card is perfect. Nvidia’s forward-looking tech is only available on GPUs starting at $1,199, at least for now, so it becomes easy enough to recommend the $899 Radeon RX 7900 XT 20GB GPU to users whose budget only runs that far.
Verdict: Chiplets meet RDNA 3 in sensible fashion to create a great gaming card priced south of $900.