AMD Radeon HD 7970 Review: Meet Tahiti XT from the Southern Islands

**siopao1984** · 12-22-2011, 03:11 PM

After a year of less-than-stellar CPU launches (Bulldozer) and the fallout from massive company-wide layoffs and manufacturing delays, AMD and its fans finally have something to cheer about this holiday season. Built on the radical 28nm 'Graphics Core Next' architecture along with numerous other innovations, the US$549 Radeon HD 7970 flexes its considerable muscles in the high-end enthusiast GPU market. In the first of a three-part review, supercomputing expert and in-house Serbian Nebojsa Novakovic gives his take on what the new architecture will bring.

When AMD unveiled GCN - Graphics Core Next - its new-generation graphics processor architecture, as part of FSA - Fusion System Architecture - at last June's Fusion Developer Summit in Seattle, it was quite a surprise: the next-in-line GPU was to be transformed into a general-purpose compute coprocessor, with far more flexibility and aimed at a wider range of applications well beyond graphics - and, not to forget, easier memory space sharing with the CPU. Half a year later, the first implementation of this new architecture already sees the light of day.

The Radeon HD7970 is the fastest single-GPU card today in most 3-D graphics runs, but its compute power impresses, too: 3.8 TFLOPs peak single precision FP and, more importantly, 947 GFLOPs double precision FP at the default 925 MHz clock. Since the card overclocks easily by another 10% or so without any voltage changes, you can, for the first time, easily have a true (peak) Teraflop DP FP engine in the PC. The 384-bit GDDR5 memory path with 3 GB RAM gives some 260 GB/s of bandwidth to feed all that performance, with enough local memory for larger datasets than before. And, in those moments when the GPU does nothing, the idle power drops to below 3 Watts - not bad at all.

How did it achieve all this performance? The first implementation of GCN is based on the same 'Compute Units' that we described post-AMD Summit in June, 32 of them in this chip. Each compute unit, able to execute code from multiple kernels at once, is a processor by itself, in a sense a full-fledged 'core', the way we describe cores in general purpose processors. So, each compute unit has a vector and standard scalar processor core, plus a texture block with filtering and fetch units, all this with local registers, 64 KB data share memory and 16 KB L1 cache with 64 bytes/clock bandwidth. The branching and scheduling units complete the picture.
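The peak figures quoted above can be reproduced with a quick back-of-the-envelope calculation. This is a rough sketch only, assuming 64 ALU lanes per compute unit, one fused multiply-add (2 FLOPs) per lane per clock, and double precision at a quarter of the single precision rate. None of these three numbers is stated in the review itself, and the names below are made up for illustration:

```python
# Back-of-the-envelope peak throughput for the HD 7970.
COMPUTE_UNITS = 32         # from the review
LANES_PER_CU = 64          # assumption, not stated in the review
FLOPS_PER_LANE_CLOCK = 2   # assumption: one fused multiply-add counts as 2 FLOPs
DP_RATE = 0.25             # assumption: DP runs at 1/4 of the SP rate


def peak_gflops(clock_mhz: float, double_precision: bool = False) -> float:
    """Return the theoretical peak GFLOPs at the given core clock."""
    sp = COMPUTE_UNITS * LANES_PER_CU * FLOPS_PER_LANE_CLOCK * clock_mhz / 1000.0
    return sp * DP_RATE if double_precision else sp


if __name__ == "__main__":
    # Stock clock, then a ~10% overclock to show where peak DP crosses 1 TFLOP.
    for clock in (925.0, 925.0 * 1.10):
        print(f"{clock:.0f} MHz: SP {peak_gflops(clock):.0f} GFLOPs, "
              f"DP {peak_gflops(clock, True):.0f} GFLOPs")
```

Under those assumptions the stock 925 MHz clock gives about 3.8 TFLOPs SP and 947 GFLOPs DP, and a 10% overclock is enough to push peak DP past the 1 TFLOP mark.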

The cache hierarchy is intriguing here: each set of four compute units shares an extra 16 KB scalar data cache plus 32 KB instruction cache, backed by a common L2 cache, which totals 768 KB in the HD7970 chip across all compute units. This L2 then interfaces to the memory controller. A Global Data Share unit is there to enable sync among all compute units at the L1 cache level. This is a far more complex cache architecture than what you'll see in today's general purpose CPUs, by the way.

In addition to compute units, there are still dedicated processing elements for 3-D graphics. Dual Geometry Units, as well as eight render back ends, plus the usual video acceleration hardware, complete the chip here. And yes, the system interface is PCIe x16 v3, finally.

Now, how does GCN in HD7970 achieve, as AMD claims, much better actual use of the underlying resources compared to the previous generation? First, you'll notice that the chip architecture looks far more symmetric and simpler to understand than the GPUs of the past, which should translate to programs being able to use it more efficiently. Also, the separate dual asynchronous compute engines help schedule multiple tasks in parallel with the graphics command processor, while dual direct memory access (DMA) engines help in fetching and sending data fast, able to saturate 16 GB/s over PCIe 3 to the system itself.
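For reference, the 16 GB/s figure is roughly what the PCIe 3.0 signalling rate works out to per direction on a x16 link. Here is a small sketch, assuming the commonly cited link parameters (8 GT/s per lane with 128b/130b encoding for PCIe 3.0, 5 GT/s with 8b/10b for PCIe 2.0). These are not from the review, and packet overhead is ignored:

```python
def pcie_bandwidth_gbs(gt_per_s: float, lanes: int,
                       payload_bits: int, encoded_bits: int) -> float:
    """Raw one-direction link bandwidth in GB/s, before protocol overhead.

    gt_per_s      -- transfer rate per lane in GT/s (assumed link parameter)
    payload_bits  -- useful bits per encoded symbol group (e.g. 128)
    encoded_bits  -- bits actually sent on the wire for that group (e.g. 130)
    """
    return gt_per_s * lanes * payload_bits / encoded_bits / 8.0


if __name__ == "__main__":
    gen3 = pcie_bandwidth_gbs(8.0, 16, 128, 130)   # PCIe 3.0 x16
    gen2 = pcie_bandwidth_gbs(5.0, 16, 8, 10)      # PCIe 2.0 x16
    print(f"PCIe 3.0 x16: {gen3:.2f} GB/s per direction")
    print(f"PCIe 2.0 x16: {gen2:.2f} GB/s per direction")
```

That comes out at just under 16 GB/s for PCIe 3.0 x16, about double what a PCIe 2.0 x16 slot offers.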

In some tests, like AES256, the resulting improvements in actual performance are over four times, but even many others show speed jumps higher than the theoretical FP speed boost from the 6970 to the 7970. Two times is a regular occurrence in many GPU compute benchmarks here. And, very critical for compute usage, the FP here is fully IEEE 754 compliant while, if aiming for workstation or compute server use, you have ECC protection all the way, for both DRAM and SRAM memory.

The new architecture should finally enable easy GPU multitasking, not just many processes on one GPU, but also one task being spread across multiple GPUs, something that HPC users would welcome a lot - and high end PC users as well, since, say, 4 GPUs in Quad CrossFire could be used for more than just gaming FPS.

In summary, GCN in AMD Radeon HD7970 went a step further than Nvidia did in its current 'Fermi' GeForce generation in getting the GPU to become a more versatile system compute coprocessor. There are still further steps to take in getting the GPU even closer to the CPU, including the memory sharing and interconnect; however, the improvements seen in this brand new chip should be a good note to both Nvidia and Intel, the latter with its Knights Corner accelerators, on the way forward.

**siopao1984** · 12-22-2011, 03:14 PM

The dashing looking Radeon HD 7970 measures 11.5 inches long (29.21cm), which should fit in most modern day gaming cases

PCI-SIG approved 8+6 pin power connectors for up to 300W of power consumption

No PCB Backplate on this reference card - hopefully the actual retail units will have them as they do provide some shielding against accidental metal contact short circuits

More than a dozen screws binding the card to the heatsink - which was a nuisance when we disassembled it

3-way Crossfire connectors are still present, as is the dual BIOS switch for flash experiments

Dual-link DVI, HDMI and Mini-DP ports for up to 6 displays. The half-height exhaust grill design from 58xx/69xx cards is gone and replaced with a full height panel, which should help reduce air turbulence.
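The 300W figure for the 8+6 pin layout mentioned earlier comes from adding up the usual PCIe power limits. A quick sketch, assuming the commonly cited limits of 75W from the x16 slot, 75W per 6-pin and 150W per 8-pin connector (these per-source numbers are not given in the review):

```python
# Assumed per-source power limits in watts (not stated in the review).
SLOT_W = 75
SIX_PIN_W = 75
EIGHT_PIN_W = 150


def board_power_limit(six_pin: int = 0, eight_pin: int = 0) -> int:
    """Maximum in-spec board power for a given connector layout, in watts."""
    return SLOT_W + six_pin * SIX_PIN_W + eight_pin * EIGHT_PIN_W


if __name__ == "__main__":
    print("8+6 pin:", board_power_limit(six_pin=1, eight_pin=1), "W")
    print("6+6 pin:", board_power_limit(six_pin=2), "W")
```

With one of each connector the budget lands at 300W, versus 225W for a 6+6 pin layout.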

**siopao1984** · 12-22-2011, 03:16 PM

For your naked viewing pleasure

28nm 'Tahiti XT' Die shot - we didn't see any markings after removing the thermal paste

A metal shim is back with the actual die being slightly shorter - so heatsink compatibility with older generation cards might be a problem

12 x Hynix GDDR5 256MB memory chips clocked at 1375MHz are used in a 384bit ring bus configuration, resulting in 264GB/s of memory bandwidth
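The bandwidth figure follows from the bus width and memory clock. A rough sketch, assuming GDDR5 moves four bits per pin per memory clock (an assumption not spelled out in the review; the function names are illustrative only):

```python
def memory_bandwidth_gbs(bus_width_bits: int, mem_clock_mhz: float,
                         transfers_per_clock: int = 4) -> float:
    """Peak memory bandwidth in GB/s.

    transfers_per_clock=4 is the assumed GDDR5 data rate per pin.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_clock_mhz * transfers_per_clock / 1000.0


def total_capacity_gb(chips: int, mb_per_chip: int) -> float:
    """Total frame buffer size in GB for a given chip count and density."""
    return chips * mb_per_chip / 1024


if __name__ == "__main__":
    print(f"Stock 1375 MHz: {memory_bandwidth_gbs(384, 1375):.0f} GB/s")
    print(f"Capacity: {total_capacity_gb(12, 256):.0f} GB")
```

The same function gives 288 GB/s for a 1500MHz memory clock, which shows how bandwidth scales with a memory overclock.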

Power Delivery Design

The popular CHiL CHL8228G (6+2 PWM Controller) makes its return on the 7970, which should be good news to software voltage modders

**siopao1984** · 12-22-2011, 03:18 PM

The blower fan is now significantly larger than previous generation cards, with higher rated CFM and improved blade design for better cooling and acoustics

Tearing off the plastic shroud

6th generation vapor chamber design - with thermal pads for the RAM chips and the bump for core contact

Can you count the number of fins?

The blower fan is rated at 1.7A @ 12V DC and has 2 ball bearings

**siopao1984** · 12-22-2011, 03:21 PM

Overclocking

The overclocking potential of the Radeon HD 7970 has been much vaunted in the recent leaks and it certainly didn't disappoint. Even without any software or hardware voltage mods, we managed to bring the GPU core from the default 925MHz up to the maximum 1125MHz that Catalyst Control Center (CCC) would allow. We also turned memory up to 1500MHz and Power Control to +20% to prevent throttling.

Testing with FurMark - with the fan left at auto, it was still spinning silently and temperatures maxed out at a comfortable 86 degrees Celsius

The updated GPU-Z 0.5.7 allows you to read card information off the 7970's SMBIOS

At every increment along the way, we went the extra mile to ensure that performance was actually scaling (and not bs throttling). With clocks like these on stock voltages and cooling, you'll wonder how much higher it can go when software or hardware voltage modifications together with better cooling are applied to this card. Looks like the 28nm process die shrink did pay off...
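To put numbers on the overclock, and on what "actually scaling" means, here is a minimal sketch. The clock values are the ones quoted above. The benchmark scores passed to `scaling_efficiency` are hypothetical placeholders and not results from this review, and both function names are made up for illustration:

```python
def percent_gain(base: float, new: float) -> float:
    """Relative increase from base to new, in percent."""
    return (new - base) / base * 100.0


def scaling_efficiency(base_clock: float, oc_clock: float,
                       base_score: float, oc_score: float) -> float:
    """Fraction of the clock increase that shows up in the benchmark score.

    Close to 1.0 means near-linear scaling; a value well below 1.0 hints at
    throttling or some other bottleneck.
    """
    return percent_gain(base_score, oc_score) / percent_gain(base_clock, oc_clock)


if __name__ == "__main__":
    print(f"Core:   +{percent_gain(925, 1125):.1f}%")    # 925 -> 1125 MHz
    print(f"Memory: +{percent_gain(1375, 1500):.1f}%")   # 1375 -> 1500 MHz
    # Hypothetical placeholder scores, only to show how the check works.
    print(f"Scaling: {scaling_efficiency(925, 1125, 1000.0, 1180.0):.2f}")
```

The core overclock works out to roughly +21.6% and the memory overclock to about +9.1%. If a benchmark score rises by much less than the core clock does, the card is likely throttling or limited elsewhere.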

**siopao1984** · 12-22-2011, 03:22 PM

Test Setup

CPU: Intel Core i7-3960X @ 4679MHz 1.48v
Motherboard: ASUS Rampage IV Extreme (latest 1004 bios with CPU Microcode/PCIe 3.0 fixes)
Memory: G.Skill RipjawsX 2133MHz DDR3 16GB Kit
Power Supply: Cooler Master Silent M Pro 1000W
Storage: Intel 520 series 250GB SSD
Graphics Cards / Drivers:
AMD Reference Radeon HD 7970 3GB (Catalyst 11.12 Dec 20 Build)
ASUS Radeon HD 6970 DirectCU II 2GB (Catalyst 11.12 WHQL)
AMD Reference Radeon HD 6990 4GB (Catalyst 11.12 WHQL)
MSI N580GTX Lightning Xtreme Edition 3GB (Nvidia 290.36)

**siopao1984** · 12-22-2011, 03:25 PM

3DMark11 Performance Preset (720p)

Right off the mark, the 7970 demonstrates its architectural superiority in this DX11 synthetic benchmark, with a huge ~40% lead over its older sibling, the 6970, and about a 10% lead over the GTX580. Once overclocked, it threatens to come close to the multi-GPU leaders.

3DMark11 Extreme Preset (1080p)

Pretty much the same story here once the resolution is turned up.

3DMark06 (DX9)

With today's cards, 3DMark06 is more sensitive to CPU/memory bandwidth and driver optimizations. Nothing fascinating here, but the 7970 is finally able to keep up with the Nvidia cards, which are traditionally the world record holders in this benchmark.

Unigine Heaven Benchmark 2.5 (DX11)

This benchmark is one of the most punishing, featuring heavy use of antialiasing, anisotropic filtering, DX11 tessellation and ambient occlusion - NVIDIA's traditional strongholds. With the new GCN arch and improved tessellation routines, the 7970 is 23% faster than last year's GTX580 and leaves the 6970 far behind (54% faster).
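The two percentages above also imply how the GTX580 and 6970 compare with each other, since both are measured against the same 7970 result. A small sketch of that arithmetic (the function name is made up for illustration):

```python
def implied_lead(lead_over_a_pct: float, lead_over_b_pct: float) -> float:
    """If card X is lead_over_a_pct faster than A and lead_over_b_pct faster
    than B, return how much faster A is than B, in percent."""
    x_vs_a = 1.0 + lead_over_a_pct / 100.0
    x_vs_b = 1.0 + lead_over_b_pct / 100.0
    # A/B = (X/B) / (X/A)
    return (x_vs_b / x_vs_a - 1.0) * 100.0


if __name__ == "__main__":
    # 7970 is 23% faster than the GTX580 and 54% faster than the 6970.
    print(f"GTX580 over 6970: +{implied_lead(23, 54):.1f}%")
```

In other words, the quoted leads put the GTX580 roughly 25% ahead of the 6970 in this test. Percentage leads should be divided rather than subtracted when comparing across cards.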

**siopao1984** · 12-22-2011, 03:28 PM

Battlefield 3 (DX11) 1080p Ultra Preset

Going Hunting Single Player Level Fraps Runthrough

BF3 is one of the games we like to play here at VR-Zone for its stunning visuals and engaging gameplay. It is also one of the few titles that squeezes every drop of performance out of your system (it pegs the GPU at 99% in-game), and developer DICE has worked closely with both AMD and NVIDIA to make sure that their hardware is optimized for the game. As expected, the HD7970 has almost a 50% lead over the HD6970 and 15% over the GTX580. Once overclocked, it predictably pulls away. We would like to add to this section soon with Multi-monitor Eyefinity and Stereo 3D benchmarks.

Batman: Arkham City (DX11) with Patch 1

Batman Arkham City is one of the newer Unreal Engine 3 based titles under NVIDIA's "The Way It's Meant To Be Played" programme, featuring proprietary PhysX support for bullet effects and other clutter.

Crysis 2 (DX11) 1080p Ultra Downtown Benchmark

Crysis 2 is another NVIDIA-influenced title, one that generated some controversy over its excessive use of tessellation everywhere when the delayed DX11 patch came out. Still, with the help of the new GCN shaders, the 7970 finally puts NVIDIA in its place, even entering multi-GPU territory when overclocked.

Alien vs Predator (DX11)

Resolution: 1920 x 1080

Texture Quality: 3
Shadow Quality: 3
Anisotropic Filtering: 16
SSAO: ON
Vertical Sync: OFF
DX11 Tessellation: ON
DX11 Advanced Shadows: ON
DX11 MSAA Samples: 4

Street Fighter 4 (DX9) 1920x1080 C16xQAA

Street Fighter 4 may have been around for quite some time now, but it is still surprisingly sensitive to GPU power scaling and it shows. NVIDIA seems to have put more time into optimizing its drivers for this game and takes the performance lead over the AMD cards.

**siopao1984** · 12-22-2011, 03:30 PM

ComputeMark (DX11 Compute Shaders)

Things get very interesting here - the 7970 completely thrashes the competition, even the multi-GPU cards!

Power Consumption Tests

Predictably, thanks to the 28nm process shrink, AMD has cut power consumption and taken efficiency to new heights.

**siopao1984** · 12-22-2011, 03:32 PM

Conclusion

Priced at an elitist US$549, the AMD Radeon HD 7970 is positioned to compete against Nvidia's 1 year old GTX580 3GB, which it bests considerably in most tests. We also saw the remarkable overclock frequency headroom and compute power that the 28nm 'Graphics Core Next' architecture delivered in our review, which is a promising baseline for AMD to build on for the next few product refreshes. Monthly driver updates over the next few months should also help increase the performance delta, especially in newly released titles like Batman: Arkham City/Skyrim and Crossfire scenarios.

What we really need now is some healthy competition from NVIDIA to bring prices down - their mysterious high end GK104/GK100 Kepler part isn't scheduled to come out anytime soon...

PROS
Fastest Single GPU on the planet now
Excellent Overclocking Headroom
Acoustically tuned vapor chamber heatsink design
CHiL CHL8228G PWM / Great power consumption
Fantastic GPU Compute performance

CONS
Expensive
Questionable availability

Editor's note: We didn't think it was important, but Crossfire, Eyefinity 2.0 and SteadyVideo 2.0 testing will follow soon...
