17 May 2023

The Power Scaling of the RTX 4070


Literally, again, exactly what happens...

As is my wont, I have decided to poke and prod at any and all hardware within my nefarious reach. Since I've picked up the RTX 4070, why should this particular product be spared? Because it's small and cute? 

"NO!", says the wise man. "They should be subject to the woes and wiles of mortals as much as any other product!"

And so the story goes, again and again... Join me within these pages* where I will outline the limitations of the new sacrificial lamb.
*Technically, no pages are present given the format of this blog...

A Game of Two Halves...


The RTX 4070 is a surprisingly complicated release. As I noted in the prior blogpost, the range in power delivery of this card is quite large, with the cheapest cards pulling a hard limit of 200 W and the higher end models drawing up to ~240 W. The problem here is that the extra 40+ Watts only grants a 6.6% performance bonus (according to TechPowerUp, when using one of the strongest systems available in a synthetic benchmark).
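To put that in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above (200 W, ~240 W and TechPowerUp's 6.6%); the function name is my own invention, purely for illustration, and it treats the quoted limits as the actual power draw, which is an assumption.

```python
def marginal_scaling(base_w, high_w, perf_gain_pct):
    """Return (extra power in %, perf gained per 1% extra power,
    perf-per-watt of the higher-power card relative to the base card)."""
    power_ratio = high_w / base_w
    power_gain_pct = (power_ratio - 1.0) * 100.0
    gain_per_power_pct = perf_gain_pct / power_gain_pct
    # Assumption: both cards actually draw their full quoted power limit.
    relative_efficiency = (1.0 + perf_gain_pct / 100.0) / power_ratio
    return power_gain_pct, gain_per_power_pct, relative_efficiency


power_pct, per_pct, rel_eff = marginal_scaling(200.0, 240.0, 6.6)
print(f"{power_pct:.0f}% more power, {per_pct:.2f}% perf per 1% power, "
      f"{rel_eff:.2f}x the perf-per-watt")
```

In other words, under those assumptions, a 20% bigger power budget buys roughly a third of a percent of performance for every extra percent of power, and the higher end model ends up noticeably less efficient than the base card.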

In a rather unexpected turn of events, I'm going to start out with the summary of my findings, rather than the data:

My initial assessment was that the card is, in general, extremely power limited, but I am now questioning that logic. In addition to my own testing, which supports this doubt, TechPowerUp's reviews show a card that runs at a consistently high frequency, without thermal throttling. All differences (and there are a couple of fps of difference in the averages between the highest and lowest end models) are really based on sustained core frequency differences, rather than any other limit such as data or power.

As such, my current assessment is that the RTX 4070 is essentially a card pushed almost to its absolute limit: there is virtually no headroom to be had, regardless of how much power you are able to push to it. The reason for this is the hard voltage limit on the core: you just can't push it at all! In that vein, we can see that the upper limit of core frequency is set by the core voltage stability... which I have found is extremely tight around the verified core target.


The RTX 3070 allowed decent variation of voltage settings when playing with the card...


With the 30 series releases, by contrast, you were able to drop core voltage by a fair margin, maybe 100 mV, and maintain target core frequencies without issue.

Perhaps the most surprising aspect, for me, was the similarity between the behaviour of the RTX 4070 and the RX 6800 - okay, these are both chips that are manufactured on TSMC processes, but the scaling behaviour was more similar between the two than not: the 4070 would try to reach the maximum set core frequency, regardless of its ability to do so (causing crashes), just as the RX 6800 did. In comparison, the 3070 was easier to manage because it would automatically throttle core frequency instead of pushing it to the point of instability.

Maybe this observation is a coincidence...

The end conclusion, for my part, appears to be that the RTX 4070 is an incredibly finely balanced part - neither power limited nor memory bandwidth limited. The only potential limitation is the core voltage and, as I said, this appears to be either a limitation on the part of the cards... or of TSMC's process node.

So, with that out of the way, let's get to the data...


Power Scaling tests...


Metro Exodus: Enhanced Edition tasks the entire silicon, and I've found it was an excellent stability predictor during my RX 6800 scaling tests...


Going back to the mainstay of Metro Exodus: Enhanced Edition, I found that the RTX 4070 was able to clock around 180 MHz higher than at stock settings, at ~3000 MHz core frequency. Additionally, I was able to increase the memory throughput from 21 Gbps to 23 Gbps without performance regression or introducing instability.

Unfortunately, all of this effort was essentially for naught: a measly 4% gain from the stock settings in Metro Exodus. Additionally, I found (as stated above) that the core was not reliably stable when messing around with the voltage. Playing around with the RTX 3070 also gave me around a 6% increase in performance with a 10% power boost from stock (with core and memory frequency, and voltage optimisations) in Unigine Superposition... but the RTX 4070 had no further bonus to add to its 2% gain because there is no way to add more power to the silicon.

The end result is that both parts are better suited to undervolting/underclocking/power-limiting than trying to push past their stock performance.

At 80% power limit, with the overclock, we're getting near stock performance...


While it is unable to go past the 100% power limit, it is clear that, with a slight core and memory frequency overclock, the RTX 4070 is able to maintain stock performance at 80% of the power!
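As a quick sanity check on what that means for efficiency, here is a minimal Python sketch. It assumes the card's real power draw tracks the power limit slider (which I have not verified to the watt), and the helper name is hypothetical.

```python
def efficiency_vs_stock(relative_perf, power_limit_fraction):
    """Perf-per-watt relative to stock (1.0 = same as stock).

    Assumption: actual power draw scales directly with the power limit.
    """
    return relative_perf / power_limit_fraction


# Stock-level performance (1.0) at an 80% power limit.
gain = efficiency_vs_stock(1.0, 0.80)
print(f"{(gain - 1.0) * 100:.0f}% better performance per watt than stock")
```

So holding stock performance at 80% of the power works out to roughly a quarter more performance per watt, under that assumption.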


One important item to note is that the RTX 4070's performance drops off at a faster rate with lower power limits, compared to the RTX 3070. The latter card is able to hold 10% more performance at 1080p and a 50% power limit when compared to the RTX 4070.

As for the RTX 3070, I retested that power curve with the knowledge I've gained from the RX 6800 and RTX 4070 and it appears that the card has had around a 2% performance increase, at stock (and also when overclocked) purely through driver updates since my original testing.

Conversely, Unigine Superposition shows that the RTX 4070 scales less well when not all of the silicon is put to the test - despite it being a ray tracing benchmark, the test is software-only, meaning that the advantages of the ray tracing hardware never come into play. The result of this can be observed in the flat profile that the scaling of this test generates.


Superposition shows the same trend in power/performance/overclocking...



While more linear in power usage w.r.t. performance, Superposition scaling shows the limitations of a pseudo synthetic test...



Conclusion...


This is a rather short and sweet post. 

The RTX 4070 doesn't have a lot of upward mobility within its sheathed design. However, it is a very power efficient part that is readily able to reduce power usage by 20%, whilst maintaining the same performance. This is great from a user perspective, especially in the current period where electricity prices are at a premium in certain parts of the world.

Despite similar performance scaling curves to the RTX 3070, the RTX 4070 shows us that it is capable of a 28% performance increase with 20 W less power usage, both at stock and when optimised. While the 30 series part is clearly held back by a lack of available power, the 40 series part has no clear road to gaining more performance, other than new voltage regulators on the circuit board - something that is not so simple to pull off.
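For the curious, the generational efficiency jump can be sketched with a couple of lines of Python. The 200 W figure is the hard limit quoted earlier for the cheapest RTX 4070 cards, and the RTX 3070 figure is simply that plus the 20 W difference; both are assumptions about power draw rather than measurements, and the function name is made up for this example.

```python
def perf_per_watt_ratio(perf_ratio, new_w, old_w):
    """How many times more performance per watt the new card delivers."""
    return perf_ratio * old_w / new_w


RTX_4070_W = 200.0               # assumed: the 200 W hard limit quoted earlier
RTX_3070_W = RTX_4070_W + 20.0   # assumed: "20 W less power usage" implies 220 W

ratio = perf_per_watt_ratio(1.28, RTX_4070_W, RTX_3070_W)
print(f"{(ratio - 1.0) * 100:.0f}% more performance per watt")
```

That puts the 4070 at somewhere around 40% more performance per watt than the 3070, if those power figures hold.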

What is clear, though, is that the 3070 has been very slightly improved by the Nvidia drivers since launch - meaning that any disparities between the launch performance of that card and the launch version of the 4070 have been very slightly closed. As such, I fully expect the performance of the RTX 4070 to extend beyond the current 28% limit at 1080p to around 30 - 32%, over time.

Yes, this isn't an exciting result, but I am certain the performance figures will be improved on systems with more powerful processors that can better feed the GPU hardware... However, as it stands, in the mid-range, users are looking at a decent, if unexciting, uplift in real-world computational performance.
