14 October 2023

RTX 40 series - aka "Did Nvidia jump the shark?"...

Yes, I *splashed out*...


Now that Nvidia have essentially completed their consumer RTX 40 series line-up, it's time to look back at the releases and take stock of what's happened.

We've seen the now-usual range of responses: from cynical and jaded gamers and games media, to acceptance from those coming to terms with the price-to-performance ratio Nvidia is now offering.

Bundled up in all of this has been, for me, the question of whether Nvidia pushed too far, too fast. Let's take a look...


Performance per generation relative to the 'top' card. Data taken from Jarred Walton's excellent performance charts over at Tom's Hardware...


The Backlash...


Although the prices of the RTX 20 series came as a big shock relative to their raster performance over the prior generation parts, a good portion of the user base was excited about ray tracing and what it could bring to gaming.

With the RTX 30 series, people were happy about the general uplift in performance for all three cards announced at launch - the 3070 giving the performance of the 2080 Ti for half the price was especially interesting, even if it wasn't the best price we've ever seen for that class of card...

However, with the RTX 40 series, I think we have seen almost the entire media industry* and vocal online gamers shocked and a little outraged over what Nvidia have presented to them... and I believe this reaction is entirely warranted, especially given the cash grabs lower down the stack.

I don't think we've seen such a consensus on how poor the relative value of a new generation of GPUs is over the prior one for quite a while**. However, this might have been the time that Nvidia have pushed too far... their built-up goodwill spent cheaply.
*There were some less critical looks at the various announcements, drawing conclusions which, I believe, will change once the products are in reviewers' hands.

**People really didn't like the RTX 2080 and 2080 Ti release but driver improvements have shown that those products are better than they were at release - indicating that they were probably released before they were truly done and dusted! The RTX 2080 is now posting a 25% lead over the 1080 Ti, whereas at launch it was basically neck and neck... Essentially, the RTX 20 series had improvements to the structure of each SM over the GTX 10 series but the RTX 40 series has no such improvements that could be expected to be optimised for over time in the software/driver!

A little while ago, I predicted that this generation of cards would increase in cost by 20-30% per tier. That has come to pass with a couple of these cards: the RTX 4070 is 20% more expensive than the RTX 3070 and the 4070 Ti is 33% more than its prior generational equivalent - though cards above and below those points, with names matching those in the prior generation, scale in a seemingly random manner in either direction. I might say, if I were so inclined, that I wasn't that far from the truth...
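
Those percentages fall straight out of the US launch MSRPs. Here's a minimal sketch of the calculation - the prices are quoted from memory, so treat them as approximate:

    # Launch MSRPs in USD, quoted from memory - double-check against official announcements.
    msrp = {
        "RTX 3070": 499, "RTX 4070": 599,
        "RTX 3070 Ti": 599, "RTX 4070 Ti": 799,
    }

    def tier_increase(old_card: str, new_card: str) -> float:
        """Percentage price increase of new_card over its prior-generation namesake."""
        return (msrp[new_card] - msrp[old_card]) / msrp[old_card] * 100

    print(f"4070 vs 3070:       {tier_increase('RTX 3070', 'RTX 4070'):.0f}%")        # ~20%
    print(f"4070 Ti vs 3070 Ti: {tier_increase('RTX 3070 Ti', 'RTX 4070 Ti'):.0f}%")  # ~33%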

The problem is that this like-for-like naming comparison falls apart when you look at the actual relative performance and hardware of each product! When you look at the actual hardware per SKU as a percentage of the '90 class' card in each generation, the price increases by 50-100% per card tier under the 4090. Yes, it was all fun and games when people (including myself) said that the RTX 4070 should have been a 4060 Ti, but it's worse than that on the technical front - most cards in the 40 series are two tiers down in performance whilst being a tier up in price.

Relative specs of each card to the 90 class in the generation...
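
For reference, this is roughly how that comparison is built - a small sketch using the publicly listed FP32 shader (CUDA core) counts. The figures are quoted from memory, so verify them against Nvidia's spec pages before relying on them:

    # FP32 shader (CUDA core) counts, quoted from memory - verify against official specs.
    ampere = {"RTX 3090 Ti": 10752, "RTX 3080": 8704, "RTX 3070 Ti": 6144,
              "RTX 3070": 5888, "RTX 3060 Ti": 4864, "RTX 3060": 3584}
    ada = {"RTX 4090": 16384, "RTX 4080": 9728, "RTX 4070 Ti": 7680,
           "RTX 4070": 5888, "RTX 4060 Ti": 4352, "RTX 4060": 3072}

    def relative_to_flagship(stack: dict[str, int]) -> dict[str, float]:
        """Each card's shader count as a percentage of the top card in its generation."""
        top = max(stack.values())
        return {card: round(cores / top * 100, 1) for card, cores in stack.items()}

    print(relative_to_flagship(ampere))  # the 3080 sits at ~81% of the 3090 Ti...
    print(relative_to_flagship(ada))     # ...while the 4080 sits at only ~59% of the 4090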


Sure, I've been saying that the gap in performance between the top card and the rest of the stack has been growing each generation and I was expecting that to come with an associated increase in price as well... with the lower-end cards stagnating to a terrible extent. However, the RTX 40 series release has completely upended that trend: we are in a generation where the top-end card is around 1.60x the 3090/3090 Ti in performance at 4K resolution, but the "80 class" card (in the form of the 4080) has taken a HUGE dive to around the 70 Ti class level of performance - something we haven't seen from Nvidia for the last few generations*.
*I've gone back to the GTX 9 series in this analysis and the RTX 40 series is the worst performance uplift per generation when looking at the whole stack.

The performance uplift per Nvidia-defined class of card is the lowest this generation, though it has been dropping each generation since the GTX 10 series in 2016...


Unfortunately, when you look at the RTX 40 generation in this manner (and many others have before me - I'm not really doing anything new at this point), we can see that Nvidia have "downgraded" the performance of each class of card below the RTX 4090 to one or two tiers below where it should be. It's only because of the amazing performance uplift and efficiency improvement in power usage and architecture* that we have any uplift to speak of at all!
*That ~1 GHz improvement in clock speed really appears to be doing wonders, along with that larger L2 cache!
What we can see is that if you purchase an expensive card, you really are not getting the best uplift in performance unless you shift up the resolution to 1440p or 4K. This is disappointing from my perspective because it was only two years ago that I was of the opinion that 1080p gaming should be dying by now and all cards within a generation should be targeting 4K as a standard. Now, I guess that technically this is true with the RTX 40 series but we're really scraping by, and those are average FPS numbers, meaning that the fps experienced by the player will be below that figure around half the time, with the minimum fps** lower still.
**Bearing in mind that I mean the minimum of the fps metric and not the incorrectly converted minimum frametime value!
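
To illustrate the distinction in that footnote, here's a small sketch with made-up frametime numbers (not real benchmark data), comparing the minimum of a per-second fps count against the naive conversion of the single worst frametime:

    # Illustrative frametimes in milliseconds - made-up numbers, not real benchmark data.
    frametimes_ms = [16.7] * 60 + [20.0] * 45 + [16.7] * 55 + [50.0] * 2

    # Minimum fps done "properly": count the frames completed in each one-second window,
    # then take the lowest of those per-second counts.
    fps_per_second, window_ms, frames_in_window = [], 0.0, 0
    for ft in frametimes_ms:
        window_ms += ft
        frames_in_window += 1
        if window_ms >= 1000.0:
            fps_per_second.append(frames_in_window)
            window_ms, frames_in_window = 0.0, 0
    min_fps = min(fps_per_second)

    # The "incorrect" conversion: invert the single longest frametime.
    converted_min = 1000.0 / max(frametimes_ms)

    print(f"Minimum of the per-second fps metric: {min_fps} fps")            # ~51 fps here
    print(f"1000 / max frametime:                 {converted_min:.0f} fps")  # ~20 fps - far more alarmist
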
If we switch this generational performance analysis around and instead look at where each card "should be" with regard to its resource tier, we see that this generation really could have been a good one, indeed! Yes, the RTX 4090 looks weak at both 1080p and 1440p but there are a couple of reasons for that:
  1. At 1080p the GPU will be heavily CPU-bottlenecked, meaning that users of this SKU should see further performance uplifts as they upgrade their platform over the next couple of generations.
  2. The RTX 4090 doesn't scale as well with its resources as it should. What I mean by this is that, looking at the number of shaders, it should be performing another 20-30% faster than it does in real-world scenarios (see the quick sanity check after this list). This means three things to me:
    • It's power-limited.
    • It's frequency-limited.
    • It's voltage-limited.
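
Here's that sanity check - a rough sketch comparing the 4090's raw shader advantage over the 4080 with an assumed real-world 4K performance advantage. The shader counts are quoted from memory and the performance ratio is an illustrative placeholder, not a measured figure:

    # Shader counts quoted from memory - verify against official specs.
    shaders_4090, shaders_4080 = 16384, 9728

    # Illustrative placeholder: assume the 4090 averages ~1.30x the 4080 at 4K.
    # Substitute a measured ratio from a benchmark suite you trust.
    assumed_perf_ratio = 1.30

    shader_ratio = shaders_4090 / shaders_4080            # ~1.68x the raw FP32 resources
    scaling_efficiency = assumed_perf_ratio / shader_ratio
    shortfall = shader_ratio / assumed_perf_ratio - 1     # how much is left on the table

    print(f"Shader advantage:                {shader_ratio:.2f}x")
    print(f"Performance advantage (assumed): {assumed_perf_ratio:.2f}x")
    print(f"Scaling efficiency:              {scaling_efficiency:.0%}")
    print(f"Headroom if scaling were linear: ~{shortfall:.0%}")   # lands in that 20-30% ballpark
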
I've seen various commentators doubting that the rumoured Blackwell GB102 die could possibly improve performance by any reasonable margin* but the fact is that the RTX 4090 was potentially planned to use much more power and generate more heat (hence the over-specced heatsink assemblies on the product that finally launched). Originally, everyone - seemingly including AMD, since their original projections didn't match the performance of the released product - thought that the RX 7900 XTX was going to perform way better than it actually did, and so Nvidia prepared, with their partners, a top-end GPU that would wring every last little bit of performance out of the silicon, efficiency be damned! At least, these are the rumours of the sequence of events...
*Just one example...
However, in today's world, we know that Nvidia scaled back the performance targets of at least the 4080 and 4090, and so the larger coolers were never required. Of course, it's also difficult to separate out the "heatsink inflation" caused by the AIBs from Nvidia's requirements but, given that they routinely reuse coolers across products with different power limits and thermal output, it seems fair to say that it's a combination of both...


Looking at the uplift we could have had if the FP32 resource tier had been respected this generation: it could have possibly been the best generation ever released for lower-end gamers... for higher-end gamers, the uplift is basically as expected.


The third act switcharoo...


Nvidia's behaviour in this respect is not surprising: people have been saying for a long time that Moore's Law is dead or dying, and that the free performance derived from manufacturing process improvements is decreasing with each new generation of hardware, while the cost to manufacture the chips in the first place keeps increasing.

I've also separately noted several times that the best "value" purchases are either super high-end or super low-end graphics cards (I don't mean premium models, I mean the class of card). This is because a cheap card can be replaced often and you get relatively good bang for the buck. The expensive card, while terrible "value" in the moment, will keep that value for longer and has more scaling and resources to throw at future game titles.

Now, I'm the epitome of "listen to what I say and don't watch what I do"... with the money I've spent on mid-range test systems, I could have just bought an RTX 4090 outright. But then I couldn't really do the sort of analysis I do on a 4090-based system... and it's what everyone and their dog presents when they do performance analysis anyway. So... yeah, there's that!

But here's the kicker that I think everyone is missing - and I've not seen it mentioned a single time in the mainstream press or in technical circles:

Nvidia have set up the RTX 50 series to be amazing.

Seriously, I'm not even joking!

Unfortunately, I need to admit that Nvidia are potentially being incredibly canny here: they have taken the huge performance uplift of the Ada Lovelace designs and production silicon and parlayed it into products which still (mostly) provide a small-ish performance uplift over the prior generation, whilst simultaneously resetting which silicon goes into which product. This helps them save money because they're now getting more usable chips per SKU* and each chip will cost less to produce - even going into the future.
*Because smaller dies are more effectively harvested from the silicon wafers they are produced from...
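
To give a feel for why that is, here's a minimal sketch using the standard dies-per-wafer approximation and a simple Poisson defect-yield model. The die areas and the defect density are illustrative assumptions, not Nvidia or TSMC figures:

    import math

    def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
        """Standard approximation for how many whole dies fit on a round wafer."""
        r = wafer_diameter_mm / 2
        return int(math.pi * r**2 / die_area_mm2
                   - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

    def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
        """Fraction of dies expected to be defect-free under a simple Poisson model."""
        return math.exp(-defects_per_mm2 * die_area_mm2)

    # Illustrative only: a large ~600 mm^2 die vs a mid-size ~300 mm^2 die,
    # with an assumed defect density of 0.1 defects per cm^2 (0.001 per mm^2).
    for area in (600.0, 300.0):
        candidates = dies_per_wafer(area)
        good = candidates * poisson_yield(area, 0.001)
        print(f"{area:.0f} mm^2 die: ~{candidates} candidates/wafer, ~{good:.0f} good dies/wafer")

Under those assumed numbers, halving the die area roughly triples the number of good dies per (fixed-cost) wafer - which is exactly the lever Nvidia have pulled by shifting smaller dies up the product stack.
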
Realistically, Nvidia don't have much more space to further decrease the size of the chips they assign to each class of GPU, so we are going to end up with the next generation providing the "normal" uplift in performance that we are used to seeing.

In summary - Nvidia are geniuses: They have "reset" their consumer GPU price and performance expectations with the RTX 40 series, allowing them to have a typical generational uplift with the RTX 50 series whilst saving themselves a lot of money in the process.

AMD have gone the other way - they've also moved to manufacturing smaller chips but they're focussing on combining the chips to make up their products. That adds expense and causes headaches in implementation in both hardware and software, which means that some of that cost saving is then lost.

As a result, with a monolithic RTX 50 series, Nvidia are positioned to do incredibly well and, if by some miracle AMD pulls a rabbit out of the hat in terms of performance, Nvidia can just switch back to the more performant silicon dies they would have traditionally assigned to each class of card, granting a doubling of the performance they would otherwise give the consumer. They are in a win-win situation.

Very savvy!
