22 November 2020

Analyse This: Next Gen PC Gaming 2...

I think I should probably extend that L1 cache, I think I got it in the wrong orientation...

Last time, I looked at Nvidia's Ampere architecture from their release presentation. Then I looked at the potential performance of the RDNA 2 cards from AMD. Now that the RX 6000 series has been revealed and, sort of, launched, and we have a real idea of what sort of performance the cards have (at least until drivers and game code begin to take better advantage of the architecture when utilising ray tracing), I thought I'd cover what I believe to be the PC equivalents of each next gen console. This post was inspired by PC World's video series covering those builds, but it's also a follow-on from my prior posts covering which parts you should buy as a minimum for PC gaming going forward.


Graphically speaking... (I made this joke before)


Now, let me get this out of the way - I think that the builds from PC World are actually over-specced. So let's jump into it:

I had previously come up with the following components for a PC which will still be able to game at 30 fps*, with RT enabled, at 1080p in 2025:
  • i7-10700K
  • 16 GB DDR4 RAM
  • RTX 2080 Ti
*All of my analysis was performed on "recommended" game specifications which guaranteed 30 fps.
In comparison, PC World have specced up the following:

PS5
  • R7 3700X
  • 16 GB DDR4-3600 RAM
  • RTX 2080 Ti
Xbox Series S
  • R7 3700X
  • 16 GB DDR4-3600 RAM
  • RX 5700 XT*
*They specify that they'd switch out the 5700 XT for a 6700 XT for this build when they're available...
We're still awaiting a Series X video, but I'd be surprised to see something which wasn't an RTX 3080 or an RX 6800 in that build.

Now, let me explain what I mean here - when I say "over-specced", I'm specifically speaking about actual graphical performance. Let me put it this way: a PS5 has 36 CUs @ 2.23 GHz and the XSX has 52 CUs @ 1.825 GHz, while the XSS has 20 CUs @ 1.565 GHz. From what we know at this point in time, none of these pieces of hardware has any Infinity Cache or boost clocks like the discrete graphics cards have (though something like the Infinity Cache is hinted at in The Road to PS5 presentation).

Unfortunately, it seems that the RT performance of RDNA 2 is really rather poor. I had thought that an XSX would be around an RTX 2080 Ti with 52 CUs but that appears to have been rather optimistic!

Breaking it down, the XSX and PS5 are mostly targeting 1440p - 1800p native resolutions when ray tracing is activated. When not using RT, they are targeting native 4K (where possible). In comparison, the XSS is targeting 1080p without RT and lower resolutions with RT enabled. Looking at the PC sphere, the RX 6800 excels at 4K and 1440p at around an RTX 2080 Ti level of performance at high/ultra without RT enabled, but is limited to around RTX 2080 / RTX 2080 Super ray tracing performance at high settings and 1440p.

Putting that into perspective, the RX 6800 has 60 CUs at a game clock of 1.815 GHz and boost clock of 2.105 GHz (which I've read is quite easy to sustain), meaning that both the XSX and PS5 do not have the performance of an RTX 2080 Ti in anyone's wildest dreams, let alone when performing ray tracing. Similarly, even though there's no RT acceleration on an RX 5700 XT, that card has 40 CUs @ a game clock of 1.755 GHz and a boost clock of 1.905 GHz - WAY faster and wider than the XSS.
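To make that concrete, here's a minimal sketch of the standard peak-FP32 arithmetic for RDNA-based GPUs (CUs × 64 shaders × 2 ops per clock × clock speed), using the clocks quoted above - boost clocks for the discrete cards, since the consoles have no boost:

```python
# Back-of-the-envelope FP32 throughput for these RDNA-based GPUs:
# TFLOPS = CUs x 64 shaders per CU x 2 ops per clock (FMA) x clock (GHz) / 1000
gpus = {
    "PS5":        (36, 2.230),
    "XSX":        (52, 1.825),
    "XSS":        (20, 1.565),
    "RX 6800":    (60, 2.105),   # boost clock
    "RX 5700 XT": (40, 1.905),   # boost clock
}

for name, (cus, clock_ghz) in gpus.items():
    tflops = cus * 64 * 2 * clock_ghz / 1000
    print(f"{name:>10}: {cus} CU @ {clock_ghz:.3f} GHz -> {tflops:5.2f} TFLOPS")
```

That puts the RX 6800 at ~16.2 TFLOPS against the XSX's ~12.1 and the PS5's ~10.3, and the 5700 XT at ~9.8 TFLOPS against the XSS's ~4.0 - which is exactly the gap I'm pointing at above.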

Doing a simple ratio calculation, the XSX therefore has RT performance at 1440p of around an RTX 2070 Super. In the same way, the PS5 has an RT performance of between an RTX 2060 Super and an RTX 2070. This comparison breaks down, entirely, for the Series S because I calculate RT performance below an RTX 2060... this is, of course, ignoring all DLSS shenanigans.
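Here's that ratio calculation as a sketch: each console's CU × clock product as a fraction of the RX 6800's, which the results above peg at roughly RTX 2080 / 2080 Super RT performance at 1440p. The mapping onto Nvidia tiers is my interpolation from those numbers, not a measurement:

```python
# Scale each console's CU x clock product against the RX 6800, whose RT
# performance sits at roughly RTX 2080 / 2080 Super level at 1440p.
rx6800 = 60 * 2.105

consoles = {"XSX": 52 * 1.825, "PS5": 36 * 2.230, "XSS": 20 * 1.565}

for name, cu_clock in consoles.items():
    print(f"{name}: {cu_clock / rx6800:.0%} of the RX 6800's CU throughput")
# XSX ~75% -> around an RTX 2070 Super; PS5 ~64% -> RTX 2060 Super to 2070;
# XSS ~25% -> below anything in the RTX 20-series stack.
```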


Would this be the best price/performance card if it held to RRP? I think so... 

Further to this, there is no sign of any "IPC" improvement for the RDNA 2 architecture compared to RDNA 1 per CU/clock. A 6800 XT is around 1.73x the actual performance of an RX 5700 XT in games (averaging the Eurogamer results), which equates to 92-96% of the expected performance when just going by the increase in the number of CUs (72 vs 40). Meanwhile, the 6900 XT is expected to manage 1.92x the performance with similar efficiency. 96% out of the theoretical 100% scaling is actually a very good number, as it's been stated many times by various technical sources (most notably Mark Cerny in The Road to PS5 presentation) that widening the GPU pipeline is more difficult in terms of utilisation than increasing the clock speed for fewer resources.
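As a quick sanity check on that scaling arithmetic, going purely by CU count (72 and 80 CUs against the 5700 XT's 40, with the 1.73x/1.92x uplifts quoted above):

```python
# Scaling efficiency = measured uplift / CU-count uplift, relative to the
# 40-CU RX 5700 XT. Uplift figures are the averaged results quoted above.
baseline_cus = 40

cards = {
    "RX 6800 XT": (72, 1.73),   # (CUs, averaged measured uplift)
    "RX 6900 XT": (80, 1.92),   # (CUs, expected uplift)
}

for name, (cus, uplift) in cards.items():
    efficiency = uplift / (cus / baseline_cus)
    print(f"{name}: {efficiency:.0%} of the uplift implied by CU count alone")
# Both land at ~96% - near-linear scaling from a much wider GPU.
```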

However, with that said, the widened architecture of the XSX grants it access to many more ray acceleration units, meaning that the comparison to Nvidia hardware is more difficult.

Without RT, an RX 5500 XT 8 GB (22 CUs @ a 1.717 GHz game clock / 1.845 GHz boost clock) is actually more performant than the XSS, meaning that we should expect bad things for that console at 1080p.

In light of this, while I am fine with the CPU of the PC World builds, the graphics cards that I would use when utilising ray tracing in next gen games are as follows:

  • Xbox Series X: RX 6700 XT (40 CUs) / RTX 2060 Super
  • PS5: RX 6700 (36 CUs) / RTX 2060 Super
  • Xbox Series S: RX 6500 XT / (no RTX equivalent - the performance is too low)

If you want to put that into non-RT performance, then you're looking at something like an RTX 2070 / RX 5700 XT for the PS5 and an RTX 2080 Super for the Series X - ignoring VRS and the other new-gen DX12 feature sets. For the Series S, we're looking at an issue: no prior gen or current gen cards match the performance target of this console. I also doubt that an RTX 3050 (assuming RT cores are included) or an RX 6500 will stoop as low as the graphics core included in that console.

Given that there's no evidence of IPC improvement for gen-on-gen RDNA 1 to RDNA 2 compute units, I really do believe that this console SKU will hold back the next generation of consoles immensely.


SSD talk...


There was a recent article from PC Gamer speaking about SSDs and how SATA SSDs are perfectly fine for gaming. I've said many times before that fast SSDs are an optimisation around restricted console architectures - and this applies even to the use of HDDs on PC. However, I also think that this sort of analysis is completely misplaced, as are the SSDs used in the builds from PC World. I know I keep saying it time and time again, but the sequential throughput of SSDs is irrelevant.

Is sequential read speed that important for gaming? I think not...


If sequential throughput were the important metric, then a SATA SSD would not be enough to get the sorts of loading-time improvements we see for gaming on the consoles and on PC: SATA drives only provide a 4-5x improvement over HDDs, so they should not show "equivalent" loading times to a PCIe-interfaced SSD with transfers multiple GB/s faster - and that's speaking about a SATA SSD that was released 8 years ago!

The Samsung 840 Pro used in the PC Gamer article is very old, and you can actually see that in the QD1T1 4K and QD32T1 results from CrystalDiskMark in the table above. Are we really saying an early-gen SATA SSD released in 2012 is equivalent to modern PCIe gen 4 SSDs? Clearly it's not!

So what's going on?

Well, a lot has been said and made of the fact that games are not designed to take advantage of the new interfaces. That's certainly true to an extent, but it's not the whole picture. As I mentioned above (again), random IOPS are a better indicator of performance for gaming - and this is where the 840 Pro is similar to modern SSDs in terms of transfer speeds. That comes down largely to the controller on the SSD itself: higher quality controllers have better random access speeds than cheaper ones, and you can see this in every "Pro" variant of a drive in the table above.
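To illustrate why, here's a rough sketch of a mixed game-load workload. Every drive figure below is an assumed, ballpark number for the sake of the illustration - the shape of the result is the point, not the exact values:

```python
# Illustrative only: how long a mixed load workload takes on different
# drives. All throughput/IOPS figures are assumed ballpark values.
drives = {
    #                (sequential MB/s, 4 KiB random reads/s at low queue depth)
    "HDD":          (150,     200),
    "SATA SSD":     (550,  10_000),
    "PCIe 4.0 SSD": (5000, 15_000),
}

seq_mb = 1024    # 1 GB of large, packed assets read sequentially
rand_mb = 512    # 0.5 GB of scattered data read as 4 KiB blocks

for name, (seq_mbps, iops) in drives.items():
    t_seq = seq_mb / seq_mbps
    t_rand = (rand_mb * 1024 / 4) / iops    # number of 4 KiB reads / IOPS
    print(f"{name:>12}: {t_seq:5.1f}s sequential + {t_rand:6.1f}s random")
# The sequential gap between SATA and PCIe is ~9x, but the random portion
# barely moves between them - while the HDD falls off a cliff.
```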

Finally, there is one more aspect that everyone ignores. Processing time.

Seriously, do you think that you need gigabytes of data loaded into memory in order to get to the menu screen of any game? For the vast majority of games (the XCOM series notwithstanding), you don't need much data to get to the menu. However, you do need to actually run the executable, pull all the disparate libraries into memory and get the game engine running on the system and, quite frankly, THIS time has been relatively static for a number of years because it's linked to single-threaded performance.

In fact, now that SSDs are in play and head-seek latency and the other factors that physically slow down access to storage are essentially eliminated, the time to process the binary code and load in those dependent libraries that make the game run is a significant fraction of the time to get into a game. That's why you don't see a big initial-load performance difference between SATA SSDs and PCIe drives - and this is ignoring all the jostling and posturing of various companies getting their names in front of the player (who, most of the time, doesn't even care) before they reach the menu screen.

Every. Single. Time.
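If you want to see this split on your own machine, here's a crude toy sketch: the imported modules are stand-ins for a game engine's dependency pile, and fake_asset.bin is a file the script generates itself - everything here is hypothetical, not a real game:

```python
# Crude illustration: time spent pulling code/libraries into memory versus
# reading raw data off the drive. The imported modules stand in for a game
# engine's dependencies; the asset file is generated on the spot.
import os, time

with open("fake_asset.bin", "wb") as f:
    f.write(os.urandom(256 * 1024 * 1024))       # ~256 MB stand-in asset

t0 = time.perf_counter()
import json, sqlite3, decimal, xml.dom.minidom   # single-threaded code loading
t1 = time.perf_counter()

with open("fake_asset.bin", "rb") as f:
    _ = f.read()   # raw bulk I/O (likely served from the OS cache here,
                   # which only strengthens the point: even "free" I/O
                   # doesn't remove the code-loading time)
t2 = time.perf_counter()

print(f"code/libraries: {t1 - t0:.3f}s | raw read: {t2 - t1:.3f}s")
```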

That's why we're not seeing huge initial-load improvements on the new consoles. Yes, an i7-10700K and an R5 5600X are very fast processors for single-threaded applications and will load things quite a bit more quickly than an R7 3700X... but it's the 3700X's class of processing performance that's in a new gen console (well, actually, it's more like an R7 4800H, which might fall slightly behind, but it's pretty close). Let me put it another way - single-threaded performance has not improved much over the last 10 years. We're looking at around a 79% performance improvement from an i7-930 to the new generation of consoles: an impressive-sounding averaged 8% per year uplift.

To put that level of progress into perspective, though, the improvement to an R5 5600X is 125% (an extra 58% in one generation for AMD) and 112% for the i7-10700K, when we had been sitting at 105% since 2018 with the i7-9700K (~6% over a period of two years). This is not to lambast AMD and Intel; it's meant to show that processing improvements are harder and harder to come by and that, generally, they have been arriving more slowly each generation (and they're only going to get more difficult!).
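Compounding those figures shows how modest the per-year rate really is - a minimal sketch using the uplifts quoted above (the 10-year span is approximate; the i7-930 is a 2010 part):

```python
# Per-year single-threaded improvement from the quoted uplifts, both as a
# simple average and properly compounded. Spans are approximate.
uplifts = {
    "new-gen console CPU vs i7-930": (0.79, 10),
    "R5 5600X vs i7-930":            (1.25, 10),
    "i7-10700K vs i7-930":           (1.12, 10),
}

for name, (uplift, years) in uplifts.items():
    simple = uplift / years
    compounded = (1 + uplift) ** (1 / years) - 1
    print(f"{name}: {simple:.1%}/yr simple, {compounded:.1%}/yr compounded")
# The "impressive-sounding" 8%/yr is really only ~6%/yr once compounded.
```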

This is important because game code is notoriously difficult to parallelise and, quite frankly, just getting the engine and its libraries into active memory doesn't look like a very parallelisable problem to my untrained eyes - so I could be completely wrong on this point.


I do wonder, "Just how big of an impact will things like RTX IO and Smart Access Memory have on games?"

So, if load times are fine on SATA SSDs for the next generation, what about HDDs? Well, I haven't been able to find a video of load times for any next gen games from an HDD, but I did find a video showing transfer rates whilst streaming data in Assassin's Creed: Valhalla. You can clearly see that there's no impact on the performance of the open-world streaming in Valhalla for a game residing on an HDD - though that's not to say that other games won't have an issue down the line.

Yes, the HDD is working a lot harder than the SATA SSD, but it's not breaking the game and I don't expect it to break any other games going forward. Yes, initial load times from an HDD will be pretty terrible by comparison (as will scene/level loads [e.g. fast travel]), but that's not the part of the experience where players are unforgiving.

If anything, as I have suggested previously, developers will (or, IMO, should) just move more data upfront into system RAM, and at that point it makes no difference whether you're on an HDD, a SATA SSD or a PCIe SSD... RAM speeds are just so much faster than any storage solution, it isn't even funny.
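As a final bit of napkin maths on that point - moving the same chunk of game data from each tier, with assumed, typical bandwidth figures:

```python
# Time to move 8 GB of game data. All bandwidth figures are assumed,
# typical values for illustration (GB/s).
tiers = {
    "HDD":          0.15,
    "SATA SSD":     0.55,
    "PCIe 4.0 SSD": 5.0,
    "DDR4 RAM":    25.0,   # effective copy bandwidth, conservatively
}

data_gb = 8
for name, gbps in tiers.items():
    print(f"{name:>13}: {data_gb / gbps:6.2f}s to move {data_gb} GB")
# Once the data is resident in RAM, "loading" it is effectively free
# compared to any storage tier - hence front-loading into system memory.
```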


Seriously, why are we worried about drive speeds? 16 - 32 GB DDR4 would allow almost instant loading for any game loaded into it and the majority of games could then stream more data in as needed...

