19 December 2020

In Defence Of: cores... (and the future of gaming)

The original Threadripper didn't have the centralised I/O, hurting performance of non-highly parallelised tasks...
 
There is still a narrative within the industry that is incorrect. Actually, scratch that, there are still two narratives within the industry that are incorrect! The first of these is that higher frequency makes a better gaming experience. The second is that more cores equals a better gaming experience.

You know what? Scratch THAT. There is a third narrative that is also incorrect: relative processor performance is the most important metric for a better gaming experience.

Let’s get into each of these and why they are all correct whilst being incorrect… (Yeah, I KNOW! Why can’t things be simple?!)

As part of my analysis of the required performance over the last 10 years, I looked at processor performance as the main metric when assessing processing power. However, I also tried to correlate this with the number of processor cores (not including hyperthreading/simultaneous multithreading). The reason for this is that processor “performance” is application dependent. For a long time, the commonly accepted wisdom was that higher single-threaded performance was king for gaming. Then, in late 2019 and early 2020, many were saying that a Ryzen 5 3600 was good enough for gaming because you didn’t require more cores than that.

These aspects of gaming performance were all true but are all historical artefacts in much the same way the three core CPUs were. I was calling for people to expect to have 8 cores (a 3700X/3800X or 10700K) as a minimum for gaming in the years going forward and that was based partly on the next (now current) generation of consoles and my trending of performance required for gaming per year. 

You see, the way things were for the whole of the PS4/Xbox One generation of consoles was that CPU performance was terribly limited by the relatively weak Jaguar cores available to developers in the base and mid-gen refresh consoles. However, since we’re upgrading to (essentially) top-of-the-line CPUs in 2020 from AMD’s side of things, developers are bound to actually take all available performance from those cores and threads in the years to come. 

In parallel to this, over the last 7 years, there has been push after push to promote parallelism in game code. This is not a simple thing and, given the limitations of the last generation console CPUs and their lack of I/O throughput, it wasn’t so much an option as it was a rarity.

Now, both next generation consoles have appeared and addressed the I/O bottleneck (in different ways) and the CPU deficiency so completely and thoroughly that I do wonder whether it has left developers scrambling to cover the open ground between where last gen engines sat and where truly next gen engines will sit in terms of hardware requirements.

In fact, I have been expecting a huge leap in terms of PC hardware requirements in order to keep up with console performance.


The Threadripper 3000 series fully embraced the Zen 2 chiplet design paradigm...


So, coming back to the three leading ideas in this blogpost:
  1. Frequency matters, but not absolutely.
  2. Cores matter with increasing parallelism.
  3. CPU performance matters overall.
All of those factors are important as a whole and should be considered together.

Frequency...

Frequency is important within a CPU architecture: it makes all things faster - connections, data transfer, calculations, etc. However, past a given point, it's not that important overall. At the end of the day, a supremely fast 6 GHz single or dual core CPU may have the highest possible framerate in a game but the 0.1% and 1% lows will be much worse than those of a 3 GHz 4 core CPU because a game thread is not the only process running on any PC. Higher frequency will lead to better single-threaded performance but it's not the only metric that matters to a game.
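To make the "highest framerate but worse lows" point concrete, here is a minimal sketch of how average fps and 1% lows can be computed from a frame time capture. The two captures are invented for illustration (they are not measurements), and the "low" convention used here, the average over the slowest 1% of frames, is only one of several in use by benchmarking tools.

```python
from statistics import mean


def fps_summary(frame_times_ms, low_fraction=0.01):
    """Return (average fps, low fps) for a list of frame times in milliseconds.

    The "low" figure is the fps implied by the mean of the slowest
    `low_fraction` of frames. Other tools define lows differently.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    avg_fps = 1000.0 * len(frame_times_ms) / sum(frame_times_ms)
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, round(len(slowest_first) * low_fraction))
    low_fps = 1000.0 / mean(slowest_first[:count])
    return avg_fps, low_fps


# Hypothetical captures of 1000 frames each (made-up numbers):
# a fast CPU that gets interrupted now and then, and a slower but steadier one.
fast_but_spiky = [5.0] * 990 + [50.0] * 10
slower_but_even = [8.0] * 1000

for name, capture in [("fast but spiky", fast_but_spiky),
                      ("slower but even", slower_but_even)]:
    avg, low = fps_summary(capture)
    print(f"{name}: average {avg:.0f} fps, 1% low {low:.0f} fps")
```

In this made-up case the spiky capture wins on average fps but loses badly on the 1% low, which is the pattern described above.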

And I think that this is where the misconception comes from - historically, CPU and application performance was inextricably tied to clock speed and this was, for many years, intrinsically tied to single-core performance. However, application parallelism is increasing to match the hardware that programmes increasingly find themselves running on, and the frequency/power/heat limitations of silicon are increasingly being reached on each node. It's apparent that it's not possible to increase core frequency indefinitely - something Intel found back in the Pentium 4 days... and with the Core architecture... We're basically at the limit of our ability to scale performance with frequency so where do we go from here?

Cores...

The reason I tried to take core count into consideration when picking the CPUs that met the average recommended requirements each year in my analysis was that the metric of "processor performance" is not necessarily linked to the metric of "system performance". Let me put it this way - there's a very good reason that reviewers and benchmarkers try their very best to remove background tasks and disable Windows Update: these things hurt performance numbers!!

Data management (both for the game and other programmes working on the system, as well as the OS) contributes to some serious overheads within a system and those will have to interrupt any serialised process running on the thread(s) of a core in order to get their processes updated and for the system not to crash.

What happens when the game in question needs to pull new compressed data (and decompress it) into system memory or transfer that to the GPU memory for the GPU core to operate on? What happens when positional data needs to be sent to the network or sounds are required to be played, with output being sent to the hardware on the motherboard (or even a soundcard - for those who still have them)? You get stuttering, dropped frames, clipped sounds, lower quality mip-levels on textures leading to texture pop-in and geometric pop-in due to level of detail management within the game engine...

A CPU with more cores can manage more of this parallel work without dropping frames. Yes, it might not manage to get the ultra high fps numbers but it will deliver a more consistent experience which many people value more than the highest of high numbers. Of course, some games put less strain on a system's resources than others but until we start seeing dedicated silicon in PC systems to manage much of this programme overhead like we do in the consoles, more cores will generally equal more capable systems for gaming across a broad range of game types. Though, this increase does have diminishing returns after a certain point (at this point in time, it's around 6-8 physical cores).
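The diminishing returns mentioned above can be illustrated with Amdahl's law. To be clear, this is a textbook model that I'm swapping in for illustration, not something taken from my trending analysis, and the 80% parallel fraction below is an invented assumption rather than a measurement of any game.

```python
def amdahl_speedup(cores, parallel_fraction):
    """Theoretical speedup over one core under Amdahl's law.

    `parallel_fraction` is the share of the work that can be spread
    across cores; the remainder runs serially whatever the core count.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumption for illustration only: 80% of the per-frame work is parallel.
ASSUMED_PARALLEL_FRACTION = 0.8

for cores in (2, 4, 6, 8, 12, 16):
    speedup = amdahl_speedup(cores, ASSUMED_PARALLEL_FRACTION)
    print(f"{cores:2d} cores: {speedup:.2f}x")
```

Under that assumption, going from 4 to 8 cores buys more than going from 8 to 16, even though the second step adds twice as many cores. Where the curve flattens depends entirely on the parallel fraction, which differs from game to game.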

Of course, Hyper-Threading or Simultaneous Multithreading (HT/SMT) is a further consideration in this discussion and Dr. Ian Cutress, over at AnandTech, has looked at the effect of SMT utilisation for gaming within the Zen 3 architecture.

I think the analysis is quite interesting but, as a study, is incomplete. The main reason behind this is that they analysed a core-heavy environment when gaming. In this scenario, I would not expect a big performance difference between an application with SMT on or off because the application has access to excess resources and is not competing for them with other processes (as explained above!).

The accompanying (and, in my opinion, important to gaming) data would compare the performance of the applications and games on CPUs with fewer cores (e.g. the 5950X vs the 5600X or the 3300X vs the 3700X). Ideally, this testing would all be performed at a fixed clock speed and on the same microarchitecture in order to rule these variables out.

As a first effort, the study is not bad but is lacking important context so, at this point in time, doesn't heavily influence the debate surrounding the effectiveness of SMT in gaming instances.

Of course, there are commentators out there, such as Tech over at TechDeals, who agree with my point of view - 4 core and 6 core CPUs are relics of the past and are unlikely to be great performers for gaming in the coming years.

Performance...

This isn't a consideration that we see mentioned very often. I pointed out Hardware Unboxed's take on the situation in the intro because that's the most recent source (note, not video) that I've seen mention it. The issue with this approach is that you need a base level of technological know-how to apply the principle correctly.

The performance of a CPU is heavily dependent on the application that is being used upon it. In the case of gaming, this is also specifically dependent on which game engine is running and, even worse, dependent on which specific game is running.

So, while HWU is technically correct, it's actually very difficult to give advice or make predictions on this metric. Worse still, you need to decide on a benchmarking tool in order to gauge "performance" numbers. It could be that the benchmarking tool you choose (e.g. UserBenchmark, Geekbench, [Insert game here]) accurately reflects your intended usage but it may not...

So, I agree that overall performance of a CPU is a good indicator... I disagree, in the sense that it's not necessarily a helpful piece of advice without hand-holding the user or pointing them towards a benchmark which represents their use-case...


Predicted "recommended" CPU performance over the next 5 years...


An example...


We can compare a Ryzen 3 3300X to a Ryzen 5 3600 and a Ryzen 7 3700X, all at stock settings - which is best for gaming?

If we look at frequency, the 3300X is supposedly the best CPU - it has the highest base clock and second highest boost clock compared to the 3600 and the 3700X.

If we look at cores/threads, the 3700X has them all beat with 8/16 C/T.

If we look at performance, comparing the Ryzen 3 3300X (1T/MT of 1271/5392) with the Ryzen 5 3600 (1T/MT of 1150/6350) and the Ryzen 7 3700X (1T/MT of 1266/7960), the 3700X wins based on multithreaded performance but the 3300X wins on single threaded performance.
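As a quick sketch of how each of those three lenses picks a different "winner", the comparison can be laid out in a few lines. The Geekbench 1T/MT scores are the ones quoted above; the core counts and base/boost clocks are not from this post's data and are filled in from AMD's published specifications as I recall them, so treat those as assumptions.

```python
# Geekbench scores are the figures quoted in the post.
# Cores and clocks are assumed from AMD's published spec sheets.
cpus = {
    "Ryzen 3 3300X": {"cores": 4, "base_ghz": 3.8, "boost_ghz": 4.3,
                      "gb_1t": 1271, "gb_mt": 5392},
    "Ryzen 5 3600": {"cores": 6, "base_ghz": 3.6, "boost_ghz": 4.2,
                     "gb_1t": 1150, "gb_mt": 6350},
    "Ryzen 7 3700X": {"cores": 8, "base_ghz": 3.6, "boost_ghz": 4.4,
                      "gb_1t": 1266, "gb_mt": 7960},
}


def winner(metric):
    """Return the name of the CPU with the highest value for `metric`."""
    return max(cpus, key=lambda name: cpus[name][metric])


for metric in ("base_ghz", "boost_ghz", "cores", "gb_1t", "gb_mt"):
    print(f"{metric:9s} -> {winner(metric)}")
```

No single CPU tops every column, which is the whole problem with picking one metric and calling it a day.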

Which was correct? 

Well, we can see that the 3700X won in gaming. Overall, the number of cores won out but it really, seriously, depends on more than that: the relative efficiency of the processor architecture in question, the processor frequency, which resolution you're gaming at and, of course, the game in question. For some games, at some resolutions, it's a matter of a couple of frames per second between each CPU. Taking away the lesson that "more cores equals better performance" is wrong. See below...

But is that the correct answer?

Looking at the performance metrics is not the whole story because those numbers will not necessarily be repeatable in a real-world environment (i.e. your PC, using a different monitor and a different RAM/GPU configuration). Many benchmarkers utilise the very highest tier GPUs, motherboards and RAM - at which point, we're talking about completely different market segments between a 3300X and a 3900X/3950X. A person buying a 3300X will likely not be buying an X570 motherboard with 4000MHz RAM, booting games from a $200 PCIe gen 4 NVMe drive.

Worse still, we can take this a step further - a brand new Ryzen 5 5600X (1T/MT of 1609/8373) has much better single threaded and multithreaded metrics than the 3700X. Does that make the 5600X better than the 3700X?

The then-current, last gen, gaming benchmarks all pointed to the 5600X. But what about next gen games with high levels of parallelism in their engine design (e.g. Cyberpunk)? Well, the 3800X (1T/MT of 1278/8456) has worse single threaded performance than the 5600X but very slightly better multithreaded performance. This translates to effectively equivalent performance in the game. In fact, another aspect of "performance" comes into play here - the GPU. It's clear that the GPU being used has a large effect on in-game performance. So, does all this pontificating really matter?


You can only really begin to see a difference once you start pairing the very highest-end graphics cards with the 5600X...

Essentially, yes, it does... 

The 5600X is a new architecture which can't be compared directly against the prior gen. The 3300X manages a 70 fps average with a minimum of 58 fps across all three graphics cards, which loses out to both the 3600 and 3700X because of their higher multithreaded performance. Turning towards the new architecture, the 5600X performs well, but it's not massively outperforming the 3700X and 3800X - at least not in proportion to the relative "performance" metrics I've picked here from Geekbench: from the single threaded performance gains, you'd expect the 5600X to do WAY better.

This is exactly why relative performance is not a good indicator for the layperson.
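To put a number on what the Geekbench figures would lead you to expect, here is a small sketch that turns the scores quoted in this post into percentage uplifts for the 5600X over the two 8 core parts. It only uses the synthetic scores; it says nothing about the in-game results, which is rather the point.

```python
# Geekbench (1T, MT) scores as quoted in the post.
scores = {
    "5600X": (1609, 8373),
    "3700X": (1266, 7960),
    "3800X": (1278, 8456),
}


def uplift_percent(new, old):
    """Percentage by which `new` exceeds `old` (negative if it is lower)."""
    return (new / old - 1.0) * 100.0


def compare(newer, older):
    """Return (single threaded uplift %, multithreaded uplift %)."""
    new_1t, new_mt = scores[newer]
    old_1t, old_mt = scores[older]
    return uplift_percent(new_1t, old_1t), uplift_percent(new_mt, old_mt)


for older in ("3700X", "3800X"):
    st, mt = compare("5600X", older)
    print(f"5600X vs {older}: 1T {st:+.1f}%, MT {mt:+.1f}%")
```

On paper that is a single threaded lead of roughly a quarter over both chips, with multithreaded scores within a few percent either way. An "effectively equivalent" result in a heavily threaded game looks a lot more like the multithreaded column than the single threaded one.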


More 4-core performance than any current gen processor is predicted to be required by games in five years' time...


In conclusion...


I think that one of the reasons that people have become confused about the difference between frequency and number of cores is that average silicon quality tends to fall as the number of defects within that same silicon rises. For example, a 3800XT has a higher operating frequency than a 3700X, despite having the same number of cores, because the silicon used for 3700X chips is, on average, of lower quality. If you happen to get a good chip, you could probably push a 3700X up to the frequencies of the 3800XT, but the chances of that are quite low.

Similarly, a 3600 has even lower average silicon quality than the 3700X but it also has fewer active cores because there is some defect in the chip from the manufacturing process. It's still possible you could find yourself holding a 3600 that could be overclocked to higher frequencies but the odds are much smaller than they were for the 3700X situation I mentioned above.

Regardless of this frequency issue, the number of cores was not historically important because our gaming consoles and low-end PC systems did not have enough parallelisation to actually matter. Now, we're in a situation where the consoles have "mid-range"* CPUs in them, meaning an 8 core 16 thread chip. This will result in more developers deciding to implement more CPU-heavy systems in their games - world simulation, AI, etc. These systems will need multithreaded performance in order to excel and we must also take into account the additional CPU demands of the background processes that are handled by dedicated silicon in the consoles (such as data management) for more heavily graphical scenes.

*I'm saying "mid-range" because, in all honesty, 4 core, 8 thread or 4 core, 4 thread CPUs are going away. I believe the limited availability of the 3100 and 3300X is a sign that TSMC's process is mature and efficient enough that defects are not numerous enough to make those sorts of SKUs viable as a continual product - there are just not enough precisely defective dies to enable those products to exist. This situation is only going to get worse over time. Similarly, APUs are now mostly in the 6-8 core range. 4 core CPUs are a thing of the past and 6 core is now the bare minimum, going forward...

All of this means that more cores, at a higher frequency, will be required in the future for gaming - even at 1080p, because games will become more demanding, on average, than they were in the past. This means that the gaming "performance" of a CPU isn't necessarily indicative of its actual performance in a given game title and it isn't necessarily indicative of the real-world (i.e. your specific PC) performance, either. Much like the consoles, it's good to have extra CPU cores available for background processes such as OS operations, data management and network activities, at a minimum.

If you wish to run multiple monitors or have multiple programmes running in parallel, then you'd better have those extra cores available to be utilised, otherwise your gaming performance will likely suffer!

Don't buy a 6 core system - yes the 5600X is a beast on paper. It won't remain one...
