5 November 2022

Analyse This: Does RAM speed and latency make a difference for gaming...? (Part 2)


Bear with me...


Last time, I took a look at the effect of memory tuning and frequency scaling on gaming with a Ryzen 5 5600X system, in response to naysayers questioning the benchmarking methodology of various hardware reviewers. I essentially found no benefit for all that work and, in fact, under my test conditions, performance was worse at higher frequencies - even when operating at a 1:1 Infinity Fabric/memory controller frequency.

That was only half of the equation, though: these complaints have also been levelled at reviewers of Intel systems, with people criticising them for not pairing 12th gen CPUs with DDR4 4000 or faster Samsung B-die kits.

Well, today we're going to take a look into that...

First off, I'd like to pour one out for the first draft of this post, which I wrote over the course of a week or so, only to have Outlook* eat it when I closed the browser window. *Shakes fist at Microsoft*. So, if this post appears a little incoherent, just imagine that a better, funnier version previously existed and that I've had to write this one from scratch a second time...
*Yes I'm a barbarian!
Anyway, on with the show!


Latency and bandwidth of the memory, as determined by Memtest. (I'll explain what those highlights are for, later.)

The good, the bad, and the data...


I'm going to be upfront about the overall nature of my findings: it's mostly a big fat nothingburger. But, just to dot all the i's and cross all the t's, here's the test system:

  • Intel i5-12400
  • Gigabyte B660i (only two DIMM slots)
  • RTX 3070 (undervolted)
  • Western Digital 1 TB SN750 NVMe
  • 2x 8GB Corsair LPX DDR4 3200 / 2x 8GB Patriot Viper DDR4 4400

Once again, I'm making the data public for you to peruse and perform your own analyses... and also to judge me... I think it's only fair. I complained last time about memory tuning and overclocking being a bit of a black box, and I've seen many commentators and YouTube personalities hide their settings so that "no one else can steal them". That's against my general philosophy, so I'm choosing not to hide anything - good or bad. Hey, maybe I'm making a mistake in my methodology and someone can set me straight! That'd be great - we'd all learn in that scenario. As it stands, my results fly in the face of the conventional wisdom surrounding memory tuning and speeds.

Since this testing is incredibly time-consuming, and I have a full-time job that has been kicking my ass for the last couple of months, I made the executive decision to cut the applications from last time that showed zero sensitivity to any changes in system memory latency and bandwidth. I can always go back and add that data in at a later point but, at this stage, I'm basically done with this testing.


These results somewhat surprised me and got me thinking about Intel's performance in game applications...

Just to remind anyone coming into the series fresh: I'm basically a memory tuning noob and had zero experience with such things before starting out on the Ryzen system. However, I actually had a relatively easy time over on AMD - once I installed the Samsung B-die DIMMs from Patriot, I was able to tweak and mess around with settings with relative impunity. Sure, I had errors and instabilities, but I was able to boot without much of a problem on any configuration. Tuning memory on Intel was not so easy.

The 12th gen platform, despite now being relatively mature, was a bit of a pain: it was finicky, pedantic, controlling and sensitive. On the one hand, the motherboard would override values I manually set whenever it felt like it, which meant that I had to monitor the primary timings via HWInfo to make sure I wasn't being tricked by the BIOS. The secondary timings? Who knows! Maybe they were what I set, maybe they weren't!

In fact, I just wasn't able to check those. A lot of the forum threads I followed advised installing ASRock's Timing Configurator, which was continually suggested despite the obvious problem that the people asking were NOT using ASRock motherboards. This is one of those things I was railing about last time: uninformed people giving ill-advised suggestions to others who just want to learn more or get better.

Don't install Timing Configurator on a non-ASRock system... it gave me problems with my display drivers and I had to reinstall Unigine's Superposition after the tool randomly corrupted it. Not a good time. Other suggestions, such as AIDA64, were also fruitless because they have some sort of free-trial check and, unless I wanted to completely nuke my Windows installation and reinstall it a second time, they wanted a significant amount of money just to let me check system settings. I'm amazed there is no free utility for checking sub-timings and system settings.

On the other hand, the system would fail to boot a LOT of the time, not even falling back to a known-good configuration, which meant I had to disconnect and re-seat the RAM to force that to happen. This was a massive PITA!

Anyway, for this testing, I updated the BIOS to improve memory compatibility (it does help!). It allowed me to reach higher frequencies at a 1:1 memory controller to RAM ratio than the previous version from April 2022, but it did not help me overcome the innate limits of the Intel platform.

That was when I discovered this thread.

Nice straight lines... but what's with those lower overall scores?!


But there's only one such dip in score on a newer benchmark... likely an idiosyncrasy of the new CPU architecture.


It turns out that I'm probably limited in what I can accomplish at higher-frequency DDR4 settings - something I noticed when running at 3600 C15 and above. I was unable to set Gear 1 in any stable configuration at those speeds, so Gear 2 was the order of the day for effectively half of the charts you'll see here. The interesting thing is that, although this resulted in much higher system latencies, it didn't actually affect memory latency very much, nor did it negatively affect the bandwidth obtained when going to higher frequencies.

So, I'm not really sure what to take away from that. Sure, running Gear 1 appears to give a lower system latency reading but, looking at the applications I tested, Gear 2 doesn't actually result in worse performance. Would I have gotten much better performance at higher RAM frequencies if I'd been able to run in Gear 1? That's a question for someone else to answer, because it appears I can't do so on my locked CPU and platform.
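For anyone who hasn't run into the Gear terminology before: in Gear 1 the memory controller runs at the same clock as the memory (i.e. half the DDR data rate), while in Gear 2 it runs at half the memory clock - which is where the extra system latency comes from. Here's a rough sketch of that relationship (purely illustrative; the real clock domains are more involved than this):

```python
# Rough sketch of the Gear 1 vs Gear 2 relationship for DDR4 - illustrative only.

def controller_clock_mhz(data_rate_mts: float, gear: int) -> float:
    """Memory controller (UCLK) frequency for a given DDR data rate and gear mode."""
    memory_clock = data_rate_mts / 2   # DDR transfers twice per memory clock
    return memory_clock / gear         # Gear 1 -> 1:1, Gear 2 -> 1:2

for rate in (3200, 3600, 4000, 4400):
    g1 = controller_clock_mhz(rate, gear=1)
    g2 = controller_clock_mhz(rate, gear=2)
    print(f"DDR4-{rate}: memory clock {rate / 2:.0f} MHz, "
          f"Gear 1 controller {g1:.0f} MHz, Gear 2 controller {g2:.0f} MHz")
```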

In fact, speaking of latencies and bandwidths, it's a good time to address the differences between the Intel and AMD platforms. Intel has, in general, much lower latency: the best I could reach on the 5600X was somewhere around 45-50 ns, while even the worst memory latency on the Intel platform was in the high twenties. The same story repeated itself for system latency, with 55 ns being easily achievable on the 12400 and 70 ns appearing to be a hard limit on the 5600X. Of course, as I mentioned before and last time, latency mostly did not correlate with better scores, and these numbers are basically an academic, intrinsic property of each platform. That is, the 12400 is not better because it has lower latency; it just has lower latency, and the architecture is built around that.
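As an aside, if you want to see why those nanosecond figures don't map neatly onto the CAS latency numbers on the box, the textbook conversion from cycles to nanoseconds is simple enough - just bear in mind that what Memtest and MLC report is the latency of the whole path out through the memory controller, not this raw figure, and that the CL pairings below are only illustrative:

```python
# Textbook conversion of a memory timing from clock cycles to nanoseconds.
# Illustrative pairings only - not a list of my tested configurations.

def cycles_to_ns(cycles: int, data_rate_mts: float) -> float:
    """Convert a timing in memory clock cycles to nanoseconds for a DDR data rate."""
    memory_clock_mhz = data_rate_mts / 2
    return cycles / memory_clock_mhz * 1000

for data_rate, cl in ((3200, 16), (3600, 15), (4000, 19), (4400, 19)):
    print(f"DDR4-{data_rate} CL{cl}: CAS ~{cycles_to_ns(cl, data_rate):.1f} ns")
```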

One interesting aspect I observed in all of this was the effect of memory sub-timings on the available bandwidth. This differed quite strongly between the 12400 and the 5600X: the Intel platform was much stronger in mixed read/write workloads, displaying higher bandwidths in the 3:1, 2:1 and 1:1 read:write tests. The AMD platform, in contrast, is always strongest in read-only operations.
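For context, the textbook peak bandwidth of a dual-channel DDR4 system is just the data rate multiplied by 8 bytes per channel and two channels - though the figures reported by Memtest and Intel's MLC won't line up exactly with this, since they depend heavily on the access pattern and the read:write mix, which is exactly the difference I'm describing above. A quick sketch:

```python
# Textbook peak bandwidth for dual-channel DDR4. Measured figures from
# Memtest/MLC will differ depending on access pattern and read:write mix.

def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a given DDR data rate."""
    return data_rate_mts * bus_bytes * channels / 1000

for rate in (3200, 3600, 3800, 4000, 4400):
    print(f"DDR4-{rate} dual channel: ~{peak_bandwidth_gbs(rate):.1f} GB/s peak")
```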

Mixed read/write operations have much higher potential bandwidth on the Intel system...


I've never seen this difference between the two competing architectures mentioned anywhere before, on any tech site. I'm not really qualified to answer the question, but it does have me wondering whether it affects game performance. Logically, games need to read from and write to RAM in a time/latency-sensitive manner - whether that's loading data from storage into RAM, or moving data to and from the VRAM on the graphics card. Is this part of the reason why Intel has historically had better gaming performance than AMD? In my small sample of testing, Intel won out overall, but there were also quite a few draws in various applications, or in individual metrics within an application (e.g. 1% lows in Spider-man).


There are no real trends here but some dips at stock RAM settings with Gear 2 enabled...


The same trend is observed when RT is enabled...


So, getting back to those blue markers: why are they highlighting those results?

These are the memory timing settings where we had low latency at those two speeds, in Gear 1 mode, and high bandwidths. They also correspond to slightly better scores in both Arkham Knight and AC: Valhalla than the average across all other testing. However, the correlation breaks down in the two synthetic benchmarks from Unigine, as well as in Spider-man with and without raytracing enabled. To me, this demonstrates that there is not a strong link between latency and performance on this Intel platform, in the same way that latency wasn't a strong factor in the results obtained for the AMD platform in the last entry.

While these three settings are among the lowest latencies and highest bandwidths measured in Memtest86 and Intel's MLC, better results are obtained with worse latency and less bandwidth when all tested applications are taken into account. Latency is clearly not a panacea for improving performance in games.
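If you grab the spreadsheet and want to test that claim yourself, the kind of sanity check I have in mind is a simple correlation between the measured latency and the average fps/score per configuration - something like the sketch below, where the file name and column names are placeholders for however you lay the data out:

```python
# Minimal sketch: correlate measured memory latency against average fps per
# RAM configuration. File and column names are hypothetical placeholders.
import csv
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

latencies, avg_fps = [], []
with open("ram_results.csv", newline="") as f:        # hypothetical file name
    for row in csv.DictReader(f):
        latencies.append(float(row["latency_ns"]))    # hypothetical column
        avg_fps.append(float(row["avg_fps"]))         # hypothetical column

print(f"Pearson r (latency vs average fps): {pearson(latencies, avg_fps):.2f}")
```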

It also seems that Spider-man, with RT enabled, is the only application where bandwidth is actually important - the dip in performance after switching to Gear 2 shrinks as we approach 70 GB/s at DDR4 3800 CL18, but even then it's a minor effect.


Arkham Knight benefits the most from the RAM timing with the lowest latency but it's only a slight increase in performance...

Whereas Valhalla just doesn't care at all; Gear 1, 2, DDR4 3200 to 4000, tight or loose secondary timings, it's all good... except for that result at 4000 CL19 stock?


However, we do see quite a lot of fluctuation in the max fps across the various tests in both Spider-man and Valhalla. For Spider-man, this can be explained by it being a manual benchmark that includes web-swinging; looking at the results in aggregate, we can see that they cluster around a central value (with the exception of the first two tests). For Valhalla, the reason is not clear, but we can see that it also has a central tendency (again, ignoring the first two values) - unlike what we observed in the Ryzen 5 5600X testing, where a hump appeared to be present around DDR4 3600.
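If you want to check that clustering yourself, the max fps column from the spreadsheet drops straight into something like this (the numbers below are placeholders, not my measured results):

```python
# Quick look at whether the max fps values cluster around a central value.
# Placeholder data - substitute the max fps column for a given game.
import statistics

max_fps_runs = [142.0, 150.5, 146.2, 147.8, 145.1, 148.3, 144.9, 146.7]

trimmed = max_fps_runs[2:]   # ignore the first two runs, as discussed above
print(f"median: {statistics.median(trimmed):.1f} fps")
print(f"mean:   {statistics.fmean(trimmed):.1f} fps")
print(f"stdev:  {statistics.stdev(trimmed):.1f} fps")
```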



Rounding Up...


Looking at all the data I've collected over the last two entries, there just isn't a big difference to be had in a normal gaming scenario from a) RAM faster than DDR4 3200 or b) CAS latencies lower than 16 (as long as you have a tighter tRFC than stock). In fact, even the latest video from Hardware Unboxed shows essentially zero difference in the majority of applications using the highest-end GPU available. Using a more reasonable GPU (such as the RTX 3070) negates even that slight benefit.

Sure, at artificially low resolutions, you might be able to extract a larger performance difference with tighter RAM timings but then that's a condition that is not reflective of the real world.

So, after all this testing, it seems clear to me that the people calling for reviewers to use RAM kits like DDR4 4000 or whatever just don't know what they're talking about... and the call for them to use non-XMP, tightened settings is just as useless as asking for overclocked or tuned CPUs and GPUs to be used as well.

There is just one more thing to explore on this journey, and it's something I've talked about before: just how well do these static average and minimum numbers reflect the true differences in testing? I've been going on about process performance but I haven't looked at it once in this series.
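To give a flavour of what I mean by going beyond static numbers, here's a rough sketch of the difference between quoting a single average and looking at the distribution of per-frame times (the frame times are made up for illustration - in practice you'd feed in an export from a capture tool such as PresentMon or CapFrameX, and note that different tools define "1% lows" slightly differently):

```python
# Sketch: derive an average fps and a "1% low" style figure from per-frame
# times, instead of quoting a single static number. Placeholder data only.
import statistics

frame_times_ms = [6.9, 7.1, 7.0, 7.3, 6.8, 9.5, 7.0, 7.2, 14.1, 7.1]

fps_per_frame = [1000.0 / t for t in frame_times_ms]
avg_fps = len(frame_times_ms) * 1000.0 / sum(frame_times_ms)  # time-weighted average

# "1% low": average fps over the slowest 1% of frames (one definition of many)
slowest = sorted(frame_times_ms, reverse=True)
worst_count = max(1, len(frame_times_ms) // 100)
one_percent_low_fps = 1000.0 / statistics.fmean(slowest[:worst_count])

print(f"average fps: {avg_fps:.1f}")
print(f"1% low fps:  {one_percent_low_fps:.1f}")
print(f"fps spread:  {min(fps_per_frame):.1f} - {max(fps_per_frame):.1f}")
```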

Well, next entry, I'll go more in-depth on that side of things and we'll see if I'm wrong for calling for more statistical analysis of benchmark data than just simple, static numbers.

See you next time!



