Author Topic: Lies, damned lies and benchmarks  (Read 674 times)

CommuteTooFar

  • Inadequate Randonneur
Lies, damned lies and benchmarks
« on: February 04, 2020, 01:03:14 pm »
A few years ago AMD designed a new architecture called Bulldozer. It was tested and subsequently ridiculed as hopeless. The top chip, the FX-3850, should have competed with the contemporary i7, but it was built on a less dense chip process and so ended up as a competitor to the i5-2500. The benchmarks were done. In those days the benchmarkers held the false notion that if you use a very fast graphics card and low (VGA) resolutions you will reveal the performance of the CPU.
Every gamer read the benchmarks and, when they felt they needed a better machine, chose the Goldilocks-priced i5-2500.

Modern times. Neither of these two chips is relevant today, but they sometimes appear in benchmarks.
Modern benchmarking practice is to show best performance, so a 'good' tester will test at 1080p, 1440p and 4K with the highest graphics quality.
They also test the older processors to see where they stand with modern games. In this new, difficult world the FX-3850 has jumped ahead of the i5-2500.

What has happened here? Quite simple: in harsh modern games the performance is limited by the processor. Game engines are compiled and optimised for Intel's CPU architecture.
So even under a lesser load the Intel chip is already putting everything into it: an activity monitor will show 100% on one or more threads. Poor old Bulldozer is running at 80%.
So when a bigger challenge is offered to the processor the i5 cannot go any faster, but the FX-3850 has another 20% to give and outperforms the i5-2500. It seems that Bulldozer was not such a bad design after all.

I am not suggesting that people who chose the i5-2500 got it wrong. Clearly it ran contemporary games and resolutions better than the FX-3850.

Currently most commentators are suggesting the i9-9900KF is the best processor for gaming. Are they making the same error again? The price-equivalent AMD part is the 12-core Ryzen 9 3900. That may become very obvious this year.
In the next few months Sony and Microsoft are launching new consoles. Both are using custom AMD Ryzen 3000 chips and Navi graphics. It seems likely that game companies will start optimising for AMD and for more threads. The conclusions from the benchmarks will change miraculously.


Re: Lies, damned lies and benchmarks
« Reply #1 on: March 07, 2020, 02:31:57 pm »
Benchmarks are generally synthetic and don't really reflect real world conditions.
I spend a considerable part of my working life monitoring servers and investigating performance issues. The benchmarks are a good indication when choosing CPUs for some workloads, less so with others.

The Bulldozer CPUs shared components between cores, so the usefulness of a core depended on the workload. This was reflected in Microsoft's licensing of SQL Server, where only 2/3rds of the cores in AMD CPUs were counted for licensing purposes.

The Bulldozer running at 80% suggests there's a performance constraint somewhere else in the system, or that the CPU is getting thermally throttled. There's also the question of how you really measure CPU utilisation. Non-idle time, which is usually how it is derived, is not the only measure and in many cases is not appropriate, e.g. http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
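
For what it's worth, here is a minimal sketch of how the usual "non-idle time" figure is derived from two snapshots of the aggregate "cpu" line in Linux /proc/stat. The function names and the sample counter values are made up for illustration, and it assumes the line carries at least the first eight counters (user, nice, system, idle, iowait, irq, softirq, steal). Note that a busy figure worked out this way still counts cycles where the core is stalled waiting on memory, which is the point of the article linked above.

```python
from typing import List, Sequence

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> List[int]:
    """Return the first eight tick counters from a 'cpu ...' line."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    return [int(x) for x in parts[1:1 + len(FIELDS)]]


def non_idle_fraction(before: Sequence[int], after: Sequence[int]) -> float:
    """Fraction of elapsed ticks that were not idle or iowait."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no time elapsed between the two snapshots
    idle = deltas[FIELDS.index("idle")] + deltas[FIELDS.index("iowait")]
    return (total - idle) / total


# Hypothetical snapshots taken a short interval apart (values are invented).
sample_before = parse_cpu_line("cpu  1000 0 500 8000 500 0 0 0")
sample_after = parse_cpu_line("cpu  1600 0 700 8100 600 0 0 0")
sample_busy = non_idle_fraction(sample_before, sample_after)
print("busy: {:.0%}".format(sample_busy))
```

On a real Linux box you would read the first line of /proc/stat twice with a sleep in between and feed both lines through the same two functions.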

A Few Apples Short of a Strudel

CommuteTooFar

  • Inadequate Randonneur
Re: Lies, damned lies and benchmarks
« Reply #2 on: July 05, 2020, 05:12:07 pm »
This time it is AMD being naughty. Some time ago they declared to their users and investors that they would improve their Performance Efficiency 25 times over by 2020. This measure is defined by AMD. That is fair enough, it is their target. So this week they proudly pronounced that they had achieved their goal, and you can follow the working on AnandTech. Seems straightforward.

Not so fast. Back in 2014, half of the external benchmark they used for the performance test was SPECint. This benchmark was swapped for Cinebench 15. This is a little deceitful. In no way can an integer-arithmetic-heavy benchmark be swapped with a floating-point-heavy benchmark.

I think they would probably have met the target with SPECint, but substituting Cinebench exaggerated the performance of "Renoir" over "Kaveri".
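
To show why the swap matters, here is a rough sketch of a two-part performance score where one half is replaced by a different benchmark. Everything in it is an assumption for illustration: the equal-weight arithmetic mean, the component names, and above all the speedup numbers, which are invented and are not AMD's or AnandTech's figures.

```python
from typing import Dict


def composite_speedup(speedups: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted arithmetic mean of per-benchmark speedups (new chip / old chip)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(speedups[name] * w for name, w in weights.items()) / total_weight


# Invented generation-over-generation speedups, NOT measured results.
hypothetical_speedups = {
    "integer_suite": 3.0,   # stands in for an integer-heavy test
    "render_suite": 5.0,    # stands in for a floating-point-heavy test
    "other_half": 4.0,      # whatever makes up the rest of the score
}

# Same chip, same "other half", only the first component differs.
with_integer = composite_speedup(
    hypothetical_speedups, {"integer_suite": 0.5, "other_half": 0.5})
with_render = composite_speedup(
    hypothetical_speedups, {"render_suite": 0.5, "other_half": 0.5})

print("composite with integer test:", with_integer)
print("composite with render test: ", with_render)
```

With these made-up inputs the headline number moves purely because of which test fills the first half of the score, without the silicon changing at all.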