Ryzen: Strictly technical

Status
Not open for further replies.

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
The cache values are very confusing.

8-core Ryzen:

L1 cache: 768 KB
L2 cache: 4.0 MB
L3 cache: 16.0 MB

Indeed. Now, does it reflect a fake image, or does it reflect Windows not properly reading the cache sizes? But that wouldn't explain the 1MB L1.

I suppose a possible explanation would be that Ryzen isn't a fully enabled Zeppelin.

Currently it's...

L1: 96 KB/core
L2: 512 KB/core
L3: 8MB/4cores (2MB/core)

So scaling by 16 gives (approx):
L1: 1.5 MB
L2: 8 MB
L3: 32 MB

Which doesn't make sense --- so I'd have to assume the screen grab is a fake.
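
For anyone who wants to redo the arithmetic, here is a minimal sketch that scales the per-core figures above to a given core count. The function name and the dictionary are only illustrative, and treating the L3 as 2 MB per core is the same simplification the post makes (it is really 8 MB shared per 4 cores).

```python
# Per-core cache sizes quoted in the post, in KB.
# L3 is 8 MB shared by 4 cores, treated here as 2 MB per core.
PER_CORE_KB = {"L1": 96, "L2": 512, "L3": 2048}


def total_cache_kb(cores):
    """Scale the per-core figures to a given core count (result in KB)."""
    return {level: size * cores for level, size in PER_CORE_KB.items()}


if __name__ == "__main__":
    for cores in (8, 16):
        totals = total_cache_kb(cores)
        summary = ", ".join(f"{lvl} {kb / 1024:g} MB" for lvl, kb in totals.items())
        print(f"{cores} cores: {summary}")
```

With 8 cores this reproduces the 768 KB / 4 MB / 16 MB figures from the screenshot, and with 16 cores it gives the 1.5 MB / 8 MB / 32 MB figures listed above.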
 
Reactions: Space Tyrant

PhonakV30

Senior member
Oct 26, 2009
987
378
136
I checked cpu-world's database and found all of Intel's CPUs with 40 MB of L3 cache and 4 MB of L2 cache, but all of them list 512 KB of L1 cache. The highest-clocked Xeon among them is the E5-2698A v3.
 

KompuKare

Golden Member
Jul 28, 2009
1,221
1,571
136
I checked cpu-world's database and found all of Intel's CPUs with 40 MB of L3 cache and 4 MB of L2 cache, but all of them list 512 KB of L1 cache. The highest-clocked Xeon among them is the E5-2698A v3.
I didn't find it on ark.intel.com, so I looked up the Wiki on Broadwell-E, which said 64 KB per core (which is why I wrote 16 x 64 KB = 1024 KB).
https://en.wikipedia.org/wiki/Broadwell_(microarchitecture)
Obviously those Xeons don't run anywhere close to 4GHz but that could have been faked?
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
And going by fotoforensics, which I actually don't fully understand, it looks clearly fake:

Look at the text of the CPU name, for example: its ELA result is completely different from the other parts.
Also, the image has a metadata creation date in November 2016 and a last-modified date of March 22, 2017.

Yep, someone put a lot of work into it... I'm guessing they modified much of the text just to hide the usage of a different font.
 
Reactions: Drazick

PhonakV30

Senior member
Oct 26, 2009
987
378
136
I didn't find it on ark.intel.com, so I looked up the Wiki on Broadwell-E, which said 64 KB per core (which is why I wrote 16 x 64 KB = 1024 KB).
https://en.wikipedia.org/wiki/Broadwell_(microarchitecture)
Obviously those Xeons don't run anywhere close to 4GHz but that could have been faked?
By using the search on CPU-World first:
1) Set Manuf = Intel, then click "show Data"
2) Set Family = Xeon, then click "show Data" again
3) Set L2 = 4M and L3 = 40960, then click "show Data"

https://ibb.co/hSuVov
 

coercitiv

Diamond Member
Jan 24, 2014
7,075
16,290
136
And going by fotoforensics, which I actually don't fully understand, it looks clearly fake:

Look at the text of the CPU name, for example: its ELA result is completely different from the other parts.
Also, the image has a metadata creation date in November 2016 and a last-modified date of March 22, 2017.
The person who faked that SS made two obvious mistakes.
1) Used JPEG compression for the resulting file. This allows for the kind of detection used by fotoforensics.
2) Used Photoshop or other similar image editor to edit text in the image using the image editor text renderer. This introduces a different kind of flaw, that can be easily seen with the naked eye once zoomed in.



Notice the colored pixels around most of the text? That text is rendered using subpixel rendering, a technique used throughout the OS to increase the apparent resolution of the font at the expense of a small but otherwise indistinguishable quality loss. Normally all the text is rendered this way, including this one.

Notice the text with no colored pixels around it? That text is rendered using an image editor which only uses classic anti-aliasing.
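
The colored-fringe check described above can be roughed out in a few lines. This is only a sketch under my own assumptions: the pixel samples are made up, the threshold of 24 is arbitrary, and it is not how fotoforensics' ELA works. It just measures how many pixels along a text edge carry color, which subpixel-rendered text should show and plain grayscale anti-aliasing should not.

```python
def chroma(pixel):
    """Spread between the strongest and weakest channel of an (R, G, B) pixel."""
    r, g, b = pixel
    return max(r, g, b) - min(r, g, b)


def colored_fraction(pixels, threshold=24):
    """Fraction of pixels whose channel spread exceeds the (arbitrary) threshold."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    colored = sum(1 for p in pixels if chroma(p) > threshold)
    return colored / len(pixels)


# Hypothetical samples taken across the edge of a black glyph on white.
grayscale_edge = [(255, 255, 255), (170, 170, 170), (85, 85, 85), (0, 0, 0)]
subpixel_edge = [(255, 255, 255), (255, 200, 120), (90, 40, 0), (0, 0, 0)]

if __name__ == "__main__":
    print("grayscale AA:", colored_fraction(grayscale_edge))
    print("subpixel AA: ", colored_fraction(subpixel_edge))
```

Running the same measurement on crops of different text fields in a screenshot would flag any field whose fraction sits near zero while its neighbors do not.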

Here's what can be done with a bit more care, although most of this stuff is not worth the effort.
 

PhonakV30

Senior member
Oct 26, 2009
987
378
136
The person who faked that SS made two obvious mistakes.
1) Used JPEG compression for the resulting file. This allows for the kind of detection used by fotoforensics.
2) Used Photoshop or other similar image editor to edit text in the image. This introduces a different kind of flaw, that can be easily seen with the naked eye once zoomed in.



Notice the colored pixels around most of the text? That text is rendered using subpixel rendering, a technique used throughout the OS to increase the apparent resolution of the font at the expense of a small but otherwise indistinguishable quality loss. Normally all the text is rendered this way, including this one.

Notice the text with no colored pixels around it? That text is rendered using an image editor which only uses classic anti-aliasing.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Can't confirm but


Infinity Fabric can have a bandwidth of up to 100 GB/s


Wow
That would mean the data fabric (DF) running at double the memory clock, would it not?
3200MT/s memory, AKA 1600MHz, would be 51.2GB/s right now. I don't see 6400MT/s memory coming anytime soon, so we have to be talking higher data fabric speeds.
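
A quick sketch of where the 51.2 GB/s comes from and what roughly 100 GB/s would require. The assumptions here are mine and not confirmed in the thread: dual-channel memory with 64-bit (8-byte) channels, and a fabric link that moves 32 bytes per fabric clock.

```python
def dram_bandwidth_gbs(mt_per_s, channels=2, bytes_per_transfer=8):
    """Peak DRAM bandwidth in GB/s (assumes dual-channel, 64-bit channels)."""
    return mt_per_s * 1e6 * channels * bytes_per_transfer / 1e9


def fabric_bandwidth_gbs(clock_mhz, bytes_per_clock=32):
    """Peak fabric bandwidth in GB/s (assumes 32 bytes moved per fabric clock)."""
    return clock_mhz * 1e6 * bytes_per_clock / 1e9


if __name__ == "__main__":
    print("DDR4-3200 dual channel:", dram_bandwidth_gbs(3200), "GB/s")
    print("Fabric at 1600 MHz:    ", fabric_bandwidth_gbs(1600), "GB/s")
    print("Fabric at 3200 MHz:    ", fabric_bandwidth_gbs(3200), "GB/s")
```

Under those assumptions a fabric clocked at the 1600 MHz memory clock matches the 51.2 GB/s figure, and it takes about twice that clock to get past 100 GB/s.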
 
Reactions: looncraz

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
Regarding that fake SS, the Arial-type font of the utilization numbers is a dead giveaway.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,560
5,952
136
I can't believe someone would fake a task manager screenshot. To what end? I have a screenshot of my task manager saying my CPU is running at 6.45GHz, and it isn't fake.

It'd be easier to hook into the process and change some numbers to make it appear legit, anyways.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Are you sure the software you tested with wasn't tripping up? Checking AIDA results around the internet, it sits in that 40ns region for 2600Ks, and that's in the better cases.

AIDA always showed about 40ns, but Sandra, PCMark, and a couple other benchmarks reliably show ~19ns. I suspect the ~40ns number in AIDA is round-trip and the ~19ns elsewhere is one-way.

Those other benchmarks also show Ryzen as having 60~70ns latencies instead of 80~100ns latencies.

I just now realized this might be a difference between read and write latency. I already know that Ryzen has a write-coalescing buffer, which will increase latency for writes (which really isn't such a big deal).

Great, more code to write... I'm gonna put that on my list and just keep working on getting http://zen.looncraz.net/ live (I have yet to generate even half the charts, but I have all but 8C/8T numbers).
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
So, I just ran my Handbrake test (a full DVD rip of Masterminds, 1200kbps MKV, AC3 PassThru). On Sandy Bridge the choice of reading the disc from a mechanical hard drive wasn't an issue - the CPU was consistently maxed. It slowed Ryzen down some in stage 1 (but that is a short stage, anyway).

i7-2600k @ 4.5GHz, DDR3-2133 19-19-19-? 2T
Time: 13:41
Power: 282W (full system, including monitors)

Ryzen 1700X @ 3.925GHz, DDR4-2667 16-16-16-38 1T
Time: 7:41
Power: 241W (Full system, including monitors)

Yeah... that's insane. ~40W less and 6 minutes faster. I would have probably shaved 30 seconds by reading from a DVD instead of from a mechanical hard drive... but I didn't run it that way on the 2600k.
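
For what it's worth, the two results above can be turned into a speedup and a total energy figure. This is just a sketch using the times and wall power quoted in the post; the helper names are mine, and since the power readings include monitors, the energy is for the whole setup rather than the CPU alone.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def energy_wh(mmss, watts):
    """Energy used over the run in watt-hours, assuming constant wall power."""
    return watts * to_seconds(mmss) / 3600


# (encode time, full-system power in watts) as posted above.
runs = {
    "i7-2600k @ 4.5GHz": ("13:41", 282),
    "Ryzen 1700X @ 3.925GHz": ("7:41", 241),
}

if __name__ == "__main__":
    for name, (time, watts) in runs.items():
        print(f"{name}: {to_seconds(time)} s, {energy_wh(time, watts):.1f} Wh")
    speedup = to_seconds("13:41") / to_seconds("7:41")
    print(f"Speedup: {speedup:.2f}x")
```

Assuming the power draw stayed near the quoted figure for the whole encode, the Ryzen run uses a bit less than half the energy of the 2600k run.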
 

imported_jjj

Senior member
Feb 14, 2009
660
430
136
Figured that we need a new metric, MT IPC.
With substantial differences in SMT yield and platforms with many cores, ST IPC is of little importance on these platforms but MT IPC is relevant.
One can test SMT yield but doing it by running a MT task on a single core with 2 threads would make it easier for folks to understand the metric. Ofc as an alternative one can do MT results per number of cores to compute MT IPC.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Figured that we need a new metric, MT IPC.
With substantial differences in SMT yield and platforms with many cores, ST IPC is of little importance on these platforms but MT IPC is relevant.
One can test SMT yield but doing it by running a MT task on a single core with 2 threads would make it easier for folks to understand the metric. Ofc as an alternative one can do MT results per number of cores to compute MT IPC.

I have already done some of those tests at 1C/2T.

SMT scaling is pretty insane (~40%). I've also found very little in the way of an SMT penalty (sometimes 3% or so of a loss with SMT enabled and a single thread locked to a single core without the SMT logical core being parked)... but it's a pretty tricky thing to reliably quantify... it's basically inside the noise floor.

Update:

I've found a few examples where the SMT penalty is rather large, going through the numbers in their final form... but I've not found this play out in real applications... just two synthetic benchmarks.

3.925GHz, DDR4-2667 16-16-16-38 1T

AIDA64 PhotoWorxx:
SMT ON: 21,567
SMT OFF: 24,805
Penalty: 13%

AIDA64 VP8 (6700k scores 7,521 @ 4GHz, so both results here are awesome)
SMT ON : 7,748
SMT OFF: 9,336
Penalty: 17%

Cinebench R10 Single Threaded
SMT ON: 4,774
SMT OFF: 4,923
Penalty: 3%

Cinebench R11.5 Single Threaded
SMT ON: 1.55
SMT OFF: 1.62
Penalty: 4%

Cinebench R15 Single Threaded
SMT ON: 150
SMT OFF: 160
Penalty: 6%

Most examples, of course, show no change at all with or without SMT enabled.
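
The penalty figures above can be reproduced with a short script. It assumes the penalty is the score lost relative to the SMT OFF result, which is what matches the posted percentages; the function name is only illustrative.

```python
def smt_penalty_pct(smt_on, smt_off):
    """Score lost with SMT enabled, as a percentage of the SMT OFF score."""
    return (smt_off - smt_on) / smt_off * 100


# (SMT ON, SMT OFF) scores as posted above.
results = {
    "AIDA64 PhotoWorxx": (21567, 24805),
    "AIDA64 VP8": (7748, 9336),
    "Cinebench R10 ST": (4774, 4923),
    "Cinebench R11.5 ST": (1.55, 1.62),
    "Cinebench R15 ST": (150, 160),
}

if __name__ == "__main__":
    for name, (on, off) in results.items():
        print(f"{name}: {smt_penalty_pct(on, off):.1f}%")
```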
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
PhotoWorxx uses AVX2 - so does it mean that the SMT penalty is more pronounced in such types of workloads?
The real mystery is the VP8 benchmark - given what it tests, the performance hit that SMT causes is weird.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
PhotoWorxx uses AVX2 - so does it mean that the SMT penalty is more pronounced in such types of workloads?
The real mystery is the VP8 benchmark - given what it tests, the performance hit that SMT causes is weird.

Yeah, but VP8 almost always scales with frequency more than anything else. The 4GHz quad-core 6700k beats the 3.6GHz six-core 6850k - 7521 to 6932... which is fully explainable with the frequency difference.
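
A rough check of that claim, using only the two scores and clocks quoted here. The helper name is mine, and the clocks are taken as the nominal figures from the post, ignoring turbo behavior.

```python
def score_per_ghz(score, ghz):
    """Benchmark score normalized by clock speed."""
    return score / ghz


# (VP8 score, clock in GHz, cores) as quoted in the post.
i7_6700k = (7521, 4.0, 4)
i7_6850k = (6932, 3.6, 6)

if __name__ == "__main__":
    print(f"Score ratio: {i7_6700k[0] / i7_6850k[0]:.3f}")
    print(f"Clock ratio: {i7_6700k[1] / i7_6850k[1]:.3f}")
    for name, (score, ghz, cores) in (("6700k", i7_6700k), ("6850k", i7_6850k)):
        print(f"{name} ({cores} cores): {score_per_ghz(score, ghz):.1f} points per GHz")
```

Normalized by clock, the six-core ends up only a few percent ahead of the quad-core despite having 50% more cores, which fits the idea that this test tracks frequency far more than core count.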

It's pretty strange considering the test nearly maxes out every core (16 threads at 85~90%)... but that might be explainable if the VP8 test itself is actually single threaded and it has locking overhead that slows down the Julia output, meaning performance is being restricted by locking contention... fewer threads = less contention = higher score.

RE: AVX, that actually makes sense - the uop, retire, and store queues are statically shared but rarely used by one workload as much as they can be with SIMD. AVX, in particular, could be like issuing twice as many uops for the same task as normal, making the penalty come to light in a very clear manner.
 

imported_jjj

Senior member
Feb 14, 2009
660
430
136
Is that with the latest AIDA64? And if it is, have the results changed in any of the tests, aside from L3$?
 