New Zen microarchitecture details

bjt2 · Feb 2, 2017

itsmydamnation said:
Because you have to SUSTAIN that cycle after cycle for it to make a difference. Both can only decode 4 x86 ops per cycle, and most x86 ops are 1 uop on both. So for all these extra ports to matter you have to be able to feed them, and neither Zen nor Skylake can feed more than ~6 uops from the uop cache or 4 from decode.

They also both have roughly the same amount of L/S resources and all the other structures I mentioned. In FP workloads, for example, you will see a very high percentage of ops being loads or stores; with two threads that will bottleneck both Skylake and Zen before port congestion does.

Now you need to find me a workload that is both scalar and SIMD heavy concurrently, has a minimum IPC of 2, and is the perfect fit for SMT without bottlenecking the L/S system.

The perfect example of why all this theoretical super-high concurrent port usage doesn't matter is actually 256-bit AVX on SB/IB vs Haswell. Both have the same execution width, but Haswell is significantly faster in those FP-heavy workloads because for 256-bit ops it has twice the load and store bandwidth.

So answer me: how is Zen going to SUSTAIN 8 128-bit reads and 4 128-bit writes a cycle when it can only get 2 reads and 1 write a cycle? It's very common to see FP workloads with >50% of operations being loads or stores.

There are workloads with more operations per datum. For example, there are sub-benchmarks of SPEC FP 2006 that reach an IPC of up to 2.4 (one thread). So 2 threads need 4.8 instructions per clock on average. Even not counting conflicts, Intel's pipelines are not enough for the peak, just enough for the mean. Zen's pipelines probably are.
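To put rough numbers on the bandwidth argument and the "more operations per datum" reply, here is a minimal sketch. It assumes the per-cycle figures quoted in this thread (4 FP pipes, 2 loads and 1 store per cycle) and a simple min() bound; the function name and the model itself are illustrative assumptions, not how either core actually schedules work.

```python
def sustained_fp_ops(loads_per_op, stores_per_op,
                     fp_pipes=4, load_ports=2, store_ports=1):
    """Upper bound on FP ops per cycle under a simple min() model.

    loads_per_op / stores_per_op: average memory reads / writes each FP op
    needs (hypothetical workload parameters). Port counts default to the
    figures quoted in the thread.
    """
    limits = [float(fp_pipes)]
    if loads_per_op > 0:
        limits.append(load_ports / loads_per_op)
    if stores_per_op > 0:
        limits.append(store_ports / stores_per_op)
    return min(limits)


# Worst case from the quote: every op reads 2 operands from memory
# and writes 1 result back (8 reads + 4 writes for 4 pipes).
print(sustained_fp_ops(2.0, 1.0))    # 1.0

# More operations per datum: most operands stay in registers.
print(sustained_fp_ops(0.5, 0.25))   # 4.0
```

Under these assumptions the pipes only stay full once each op needs at most half a load and a quarter of a store on average, which is the register-reuse case the reply is pointing at.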

itsmydamnation said:
You don't need to tell me how an x86 processor works. You also do not have 10 uops a cycle; you have 6: up to 6 to int and up to 4 to FP.

Edit: before you try to claim it's additive to 10 uops, please explain why Michael Clark says they have wider retire than dispatch because it helps to clear out the retire queue and get more instructions in flight. Is he lying?

10 uops because we have 4 ALUs, 2 AGUs and 4 FP pipes. I was talking about the execute stage. I know that dispatch can be up to 6 uops, of which max 4 FP. Also, 10 can't be sustained because the retire rate is 8 uops/cycle. But 6 uops/cycle can be sustained. For Intel it depends: due to int/FP pipe conflicts there could be slowdowns that can't happen on Zen.
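The peak-versus-sustained distinction both posters are circling can be sketched as "the narrowest stage wins". This is only an illustration using the widths stated in this thread (dispatch 6, execute 4 + 2 + 4, retire 8); the dictionary keys and function name are made up for the example.

```python
def bottleneck(stage_widths):
    """Return (stage_name, width) of the narrowest pipeline stage."""
    stage = min(stage_widths, key=stage_widths.get)
    return stage, stage_widths[stage]


# Widths in uops per cycle, as quoted in the thread for Zen.
zen_widths = {
    "dispatch": 6,
    "execute": 4 + 2 + 4,  # 4 ALU + 2 AGU + 4 FP pipes
    "retire": 8,
}

peak_execute = zen_widths["execute"]
stage, sustained = bottleneck(zen_widths)
print(f"peak execute width: {peak_execute} uops/cycle")
print(f"sustained bound: {sustained} uops/cycle (limited by {stage})")
```

The extra execute width can still absorb short bursts out of the schedulers, which is the "designed for the peak" argument, but the long-run average cannot exceed the dispatch width.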

EDIT: Zen is designed for the peak. SKL will clog near maximum pipeline utilization. Zen will run smoothly. Do you know queueing theory? When approaching max capacity, the waiting time shoots up sharply.
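For reference, the textbook version of that queueing claim is the M/M/1 queue, where the mean time in the system is the service time divided by (1 - utilization). A minimal sketch follows; treating a CPU pipeline as an M/M/1 queue is an assumption made purely for illustration, and real schedulers are not this simple.

```python
def mean_time_in_system(utilization, service_time=1.0):
    """Mean time in an M/M/1 queue: service_time / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


# Delay, in units of one service time, as load approaches capacity.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"rho={rho:.2f}  W={mean_time_in_system(rho):6.1f} service times")
```

In this model the growth is hyperbolic rather than strictly exponential, but the practical point is the same: delay roughly doubles at 50% load and grows about a hundredfold at 99%.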

itsmydamnation · Feb 2, 2017

bjt2 said:
There are workloads with more operations per datum. For example, there are sub-benchmarks of SPEC FP 2006 that reach an IPC of up to 2.4 (one thread). So 2 threads need 4.8 instructions per clock on average. Even not counting conflicts, Intel's pipelines are not enough for the peak, just enough for the mean. Zen's pipelines probably are.

10 uops because we have 4 ALUs, 2 AGUs and 4 FP pipes. I was talking about the execute stage. I know that dispatch can be up to 6 uops, of which max 4 FP. Also, 10 can't be sustained because the retire rate is 8 uops/cycle. But 6 uops/cycle can be sustained. For Intel it depends: due to int/FP pipe conflicts there could be slowdowns that can't happen on Zen.

Yes, this is exactly the point: peak numbers aren't meaningful, sustained numbers are. While there will always be corner cases where these different architectures have advantages (BD had some great int performance if you used the int ALU + the int SIMD at the same time), in a general normalized sense they won't, and on average the usable resources of a Zen and a Skylake core are comparable.

Agent-47 said:
Like you agreed, it can go either way.

So? AMD has demoed an 8c16t Zen having less lag while streaming than an i7 6900K and attributed it to better multithreading gains due to its Infinity Fabric, i.e. gains from multicore and SMT.

Now you're just jamming words together. The Infinity Fabric is the collection of IP that connects different IP blocks together coherently; it plays no part within a CCX. It also won't have lower latency across the Infinity Fabric between two CCXs than Broadwell-E's ring bus. Now, better streaming performance could be because of better predictive selection of thread priority, but that also doesn't mean its throughput, and thus performance, is higher.

Now you are cherry-picking results. I said on average. AMD also said average.

I didn't do any such thing; I broke them down into two broad categories and then referenced the actual Zen design.

They are better than BD, but we don't know how they stack up against Intel.

We do have limited data, from CPC on an early ES, on game benchmarks, which have a far higher weighting to core clock than thread count. If we linearly scale (I know that's unlikely) the 3.15 GHz Zen to 3.6, then it's beating the 6900K quite comfortably.
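The linear scaling mentioned here is just a ratio of clocks. A minimal sketch, assuming performance scales perfectly with frequency (which the post itself flags as unlikely); the score of 100 is a hypothetical normalized value, not a figure from the CPC data.

```python
def scale_linear(score, from_ghz, to_ghz):
    """Scale a benchmark score assuming performance is proportional to clock."""
    return score * to_ghz / from_ghz


# Hypothetical normalized score of 100 at the ES clock of 3.15 GHz.
scaled = scale_linear(100.0, 3.15, 3.6)
print(f"scale factor: {3.6 / 3.15:.3f}")
print(f"scaled score: {scaled:.1f}")
```

The factor works out to roughly 1.14, so under this optimistic assumption the ES would gain about 14% going from 3.15 GHz to 3.6 GHz. Memory latency and other clock-independent costs usually make real scaling lower.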

Those are just theory notes written down by the architect. On paper BD was also a monster, which led to its hype before. I think you should not try to fly so high on arbitrary details with no benchmark to back you.

Actually it wasn't a monster. It was narrow, had high-latency caches and bad miss penalties. It never clocked high enough to offset those issues. As we learnt more about the core/SoC we found out about its glass jaws. If you are aware of these glass jaws you will find AMD has specifically mentioned the improvements in those areas.

Trying to compare the two architectures and say it's the same situation is just being disingenuous.

Lol. Yes they did. Competitive. To a server socket with 32 cores. Yes indeed. But:
1. Servers usually have better SMT.
2. At 32 cores it's clocked lower, so AMD may be referring to its better TDP for lower-clocked parts. That will allow them to clock better than Intel at lower TDP. Otherwise they could have picked any CPU. Most importantly, it says nothing about IPC.

What do you mean by "better SMT"? They are the exact same core.
It's also looking like both AMD and Intel will have 180 W TDP 32-core servers; we have no data to say the Zen core has lower perf/watt than the Skylake-EP core.

Don't be so arrogant all the time, sir; there are far smarter people in this thread. Don't make arbitrary connections to make your case like juan does for Intel.

What arbitrary connections have I made? Things don't just happen automagically because of "insert buzzword". From all we know, there is nothing to say Zen doesn't have higher single-thread perf per clock than Broadwell. But there are architectural limits that are far more likely to be reached in SMT that limit both Broadwell/Skylake and Zen to around the same maximum throughput.

Agent-47 · Feb 2, 2017

itsmydamnation said:
Now you're just jamming words together. The Infinity Fabric is the collection of IP that connects different IP blocks together coherently; it plays no part within a CCX. It also won't have lower latency across the Infinity Fabric between two CCXs than Broadwell-E's ring bus. Now, better streaming performance could be because of better predictive selection of thread priority, but that also doesn't mean its throughput, and thus performance, is higher.

For the last time, AMD itself attributed it to infinity fabric.
You don't have to define infinity fabric for us

itsmydamnation said:
I didn't do any such thing; I broke them down into two broad categories and then referenced the actual Zen design.

I.e. presenting "alternate facts"

itsmydamnation said:
We do have limited data, from CPC on an early ES, on game benchmarks, which have a far higher weighting to core clock than thread count. If we linearly scale (I know that's unlikely) the 3.15 GHz Zen to 3.6, then it's beating the 6900K quite comfortably.

Yes, we do. And they show that, clock for clock, Zen was 8% slower than the 6900K. I have been talking about IPC from the beginning.

itsmydamnation said:
Actually it wasn't a monster. It was narrow, had high-latency caches and bad miss penalties. It never clocked high enough to offset those issues. As we learnt more about the core/SoC we found out about its glass jaws. If you are aware of these glass jaws you will find AMD has specifically mentioned the improvements in those areas.

Those facts about clocks and latencies were only known after the release of BD.

Pre-release it was a monster on paper. Go read the threads from pre-release.

itsmydamnation said:
Trying to compare the two architectures and say it's the same situation is just being disingenuous.

Lol. I am not comparing arch, but the situation.

But let me quote you back where you claim "there are architectural limits that are far more likely to be reached in SMT that limit both Broadwell/Skylake". Disingenuous indeed.

Don't try too hard to read between lines please

itsmydamnation said:
What do you mean by "better SMT"? They are the exact same core.
It's also looking like both AMD and Intel will have 180 W TDP 32-core servers; we have no data to say the Zen core has lower perf/watt than the Skylake-EP core.

Lol. I meant higher clocks with lower TDP, meaning higher clocks for the same wattage.

Again, I am talking about IPC in ST.

itsmydamnation said:
From all we know, there is nothing to say Zen doesn't have higher single-thread perf per clock than Broadwell.

But there is, from CPC.
It's 8% behind BWE clock for clock. So IPC is behind Intel even if we assume similar SMT and multicore gains.
Also 8% lower throughput on the CPC microarchitectural benchmark.

I am done making my point.

bjt2 · Feb 2, 2017

itsmydamnation said:
(BD had some great int performance if you used the int ALU + the int SIMD at the same time)

Zen is even better: BD has 2 ALUs per thread and an FPU with 3 pipelines for 2 threads; Zen has 4 ALUs and 4 FP pipelines for 2 threads. Probably better...

dullard · Feb 2, 2017

swilli89 said:
1) I used 10% as a mixed use case including all other types of processing beyond gaming. I'm assuming the workloads not shown off by amd will be around that number. But thank you very much for pointing out how small a difference 59 and say, 57fps is. It's indistinguishable.

An 8c16t Ryzen will compete with the 8c16t Intel equivalent, not a $350 7700k. Your entire post is invalid.

You are correct that 8c16t Ryzen isn't a gaming CPU. But you were the one mentioning FPS on games with Ryzen (other than the last throwaway line, your whole post was Ryzen and gaming). I was just responding with how invalid and ridiculous your post was.

Enigma- · Feb 2, 2017

lolfail9001 said:
Here are the 3 red flags:
1. Version 1.50 cannot exist.
2. Build number is too high for ANY 1.xx version. And is not on the list of 2.xx either.
3. CPU name is NOT "AMD Eng Sample ZD36....". The "Ryzen" in the name is the clear giveaway that something is wrong.
3.5. Oh, and the user name... It exists but it is clear as air. Maybe it is legit, but first 2 flags have to be addressed first.

Thanks for clarifying. I am not used to those AotS benches at all, but it still felt strange, so I thought it better to share it here.

swilli89 · Feb 2, 2017

dullard said:
You are correct that 8c16t Ryzen isn't a gaming CPU. But you were the one mentioning FPS on games with Ryzen (other than the last throwaway line, your whole post was Ryzen and gaming). I was just responding with how invalid and ridiculous your post was.

That's now 2 posts that you are straight up putting words in my mouth. That's bordering on trolling in my book. Round and round and round we go trying to define certain markets to suit our arguments. To some an 8c16t is THE gaming processor to have for ensuring stable FPS for the next handful of years. To some a 4c/8t CPU is a suitable gaming processor (though I'd say this is a foolish investment).

I doubt you are truly misreading what I'm saying and more likely intentionally re-framing my statements to create a strawman so let me reiterate my message.

For both their commercial success (revenue and margin growth) and consumers' real value (finally driving high performance 8C designs to the mainstream), AMD merely needs to come within 5-10% of Intel's CPUs in mixed-use aggregate performance and price accordingly. It's incredible to me that people use the fact that Intel artificially segmented the market with their "HEDT" platform for anything more than 4 cores as an argument that an 8C SKU isn't a gaming platform. Just because someone wants more than 4 cores (mainstream in 2009) doesn't mean that they also need 48 PCIe lanes and quad-channel memory with 12 SATA ports.

dullard · Feb 2, 2017

swilli89 said:
That's now 2 posts that you are straight up putting words in my mouth.

I'll put more words in your mouth, the EXACT words that you said.

swilli89 said:
Most people (I'm assuming) will gladly take 57 FPS for $$$ hundreds less than a system giving 59 FPS. The CPU game for gamers is now all about ensuring the GPU is the bottleneck, as it should be. As we've seen in a lot of benchmarks where pretty much any 8+ thread processor maintains the same minimums given the same video card, I think most gamers want a CPU that can promise not to be a bottleneck for 5+ years which by all indications from what we've been shown, an 8-core Ryzen will accomplish. Of course we need to see some thorough gaming benchmarks to further clarify this point but who wouldn't want to save $300 on their CPU and spend that on a higher quality PSU and a videocard upgrade?

Read and reread the underlined and then ask yourself why people think you are talking about gaming with an 8 core Ryzen. If those exact (unedited) words you posted are not about gaming, can you please tell me what they are about? I can't seem to interpret it any other way. The gaming CPU from Intel is the 7700k, not the HEDT processors, as the 7700k blows HEDT processors away in games for far less money.

If you are talking about HEDT, then yes, you have a point. But for some odd reason HEDT didn't appear in your own words, yet the word "gaming" or "gamers" or "FPS" appears a half dozen times.

swilli89 · Feb 2, 2017

dullard said:
I'll put more words in your mouth, the EXACT words that you said.

Read and reread the underlined and then ask yourself why people think you are talking about gaming with an 8 core Ryzen. If those exact (unedited) words you posted are not about gaming, can you please tell me what they are about? I can't seem to interpret it any other way. The gaming CPU from Intel is the 7700k, not the HEDT processors, as the 7700k blows HEDT processors away in games for far less money.

If you are talking about HEDT, then yes, you have a point. But for some odd reason HEDT didn't appear in your own words, yet the word "gaming" or "gamers" or "FPS" appears a half dozen times.

Yes I am talking about gaming. Also talking about total performance across applications. I'm also talking about cost of ownership versus value provided. What I don't think I understand is what point you are trying to make.

The funny part about your bolded comment is you are saying HEDT processors cost more and are thus not gaming processors because they cost more. Guess what? Ryzen won't have this huge price disparity and will thus be high core count processors people may indeed buy for gaming.

You are being quite hyperbolic in your statement. Let's examine your own words "7700k blows HEDT processors away in game".

Let's use every single game Anandtech tested on the 7700K.

So you're flat out wrong, the 7700K doesn't blow ANY 6 or 8 core CPU out of the water. The only thing "HEDT" CPUs accomplish here is being much more expensive WHICH Ryzen should have an answer for in the form of their lowest end $400 (my guess) 8c/16T CPU.

lolfail9001 · Feb 2, 2017

Agent-47 said:
So? AMD has demoed an 8c16t Zen having less lag while streaming than an i7 6900K and attributed it to better multithreading gains due to its Infinity Fabric, i.e. gains from multicore and SMT.

1. Factually wrong: they were comparing it to 6700k with GPU encoding disabled.
2. They never referenced infinity fabric there.

swilli89 said:
You are being quite hyperbolic in your statement. Let's examine your own words "7700k blows HEDT processors away in game".

That statement is mostly right but requires much faster GPUs than a pitiful 980. So using AT benches is at best... useless.

Lepton87 · Feb 2, 2017

So you're flat out wrong, the 7700K doesn't blow ANY 6 or 8 core CPU out of the water. The only thing "HEDT" CPUs accomplish here is being much more expensive WHICH Ryzen should have an answer for in the form of their lowest end $400 (my guess) 8c/16T CPU

I don't disagree; I also think that the 7700K doesn't blow HEDT CPUs away, but the benchmarks you quoted are basically useless due to being GPU limited.

realibrad · Feb 2, 2017

lolfail9001 said:
1. Factually wrong: they were comparing it to 6700k with GPU encoding disabled.
2. They never referenced infinity fabric there.

That statement is mostly right but requires much faster GPUs than a pitiful 980. So using AT benches is at best... useless.

But the games were run at 1080p, which shows the games to be GPU limited. Even going to a GTX 1080 would show the same trend. Quite clearly the top batches of CPUs do not bottleneck the games. It's not "at best useless".

dahorns · Feb 2, 2017

realibrad said:
But the games were run at 1080p, which shows the games to be GPU limited. Even going to a GTX 1080 would show the same trend. Quite clearly the top batches of CPUs do not bottleneck the games. It's not "at best useless".

Wait, what? Once GPU limited, always GPU limited -- is that your position? How do we know the games would be GPU limited by a 1080?

I mean, your argument would suggest that had the benches been run on a 660m and been GPU limited, we would have no need to test with a GTX 980. That's obviously not correct.

CentroX · Feb 2, 2017

Is there anything in the Zen architecture that has an advantage over Skylake? List pros and cons vs the Skylake architecture.

realibrad · Feb 2, 2017

dahorns said:
Wait, what? Once GPU limited, always GPU limited -- is that your position? How do we know the games would be GPU limited by a 1080?

If you have a game that is GPU limited @ 1080p using a 980, then going to a higher resolution or getting a faster GPU will not make the game CPU limited. I mean, unless the game was GPU limited at like 30fps, but that would be pedantic. Saying at best it's useless is incorrect.

Agent-47 · Feb 2, 2017

lolfail9001 said:
1. Factually wrong: they were comparing it to 6700k with GPU encoding disabled.
2. They never referenced infinity fabric there.

Really?
https://youtu.be/vMfNz2SXVLk
Skip to 2:50

It says "this 8 core vs that 8 core" with a subtext labeling the Intel 6900K, and gives AMD's claimed reason as being Infinity Fabric. Yes, hardware acceleration was not available, but that's a fair thing to do since they were trying to prove a point, i.e. better multicore processing.

Now granted, Linus is not the best technical guy, but he was very emphatic in saying he was in touch with AMD regarding this.

dahorns · Feb 2, 2017

realibrad said:
If you have a game that is GPU limited @ 1080p using a 980, then going to a higher resolution or getting a faster GPU will not make the game CPU limited. I mean, unless the game was GPU limited at like 30fps, but that would be pedantic. Saying at best it's useless is incorrect.

Obviously going to a higher resolution wouldn't make the game CPU limited. You would want to go to a lower resolution. But I really don't understand your point. Are you trying to say that performance is good enough that additional increases in FPS won't matter? If so, (1) obviously one of the games is below an acceptable level for many people (<40fps), (2) your point is largely irrelevant in the context of determining which CPU will offer better performance not just now, but 4 years from now.

realibrad · Feb 2, 2017

dahorns said:
Obviously going to a higher resolution wouldn't make the game CPU limited. You would want to go to a lower resolution. But I really don't understand your point. Are you trying to say that performance is good enough that additional increases in FPS won't matter? If so, (1) obviously one of the games is below an acceptable level for many people (<40fps), (2) your point is largely irrelevant in the context of determining which CPU will offer better performance not just now, but 4 years from now.

I am not saying that we have good enough perf. What I am saying is that the charts are not useless. They show that the 7700k is not blowing away all other CPUs.
As for what will be better 4 years from now, you can only look at trends and statements. Nothing has shown a big shift in CPU usage. CPUs have not been as much of a factor as GPUs for a while now.

frozentundra123456 · Feb 2, 2017

swilli89 said:
Yes I am talking about gaming. Also talking about total performance across applications. I'm also talking about cost of ownership versus value provided. What I don't think I understand is what point you are trying to make.

The funny part about your bolded comment is you are saying HEDT processors cost more and are thus not gaming processors because they cost more. Guess what? Ryzen won't have this huge price disparity and will thus be high core count processors people may indeed buy for gaming.

You are being quite hyperbolic in your statement. Let's examine your own words "7700k blows HEDT processors away in game".

Let's use every single game Anandtech tested on the 7700K.

So you're flat out wrong, the 7700K doesn't blow ANY 6 or 8 core CPU out of the water. The only thing "HEDT" CPUs accomplish here is being much more expensive WHICH Ryzen should have an answer for in the form of their lowest end $400 (my guess) 8c/16T CPU.

Those tests that you show are clearly gpu limited. Seriously, AT, your gaming tests are worthless--a gtx 980--wtf. I am not trying to say a 7700k would or would not blow away a HEDT chip (probably fairly close either way depending on the game) but one can tell nothing about relative cpu performance from that data.

dahorns · Feb 2, 2017

realibrad said:
I am not saying that we have good enough perf. What I am saying is that the charts are not useless. They show that the 7700k is not blowing away all other CPUs.
As for what will be better 4 years from now, you can only look at trends and statements. Nothing has shown a big shift in CPU usage. CPUs have not been as much of a factor as GPUs for a while now.

Ok . . . if we moved up to a 1080 and now the 7700k improved from 74 fps to 100 fps and everything else remained the same, would that matter to you? Are you honestly suggesting that increasing the power of the GPU cannot reveal differences in CPU performance?
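The disagreement here comes down to a simple model: the frame rate you observe is capped by whichever of the CPU or GPU is slower. A minimal sketch, with made-up CPU names and caps (the 74 and 100 fps values just echo the hypothetical in the post above, they are not measurements):

```python
def observed_fps(cpu_fps_cap, gpu_fps_cap):
    """Observed frame rate under a simple 'slowest component wins' model."""
    return min(cpu_fps_cap, gpu_fps_cap)


# Hypothetical CPU-side caps: frames per second each CPU could prepare
# if the GPU were infinitely fast.
cpu_caps = {"cpu_a": 100, "cpu_b": 80}

for gpu_name, gpu_cap in (("slower_gpu", 74), ("faster_gpu", 120)):
    results = {cpu: observed_fps(cap, gpu_cap) for cpu, cap in cpu_caps.items()}
    print(gpu_name, results)
```

With the slower GPU both CPUs report 74 fps and look identical; with the faster GPU the same two CPUs separate into 100 and 80. A GPU-limited chart therefore cannot rule out a CPU gap, though it cannot prove one exists either.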

realibrad · Feb 2, 2017

dahorns said:
Ok . . . if we moved up to a 1080 and now the 7700k improved from 74 fps to 100 fps and everything else remained the same, would that matter to you? Are you honestly suggesting that increasing the power of the GPU cannot reveal differences in CPU performance?

That is the pedantic situation I was talking about. You are going to see far more gains in moving up GPU power, not CPU power. I'm saying the 7700K is not blowing away anything for gaming. That is because the move to the 1080 with a 6700k looks almost equal except maybe 1fps difference.

Abwx · Feb 2, 2017

dahorns said:
Obviously going to a higher resolution wouldn't make the game CPU limited. You would want to go to a lower resolution. .

A 980 coupled with a i7 and playing the game at 720p...?..

To summarize, one has to use unrealistic settings to show some CPUs in a good light; how they perform in real conditions doesn't matter, since this would render the most expensive CPUs moot, hence the rhetoric of GPUs being allegedly limited when the real limitation is the irrelevance of such "logic"..

frozentundra123456 said:
Those tests that you show are clearly gpu limited. .

These are CPU limited for those who want to play at 720p with a GTX 980, but then the framerate would still be more than adequate with whatever CPU...

dahorns · Feb 2, 2017

realibrad said:
That is because the move to the 1080 with a 6700k looks almost equal except maybe 1fps difference.

My whole point is that we don't know that. You can believe that. It may very well be true. But the chart provided doesn't tell us that information. It is possible, maybe not probable, but possible, that going to a 1080 significantly boosts the 7700k's performance relative to the others on the list. I like my evidence to be relevant to my conclusions.

realibrad said:
That is the pedantic situation I was talking about. You are going to see far more gains in moving up GPU power, not CPU power. I'm saying the 7700K is not blowing away anything for gaming.

That's probably true. But I'm not sure of the relevance. We aren't discussing the better use of money. We are discussing how to compare processors and what type of performance (for a processor) matters more in the gaming context.

dahorns · Feb 2, 2017

Abwx said:
A 980 coupled with a i7 and playing the game at 720p...?..

To summarize, one has to use unrealistic settings to show some CPUs in a good light; how they perform in real conditions doesn't matter, since this would render the most expensive CPUs moot, hence the rhetoric of GPUs being allegedly limited when the real limitation is the irrelevance of such "logic"..

These are CPU limited for those who want to play at 720p with a GTX 980, but then the framerate would still be more than adequate with whatever CPU...

I'm just talking about isolating CPU performance when trying to compare CPUs. I agree that from an economic perspective it is also important to consider your needs and the relative gain you'll get compared to spending more on some other component. Both sets of analysis are good to have. I don't think we have to exist in an "either . . . or" world. We can have both. We should have both.

lolfail9001 · Feb 2, 2017

Agent-47 said:
Really?

Why not use original source, then?
https://www.youtube.com/watch?v=4DEfj2MRLtA from 47:00 or about that.
On one side they had a 6900K that did identically to the Ryzen rig, and on the other they had a 6700K that lagged. Infinity Fabric was never mentioned.

realibrad said:
But the games were run at 1080p, which shows the games to be GPU limited. Even going to a GTX 1080 would show the same trend. Quite clearly the top batches of CPUs do not bottleneck the games. It's not "at best useless".

They do bottleneck the games, but you need to go to at least a GTX 1080 in most games at 1080p to see that happen. When that happens, 6700/7700 + fast memory do tend to beat out HEDT. Now, the practical usability of that is questionable, but that does not invalidate the statement itself. Plus, that tends to rear its head more clearly with minimum framerates; see Fallout 4 for reference.
