AMD “Next Horizon Event” Thread


Abwx

Lifer
Apr 2, 2011
11,162
3,858
136
This reflects a 20% clock speed increase at the same TDP (300W for both cards).

The original Zen on GloFo 14nm tops out at about 4.0 to 4.1 GHz (the "12nm" process, which is really a tweaked and refined 14nm, can do a few hundred MHz above that). Interpreting the process gains based on what we've seen announced with Vega so far, this would indicate to me that we're probably looking at peak boost clocks of 4.8 to 5.0 GHz for consumer-focused Ryzen 3000 products.

This can't be transposed; at 4 GHz+ it won't scale the same. We can expect 250-300 MHz higher frequencies, but 10% seems a stretch.

It's likely that AMD anticipated years ago that they would be lacking process-wise, and that the only means they had to compensate was higher density, allowing more transistors to improve IPC and throughput, hence the big pushes they planned in this area.

We'll see how things materialize in the desktop/notebook environment, but if anything the benchmark they displayed points to very healthy gains in FP.
 

DownTheSky

Senior member
Apr 7, 2013
787
156
106
We don't know what the shipping clocks for Zen 2 products will be, but we do have some figures released for Vega. The existing Radeon MI25 (Vega 10 @ GloFo 14nm) has a peak boost clock of 1500 MHz. The upcoming Radeon MI60 (Vega 20 @ TSMC 7nm), announced today, has a peak boost clock of 1800 MHz. This reflects a 20% clock speed increase at the same TDP (300W for both cards).

That's not counting the memory: the old card has 16 GB, the new one 32 GB of higher-clocked memory.
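For reference, the back-of-envelope arithmetic behind that 20% figure, and the naive extrapolation to Zen quoted further up (naive because it assumes the GPU clock gain transfers one-to-one to CPU boost clocks, which is exactly the step disputed above):

```python
# Vega 10 vs Vega 20 peak boost clocks (MHz), both rated at 300 W TDP.
mi25_clock = 1500   # Radeon Instinct MI25, Vega 10 @ GloFo 14nm
mi60_clock = 1800   # Radeon Instinct MI60, Vega 20 @ TSMC 7nm
gain = mi60_clock / mi25_clock - 1
print(f"Clock gain at iso-TDP: {gain:.0%}")                      # 20%

# The naive transfer of that gain to Zen (the contested step):
zen1_boost = 4.1    # GHz, roughly where 14nm/12nm Zen tops out
print(f"Naive Zen 2 boost: {zen1_boost * (1 + gain):.2f} GHz")   # ~4.9 GHz
```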
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
So what are the predictions on memory latency (and thus gaming performance)? How can AMD claim better latency with a de-integrated memory controller? An L4 cache (/wave Broadwell)?
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,860
3,407
136
So what are the predictions on memory latency (and thus gaming performance)? How can AMD claim better latency with a de-integrated memory controller? An L4 cache (/wave Broadwell)?
They could have a much improved cache/home agent/directory controller. That is one of the weaker areas in Zeppelin.
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
So what are the predictions on memory latency (and thus gaming performance)? How can AMD claim better latency with a de-integrated memory controller? An L4 cache (/wave Broadwell)?

As I pointed out elsewhere, the time (on Zen 1/Zen+) to get data from a foreign CCX to the local CCX via IF is still an order of magnitude less than the time taken to get data from DRAM.

In Zen 2, it is very likely that the links from the CCX to the IO controller are no longer tied to MEMCLK, which means lower latency, so that difference (between going direct vs. going indirect) decreases further.


They have also stated they intend to support higher memory speeds. Given this presentation was all about Rome, I can only assume their statement is in the context of EPYC, i.e. server ECC memory, so an increase from 2933 to >3000 should see latency from the IO controller to DRAM drop. A 10% improvement there would dwarf any loss due to the extra trip from CCX to IO controller.
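To put a number on just the memory-speed part of that (the DDR4 speeds below are example points, since Rome's official speeds have not been disclosed): if the portion of the path that runs at MEMCLK keeps the same cycle count, its latency scales inversely with the transfer rate.

```python
# Latency scaling of the MEMCLK-bound portion of the path when the supported
# DIMM speed rises (example speeds only; Rome's official numbers are unknown).
old_mts, new_mts = 2933, 3200
reduction = 1 - old_mts / new_mts
print(f"MEMCLK-bound latency reduction: {reduction:.1%}")   # ~8.3%
```

Whether that offsets the extra chiplet-to-IO-die hop depends entirely on how large that hop turns out to be, which AMD has not disclosed.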
 

arandomguy

Senior member
Sep 3, 2013
556
183
116
Official memory support for current-gen EPYC is DDR4-2400/2666.

https://www.amd.com/system/files/2017-06/AMD-EPYC-Data-Sheet.pdf

Were they specific about better memory latency? Better memory latency for Rome vs. Naples does not necessarily mean the statement will apply to other platforms.

A general thing I've noticed with Rome predictions/theorizing is that many people seem to be interpreting information about Rome through a desktop-centric lens.

I'm not predicting it either way, but I don't see why the idea that Zen 2 (at least in this form) might be primarily targeted at a different workload than desktop/gaming should be dismissed as a possibility. Intel's Skylake vs. Skylake-E differentiation, for example, is clearly optimization for different workloads. Likewise, don't be dismissive of the idea that desktop Zen 2 might diverge more; the Zen APU itself had some divergence, for example.
 

Gideon

Golden Member
Nov 27, 2007
1,708
3,919
136
IMO AMD has plenty of room to improve memory latency. Sure, chiplets will degrade it somewhat, but Ryzen's current latency is already 20+ ns worse than Intel's, despite being monolithic. Whatever latency they lose by going MCM, they could certainly gain back just by improving the memory controller and decoupling the Fabric clock from the memory clock (or even just allowing faster memory, say 4000 MHz+). An L4 cache might help somewhat in addition to that.
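On the "just allow faster memory" part, the DIMM-side arithmetic alone is easy to sketch (the CL values below are typical retail kits, not anything AMD has announced); the catch is that most of that 20+ ns gap is on the SoC side (fabric/controller), since both platforms can use the same DIMMs.

```python
# First-word (CAS) latency in ns: CL cycles at the DRAM command clock (MT/s / 2).
def cas_ns(mts, cl):
    return cl / (mts / 2) * 1000   # cycles / MHz = microseconds, then to ns

print(f"DDR4-3200 CL16: {cas_ns(3200, 16):.1f} ns")   # 10.0 ns
print(f"DDR4-4000 CL18: {cas_ns(4000, 18):.1f} ns")   # 9.0 ns
```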
 

Gideon

Golden Member
Nov 27, 2007
1,708
3,919
136
Hmm, ServeTheHome has a few interesting tidbits (and they are usually very accurate):

1. The I/O chip will handle all I/O, including PCIe (in fact no NVLink required for any GPUs connected to a single socket)
2. Infinity Fabric speeds greatly improved (probably uses PCIe Gen4 under the hood)
3. Rumors and hints from AMD about increased clock speeds

From the highlighted parts below (but I suggest reading the entire article):
AMD EPYC 2 Rome Details
Here is the quick summary of what we learned today about the AMD EPYC 2 “Rome” generation:
  • Up to eight 7nm x86 compute chiplets per socket.
  • Each x86 chiplet up to 8 cores
  • 64 cores confirmed (see: AMD EPYC Rome Details Trickle Out 64 Cores 128 Threads Per Socket)
  • There is a 14nm I/O chip in the middle of each package
  • This I/O chip will handle DDR4, Infinity Fabric, PCIe and other I/O
  • PCIe Gen4 support providing twice the bandwidth of PCIe Gen3
  • Greatly improved Infinity Fabric speeds to be able to handle the new I/O chip infrastructure including memory access over Infinity Fabric
  • Ability to connect GPUs and do inter-GPU communication over the I/O chip and Infinity Fabric protocol so that one does not need PCIe switches or NVLink switches for chips on the same CPU. We covered the current challenges in: How Intel Xeon Changes Impacted Single Root Deep Learning Servers. This can be a game changer for GPU and FPGA accelerator systems.
  • Socket compatible with current-generation AMD EPYC “Naples” platforms.
  • Although not confirmed by AMD, we will state that most if not all systems will need a PCB re-spin to handle PCIe Gen4 signaling. So existing systems can get Rome with PCIe Gen3 but will require higher-quality PCB for PCIe Gen4.
  • Claimed significant IPC improvements and twice the floating point performance per core.
  • Incrementally improved security per core including new Spectre mitigations

This is a long list. We now have a fairly good idea about what the next-generation will offer. Cache sizes, fabric latencies, clock speeds, I/O chip performance, DDR4 speeds and other aspects have not been disclosed, so there is still a long way to go until we have a full picture. We have heard rumors of, and AMD hinted at the notion that with 7nm they would be able to get increased clock speeds as well.
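On the "twice the bandwidth of PCIe Gen3" bullet above: the per-lane arithmetic is simple, since Gen3 and Gen4 use the same 128b/130b encoding and Gen4 just doubles the signaling rate.

```python
# PCIe per-lane throughput after 128b/130b encoding.
def lane_gb_s(gt_per_s):
    return gt_per_s * (128 / 130) / 8   # GT/s -> GB/s per lane

for gen, rate in (("Gen3", 8), ("Gen4", 16)):
    print(f"PCIe {gen}: {lane_gb_s(rate):.2f} GB/s per lane, "
          f"{lane_gb_s(rate) * 16:.1f} GB/s for an x16 slot")
```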
 

exquisitechar

Senior member
Apr 18, 2017
666
904
136
IIRC Charlie Demerjian has said on Twitter that the turbo on a few cores will be significantly higher than before for Ryzen 3xxx CPUs. I wonder about final 64c Epyc 2 clocks, the one that they benchmarked probably wasn't clocked all that high.
 
Reactions: TheGiant and JimmyH

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Why not just use the same 8c chiplets for everything?
I mean, you have the current 14nm APU to cover the mobile low end and current Zen+ for the initial desktop low end.
They'd just need different 14nm IO dies, e.g. with and without a GPU, if a GPU is needed at all and they don't glue it on as well.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
IIRC Charlie Demerjian has said on Twitter that the turbo on a few cores will be significantly higher than before for Ryzen 3xxx CPUs. I wonder about final 64c Epyc 2 clocks, the one that they benchmarked probably wasn't clocked all that high.
This is what I am looking for and what's missing in current Zen implementations.
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
IIRC Charlie Demerjian has said on Twitter that the turbo on a few cores will be significantly higher than before for Ryzen 3xxx CPUs. I wonder about final 64c Epyc 2 clocks, the one that they benchmarked probably wasn't clocked all that high.

Having a couple or a few highly binned chiplets for Xtreme boost is a nice option that this design gives them.

The clocks were almost certainly lower than final release. I think it was Papermaster who might have even said that this was a prototype package? Is it even possible to run benchmarks on a prototype?
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
Why not just use the same 8c chiplets for everything?
I mean, you have the current 14nm APU to cover the mobile low end and current Zen+ for the initial desktop low end.
They'd just need different 14nm IO dies, e.g. with and without a GPU, if a GPU is needed at all and they don't glue it on as well.

This is exactly what I expect them to do.

(i) Mask costs for 7nm are extremely high.
(ii) Limited 7nm wafer capacity (given expected demand).
(iii) AMD have indicated that I/O does not scale well to the smaller process and that its performance is not very sensitive to the smaller process.
(iv) Reduced resources (all three) needed to qualify new parts.
(v) More flexibility in how they use chiplets to meet demand (depending on where market yield is greatest they can adjust package ratios).
 
Reactions: DarthKyrie

Despoiler

Golden Member
Nov 10, 2007
1,966
770
136
Why not just use the same 8c chiplets for everything?
I mean, you have the current 14nm APU to cover the mobile low end and current Zen+ for the initial desktop low end.
They'd just need different 14nm IO dies, e.g. with and without a GPU, if a GPU is needed at all and they don't glue it on as well.

I was just thinking about this in combination with the 1x vs 2x CCX question. If AMD keeps the same overall strategy for consumer and enterprise, we would have the 8-core, 1x CCX die as the base chip. They can bin or fuse off cores to get lower core counts, sure. Depending on how cheap the chiplet strategy is, 8 cores could possibly be the lowest core count offered. I think it gives AMD an advantage to continue ramping up core counts, because it basically starves Intel. Intel is stuck on process: the bigger the chips they have to pump out, the fewer chips they can make and the less profit they can reap. The other cool thing is that if 8 cores is the new norm, something bigger like 16 cores has to take its place on the high-end desktop. Can you imagine what devs could do with an extra 8-core chip? A new best AI mode. Run actual simulations. It would be completely new territory. Probably dreaming. Hahaha
 

coercitiv

Diamond Member
Jan 24, 2014
6,378
12,768
136
I think it gives AMD an advantage to continue ramping up core counts, because it basically starves Intel. Intel is stuck on process.
Let's not forget this mainstream consumer product needs to keep cost down and also work efficiently with dual channel memory. The more chiplets they use, the bigger the size of the IO chip, and the higher the cost of the entire package.
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
Let's not forget this mainstream consumer product needs to keep cost down and also work efficiently with dual channel memory. The more chiplets they use, the bigger the size of the IO chip, and the higher the cost of the entire package.

Yep, but a base 8C product does not need anything beyond dual channel memory.
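For a rough feel of what dual channel gives an 8C or 16C part (DDR4-3200 is just an example speed, not a platform spec):

```python
# Peak dual-channel DDR4 bandwidth and the per-core share.
channels, bytes_per_channel, mts = 2, 8, 3200   # two 64-bit channels, example speed
peak_gb_s = channels * bytes_per_channel * mts / 1000
for cores in (8, 16):
    print(f"{cores} cores: {peak_gb_s:.1f} GB/s total, {peak_gb_s / cores:.1f} GB/s per core")
```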
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,360
136
Is 3 seconds a big deal? I honestly don't know.

Since those CPUs are for servers, the first thing we are interested in is how much power each system used to finish the C-Ray benchmark, and secondly how much space each system takes up, because rack space is gold in server rooms.

So if EPYC 2 is 10% faster than dual Xeons while using 30-40% less power and can fit the same number of cores in half the rack space, then we are talking about a major advantage for the AMD product vs. the competition.
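Putting rough numbers on that hypothetical (the 10% and 30-40% figures are the scenario from the post, not measured results):

```python
# Hypothetical: EPYC 2 is 10% faster than a dual-Xeon box at 30-40% less power
# and fits the same core count in half the rack space.
perf_ratio = 1.10
for power_saving in (0.30, 0.40):
    print(f"{power_saving:.0%} less power -> {perf_ratio / (1 - power_saving):.2f}x perf per watt")
print(f"Half the rack space -> {perf_ratio / 0.5:.1f}x performance per rack unit")
```

Real comparisons would of course need measured wall power and actual node sizes.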
 

ub4ty

Senior member
Jun 21, 2017
749
898
96
Hmm, ServeTheHome has a few interesting tidbits (and they are usually very accurate):

1. The I/O chip will handle all I/O, including PCIe (in fact no NVLink required for any GPUs connected to a single socket)

From the highlighted parts below (but I suggest reading the entire article):
  • Ability to connect GPUs and do inter-GPU communication over the I/O chip and Infinity Fabric protocol so that one does not need PCIe switches or NVLink switches for chips on the same CPU. We covered the current challenges in: How Intel Xeon Changes Impacted Single Root Deep Learning Servers. This can be a game changer for GPU and FPGA accelerator systems

A single-root PCIe complex on such a massive core count CPU is a YUGE game changer indeed.
Currently with Threadripper, you have a PCIe root complex per CPU die, which causes issues with various applications involving GPU<->GPU transfers.
This problem is further compounded by added latency.

With a single I/O chip handling all of the I/O, including PCIe, the scalability and performance potential is massive.
Looking forward to Zen 2 Threadripper!
Glad I got the first gen and rode it out to gen 2.

Looks like a sell and upgrade in 2019/2020, with some units retiring to servers.

Also of note is that Intel broke the single PCIe root complex paradigm with their new line.
Also of note is the coded jab at Nvidia w.r.t. the new Radeon GPUs being able to run the Infinity Fabric protocol over their PCIe 4.0 lanes. AMD truly went for an open and scalable approach, and it's beginning to pay off bigly!
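For anyone who wants to see the per-die root complex issue on their current box, a minimal sketch (assuming PyTorch with CUDA is installed; this just reports what the driver exposes and says nothing about Rome itself):

```python
import torch

# Check which GPU pairs can do direct peer-to-peer transfers.  On systems where
# GPUs hang off different PCIe root complexes (e.g. one per die or per socket),
# some pairs typically report False and traffic falls back through host memory.
n = torch.cuda.device_count()
for a in range(n):
    for b in range(n):
        if a != b:
            ok = torch.cuda.can_device_access_peer(a, b)
            print(f"GPU{a} -> GPU{b}: peer access {'yes' if ok else 'no'}")
```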
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
If base is 8c, what's at the $300 price point?

The same 8C chiplet, right down to the basement. They might not even offer a 4-core Zen 2 product (unless harvesting duds, I guess).

7nm design cost is ~300M USD and cost per unit is pretty much flat going from 12nm to 7nm - so it's as cheap to make the I/O on 12nm as it is to incorporate it onto 7nm.

AMD had revenue of around 1.8B USD in Q2 2018 - how much of that would have been from Ryzen 3? Judging by Mindfactory proportions, very little.

Basically, I don't believe the manufacturing cost savings of a smaller dedicated die would be large enough to justify its design cost.


https://www.extremetech.com/computing/272096-3nm-process-node
https://www.icknowledge.com/news/Technology and Cost Trends at Advanced Nodes - Revised.pdf
https://www.pcgamesn.com/wp-content/uploads/2018/09/Mindfactory-AMD-vs-Intel-580x326.jpg
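The shape of that break-even argument, with placeholder numbers (the ~$300M figure is the one cited above; the per-unit saving is a pure assumption for illustration):

```python
# Units a dedicated low-end 7nm die would need to ship before its design cost
# is paid back by per-package savings.  Both numbers are placeholders.
design_cost = 300e6       # USD, rough 7nm design cost cited above
saving_per_unit = 10.0    # USD saved per package vs reusing the 8C chiplet + IO die (assumed)
print(f"Break-even volume: {design_cost / saving_per_unit / 1e6:.0f} million units")
```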
 