AMD Zen - Key Dates and Information

Page 9

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
So you add 408/512GB/sec memory bandwidth for something whose TDP would at best use 80-100GB/sec?

And the mobile one? Oh that was funny!

Sounds like another fairy-tale-and-unicorns product.

You only need a single stack of HBM2 with 2/4/8GB and only 128GB/s to 256GB/s of bandwidth.
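For reference, a back-of-envelope check of that range (assuming the usual 1024-bit bus per HBM2 stack and 1.0-2.0 Gbps per pin; these are generic HBM2 assumptions, not figures from any AMD spec):

```python
# Back-of-envelope peak bandwidth of one HBM2 stack: bus width * pin rate / 8.
# Assumed values: 1024-bit bus per stack, 1.0-2.0 Gbps per pin (illustrative).
BUS_WIDTH_BITS = 1024

def stack_bandwidth_gbs(pin_rate_gbps):
    """Peak bandwidth of a single HBM2 stack in GB/s."""
    return BUS_WIDTH_BITS * pin_rate_gbps / 8

for rate in (1.0, 1.6, 2.0):
    print(f"{rate} Gbps/pin -> {stack_bandwidth_gbs(rate):.0f} GB/s per stack")
# 1.0 Gbps/pin -> 128 GB/s, 2.0 Gbps/pin -> 256 GB/s, i.e. the range above.
```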

 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
So you add 408/512GB/sec memory bandwidth for something whose TDP would at best use 80-100GB/sec?

And the mobile one? Oh that was funny!

Sounds like another fairy-tale-and-unicorns product.
Previous GPU generations were designed around a specific amount of bandwidth. That's why everybody called GPU memory a memory pool. With Vega, AMD changed the approach: HBM is used as a cache, and the more bandwidth it has, the better the results.

Secondly, the memory will not be used only by the GPU but, in case you missed it, also by the CPU. AMD designed Zen with APUs in mind. You should've done a better job of researching and analysing the architecture of both AMD products.

End of off-topic.
 
Reactions: raghu78

ariknowsbest

Junior Member
Jun 20, 2016
12
0
6
Moving away from games - if AMD can build an APU that can function sufficiently for CATIA/AutoCAD/Maya etc., then professionals will eat it up. If they could build a hardware+network+software* solution that designers could use during the day and analysts could use for distributed HPC overnight, they'd be onto a massive winner.
Even the current Pro series APUs run rather well with multitasking. I have 2-3 heavy financial/analytics programs up on one screen and Excel, Outlook, Skype, and a browser with 5 tabs on the other. I first assumed it was an i5 but later found out it was actually an A10 Pro underneath.

Zen can become popular in enterprise; after all, business laptops tend to have a short lifespan as they get lost/stolen/dropped...
 

KTE

Senior member
May 26, 2016
478
130
76
Even the current Pro series APUs run rather well with multitasking. I have 2-3 heavy financial/analytics programs up on one screen and Excel, Outlook, Skype, and a browser with 5 tabs on the other. I first assumed it was an i5 but later found out it was actually an A10 Pro underneath.

Zen can become popular in enterprise; after all, business laptops tend to have a short lifespan as they get lost/stolen/dropped...
Enterprise hardware lasts 3-5 years, minimum. With laptops, more like 5-7.

I don't mean SMBs. They're far smaller.

Then, after that, a company tends to pass older hardware on to its partners.

In my experience (government sector work), previous AMD APUs with 4GB of RAM run such programs really poorly in enterprise environments, specifically Excel-based ones.

And with one complex macro stacked on another, even a desktop Ivy Bridge with 16GB can grind to a halt!

I have ~40 odd machines currently using AMD APUs, and about 30 laptops at work (at my site).

You have to bear in mind the amount of background junk on enterprise systems for security, accessibility and control, and also that users rarely restart the systems!

I haven't tried A10 desktop machines for work, so I can't comment.

For CNC/CATIA/CAD use with an APU, you need 6 cores at minimum; 4 cores will struggle in professional environments (I worked as a fighter-aircraft engineer for some years).

Sent from HTC 10
(Opinions are own)
 

imported_jjj

Senior member
Feb 14, 2009
660
430
136
What makes the most sense for an "APU" would be a CPU die, 1 or 2 GPU dies and 1 memory stack per GPU, but not with the current silicon interposer.

So let's say on 7nm you would have a 60-80mm² quad- or hexa-core CPU die, and each GPU slice would be similar to the CPU slice in size (let's say 1024 Navi "cores").
But they need to sort out the software to use dual GPUs well, and the interconnect between the 2 GPUs, something they might be able to do by 2018.
A silicon interposer and HBM with TSVs are too costly. A cheaper solution like Intel's silicon bridge or an organic interposer plus budget HBM would work much better, and that's likely by the time 7nm is available from GloFo.
That way they would have much better yields and save a lot on development costs; they could use 1, 2 or no GPU slices in different SKUs and address a very wide market. If the CPU can access the HBM, great, the OEM can save on DRAM costs.
If you build an APU (monolithic CPU+GPU) with HBM, you kind of waste the potential offered by advanced packaging while spending quite a bit on such a solution.
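To put rough numbers on the SKU idea (the 60-80mm² slice sizes are this post's speculation, so the midpoints below are purely illustrative):

```python
# Rough logic-silicon totals for the modular "CPU die + N GPU slices" idea.
# Slice areas are the speculative 7nm figures from the post, not real dies.
CPU_SLICE_MM2 = 70   # quad/hexa-core CPU die, midpoint of the guessed 60-80 mm^2
GPU_SLICE_MM2 = 70   # ~1024-"core" Navi slice, assumed similar in size
STACKS_PER_GPU = 1   # one HBM stack per GPU slice

def sku_silicon_mm2(gpu_slices):
    """Total logic die area; HBM stacks and package are not counted."""
    return CPU_SLICE_MM2 + gpu_slices * GPU_SLICE_MM2

for gpus in (0, 1, 2):
    print(f"CPU + {gpus} GPU slice(s) + {gpus * STACKS_PER_GPU} HBM stack(s): "
          f"~{sku_silicon_mm2(gpus)} mm^2 of logic silicon")
```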
 
Last edited:

itsmydamnation

Platinum Member
Feb 6, 2011
2,868
3,419
136
So just eyeballing 28nm bulk, it looks like a single HBM interface takes up around half the area of a 64/72-bit DDR3 interface. If we assume the same scaling on 14nm for the two interface types, then with the changes AMD has made with Vega I can totally see the consumer APU having a single HBM interface that gets enabled or disabled depending on packaging. Assuming a ~160mm² APU, that one-stack HBM interface would only be ~2-3% of the die.
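Same eyeballing as arithmetic, for what it's worth (the PHY areas below are assumptions chosen to be consistent with the ~2-3% guess, not measurements):

```python
# Fraction of a ~160 mm^2 APU die taken up by one HBM PHY.
# The PHY areas below are assumptions, not measured values.
DIE_AREA_MM2 = 160.0

def phy_fraction(phy_area_mm2):
    return phy_area_mm2 / DIE_AREA_MM2

for phy_mm2 in (3.5, 5.0):
    print(f"{phy_mm2} mm^2 PHY -> {phy_fraction(phy_mm2):.1%} of a "
          f"{DIE_AREA_MM2:.0f} mm^2 die")
# 3.5-5 mm^2 lands in the ~2-3% range estimated above.
```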
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Previous GPU generations were designed around a specific amount of bandwidth. That's why everybody called GPU memory a memory pool. With Vega, AMD changed the approach: HBM is used as a cache, and the more bandwidth it has, the better the results.

Secondly, the memory will not be used only by the GPU but, in case you missed it, also by the CPU. AMD designed Zen with APUs in mind. You should've done a better job of researching and analysing the architecture of both AMD products.

End of off-topic.

There is no real change besides just renaming it to cache. Vega still sits with 16GB/sec to main memory.

It's no different from how GP100, Xeon Phi etc. work. However, unlike the other two, Xeon Phi gets 102GB/sec to main memory.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,868
3,419
136
There is no change besides just renaming it to cache. Vega still sits with 16GB/sec to main memory.

It's no different from how GP100, Xeon Phi etc. work. However, unlike the other two, Xeon Phi gets 102GB/sec to main memory.
Sigh, except for when it's on an interposer or in an SoC, LIKE WHAT WE ARE TALKING ABOUT!

Also, got any proof that Vega is only an x8 PCIe Gen 3 device?

Also, you might want to tell people like sebbi @ B3D that they don't know what they're talking about, OK? Don't you get tired of this?


edit: a quote for good measure

Most awesome news in a long time. Someone is finally talking about games and fine-grained automated memory paging from CPU memory. Hopefully Nvidia follows suit. The professional ($5000+) Pascal P100 already supports this in CUDA. Link: http://www.techenablement.com/key-aspects-pascal-commercial-exascale-computing/. Now we just need consumer NV GPU support and graphics API support.

Future looks bright
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Sigh, except for when it's on an interposer or in an SoC, LIKE WHAT WE ARE TALKING ABOUT!

Also, got any proof that Vega is only an x8 PCIe Gen 3 device?

Also, you might want to tell people like sebbi @ B3D that they don't know what they're talking about, OK? Don't you get tired of this?


edit: a quote for good measure

What does an interposer change in this matter? And who ever mentioned PCIe 3.0 x8?
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,868
3,419
136
What does an interposer change in this matter? And who ever mentioned PCIe 3.0 x8?
You did, by saying it only has 16GB/s to main memory (or 2.0 x16, take your pick, because a Zen APU will have far more memory bandwidth than 16GB/s). As for an interposer, GMI can be used to scale up bandwidth as needed.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
You did, by saying it only has 16GB/s to main memory (or 2.0 x16, take your pick, because a Zen APU will have far more memory bandwidth than 16GB/s). As for an interposer, GMI can be used to scale up bandwidth as needed.

Now you're talking about some fabled APU again? The one from the vague slides 2 years ago that hasn't been heard from since?

PCIe 1.0 250MB/sec per lane.
PCIe 2.0 500MB/sec per lane.
PCIe 3.0 1000MB/sec per lane.

Vega 10 doesn't have GMI, that's a Vega 20 feature.
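For reference, those per-lane figures scaled up to full links (theoretical peaks; the per-direction versus aggregate distinction is what the rest of this argument hinges on):

```python
# Peak PCIe link bandwidth from the per-lane figures above.
PER_LANE_GBS = {"1.0": 0.25, "2.0": 0.5, "3.0": 1.0}  # per lane, per direction

def link_bandwidth_gbs(gen, lanes, aggregate=False):
    """Peak bandwidth in GB/s; aggregate=True counts both directions."""
    bw = PER_LANE_GBS[gen] * lanes
    return bw * 2 if aggregate else bw

print(link_bandwidth_gbs("3.0", 16))                  # 16 GB/s, one direction
print(link_bandwidth_gbs("3.0", 16, aggregate=True))  # 32 GB/s, both directions
print(link_bandwidth_gbs("2.0", 16))                  # 8 GB/s, one direction
```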



 
Reactions: CHADBOGA

itsmydamnation

Platinum Member
Feb 6, 2011
2,868
3,419
136
Now you're talking about some fabled APU again? The one from the vague slides 2 years ago that hasn't been heard from since?

PCIe 1.0 250MB/sec per lane.
PCIe 2.0 500MB/sec per lane.
PCIe 3.0 1000MB/sec per lane.

Vega 10 doesn't have GMI, that's a Vega 20 feature.
Per lane, PER DIRECTION. Now, what does a cache do...?

Also, nothing on those slides says Vega 10 doesn't have GMI, or that it won't appear on a server APU. They just say Vega 20 has xGMI, a name that had never been seen until those posts.

You also completely ignore the on-SoC component, which is what we have been talking about, you know, a potential Raven Ridge config. But keep trying to move those goalposts; anything to derp-derp AMD, hey?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Per lane, PER DIRECTION. Now, what does a cache do...?

Also, nothing on those slides says Vega 10 doesn't have GMI, or that it won't appear on a server APU. They just say Vega 20 has xGMI, a name that had never been seen until those posts.

You also completely ignore the on-SoC component, which is what we have been talking about, you know, a potential Raven Ridge config. But keep trying to move those goalposts; anything to derp-derp AMD, hey?

So 16GB/sec is correct. You wouldn't claim your 1Gbit connection is 2Gbit either, would you?

Raven Ridge is a consumer APU.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Take this with a grain of salt, but a YouTuber, "RedGamingTech", has said the clocks have been leaked.

https://www.youtube.com/watch?v=fLIhsr7q9UA&t=232s

3.6 base/4.0 Turbo


Same video, on reddit

https://www.reddit.com/r/gaming/comments/5mdo2z/ryzen_clock_speed_leaked_out_base_clock_36_ghz/
Source: @CPCHardware on Twitter: https://twitter.com/CPCHardware/status/817044837358780416

This was part of our conversation about the Hardwareluxx CES discovery (3.6/3.9 RyZen ES), which had an "F3" OPN code for the stepping (likely B0), while Canard has/saw an "F4" (B1?).
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,868
3,419
136
So 16GB/sec is correct. You wouldn't claim your 1Gbit connection is 2Gbit either, would you?
Not when you're talking about storing data the way a cache does; then you talk about totals. We talk about maximum throughput when we talk about memory speeds or cache throughput. You can achieve 32GB/s of throughput between the HBM on a PCIe 3.0 x16 device and main memory.

But the really funny thing is that Knights Landing has 2x16 + 1x4 PCIe Gen 3 lanes in the best case and 1x16 in the worst, so your numbers aren't even close to correct (surprise, surprise) (http://www.hotchips.org/wp-content/.../HC27.25.710-Knights-Landing-Sodani-Intel.pdf)
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Not when you're talking about storing data the way a cache does; then you talk about totals. We talk about maximum throughput when we talk about memory speeds or cache throughput. You can achieve 32GB/s of throughput between the HBM on a PCIe 3.0 x16 device and main memory.

But the really funny thing is that Knights Landing has 2x16 + 1x4 PCIe Gen 3 lanes in the best case and 1x16 in the worst, so your numbers aren't even close to correct (surprise, surprise) (http://www.hotchips.org/wp-content/.../HC27.25.710-Knights-Landing-Sodani-Intel.pdf)

It's still 16GB/sec, no matter how much you try to twist it.

And Xeon Phi has 6 DDR4 channels attached directly.
http://ark.intel.com/products/94033/Intel-Xeon-Phi-Processor-7210-16GB-1_30-GHz-64-core
It doesn't use the PCIe lanes for memory access.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
There is no real change besides just renaming it to cache. Vega still sits with 16GB/sec to main memory.

It's no different from how GP100, Xeon Phi etc. work. However, unlike the other two, Xeon Phi gets 102GB/sec to main memory.
Bandwidths like 128GB/s from the HBM stack should be enough for a DT/NB APU (not Snowy Owl). But it wouldn't be a renamed frame buffer (which seems to be the wrong reading based on the Vega info), but truly a cache (which can be done with Xeon Phi too), with cache lines, tags, prefetches, and dedicated, transparent management.

Working as a cache not only reduces the actual memory requirements (they've shown ~50% utilization for 2 typical games), but also reduces first-hit latency given good prefetching.
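A tiny model of that point (the demand bandwidth and hit rates below are made up for illustration, not Vega measurements):

```python
# With HBM acting as a cache, only misses have to cross the link to system DRAM,
# so the bandwidth demanded of main memory shrinks with the hit rate.
def system_dram_bw_needed(demand_gbs, hit_rate):
    """Bandwidth required from system DRAM given the cache hit rate."""
    return demand_gbs * (1.0 - hit_rate)

DEMAND_GBS = 128.0  # assumed bandwidth the GPU pulls from its HBM cache
for hit_rate in (0.5, 0.9, 0.95):
    need = system_dram_bw_needed(DEMAND_GBS, hit_rate)
    print(f"hit rate {hit_rate:.0%}: ~{need:.0f} GB/s needed from system DRAM")
```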

Not to forget (old, but an indication): a 50GB/s bus (not 16) and 128GB/s from an HBM stack:


Likely config for Snowy Owl:


It might even look like this:
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Bandwidths like 128GB/s from the HBM stack should be enough for a DT/NB APU (not Snowy Owl). But it wouldn't be a renamed frame buffer (which seems to be the wrong reading based on the Vega info), but truly a cache (which can be done with Xeon Phi too), with cache lines, tags, prefetches, and dedicated, transparent management.

Working as a cache not only reduces the actual memory requirements (they've shown ~50% utilization for 2 typical games), but also reduces first-hit latency given good prefetching.

Not to forget (old, but an indication): a 50GB/s bus (not 16) and 128GB/s from an HBM stack:


Likely config for Snowy Owl:


It might even look like this:
Taking this into account, how likely do you think it is for Raven Ridge to include some form of L4 cache? Perhaps not HBM, but something more akin to Crystalwell.
 

beginner99

Diamond Member
Jun 2, 2009
5,223
1,598
136
It's still 16GB/sec, no matter how much you try to twist it.

And Xeon Phi has 6 DDR4 channels attached directly.
http://ark.intel.com/products/94033/Intel-Xeon-Phi-Processor-7210-16GB-1_30-GHz-64-core
It doesn't use the PCIe lanes for memory access.

We aren't talking about PCIe cards. OMG. An APU can use 1 stack of HBM2 as a cache plus DDR4. In the case of an AMD APU, probably only 2 channels, but these APUs won't compete with KL anyway. So no, it won't be limited by PCIe bandwidth.

Also, given the uArch overview of Vega with the cache controller, and the fact that it can address terabytes of memory, I strongly assume it can also access DDR4 directly. So something similar to a KL PCIe card could be possible.
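For scale, dual-channel DDR4 on its own (generic DDR4 numbers, not a confirmed Raven Ridge spec):

```python
# Theoretical peak of a DDR4 configuration: channels * 8 bytes * transfer rate.
def ddr4_peak_gbs(channels, mt_per_s):
    """Peak bandwidth in GB/s for 64-bit (8-byte) DDR4 channels."""
    return channels * 8 * mt_per_s / 1000

print(ddr4_peak_gbs(2, 2400))  # 38.4 GB/s, dual-channel DDR4-2400
print(ddr4_peak_gbs(2, 2666))  # ~42.7 GB/s, dual-channel DDR4-2666
# Either figure is well above a PCIe x16 link, before the HBM2 stack is counted.
```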
 