Vega/Navi Rumors (Updated)

Page 122
Status
Not open for further replies.

Krteq

Senior member
May 22, 2015
993
672
136
The first slide here is talking about packed math instructions (and increased clock speeds), which basically means packing two FP16 operations into one FP32 SP. Such an instruction can be an FMA, which counts as two FLOPS, so in other words you then get a total of 4 FP16 FLOPS per SP per clock. NCU does indeed have double the FLOPS rate of GCN when it comes to FP16, but not for FP32.
Nope, that slide isn't talking about packed math. That slide is about the higher IPC of the NCU compared to the CU - see the whole presentation and endnotes -> Exclusive: AMD VEGA Presentation
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Just a question to CatMerc and antihelten:

In that case, what did they mean by this slide? Are they referring to SP ops/sec throughput with or without FMA?


End notes:


Isn't that what I just described?
It either means higher performance per core per clock due to various improvements like scheduling and cache improvements, or it's referring to the FP16 packed math, resulting in double the IPC.
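To put rough numbers on the packed-math reading, here is a minimal C sketch. It assumes each FP32 lane issues one FMA per clock (2 FLOPS) and that packed FP16 simply doubles the values processed per lane. The function name and the 4096-lane figure are hypothetical; the 1600 MHz clock is just the driver-log number mentioned in this thread, not a confirmed spec.

```c
#include <assert.h>
#include <stdint.h>

/* One FMA counts as a multiply plus an add, i.e. 2 FLOPS. */
enum { FLOPS_PER_FMA = 2 };

/*
 * Peak throughput in GFLOPS for `lanes` FP32 stream processors at `clock_mhz`.
 * `values_per_lane` is 1 for plain FP32 and 2 for packed FP16 (assumption:
 * packed math doubles the values handled per lane per clock, nothing else).
 */
uint64_t peak_gflops(uint32_t lanes, uint32_t clock_mhz, uint32_t values_per_lane)
{
    assert(values_per_lane == 1 || values_per_lane == 2);
    /* lanes * values * 2 FLOPS * MHz gives MFLOPS; divide by 1000 for GFLOPS. */
    return (uint64_t)lanes * values_per_lane * FLOPS_PER_FMA * clock_mhz / 1000;
}
```

With the hypothetical 4096 lanes at 1600 MHz, `peak_gflops(4096, 1600, 1)` gives the FP32 figure and `peak_gflops(4096, 1600, 2)` gives exactly twice that for packed FP16. That doubling is a throughput gain, not an IPC gain in the scheduling/cache sense.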
 

caswow

Senior member
Sep 18, 2013
525
136
116
What's the problem? A memory catalog means absolutely nothing, especially for customers like AMD and Nvidia.

Just look at Samsung and Nvidia: Samsung's memory catalog doesn't even contain HBM2 at all. How is it possible for Nvidia to have been using HBM2 since last year?

Not only that. The type of HBM AMD used on Fiji was never in their catalog either.
 

Actaeon

Diamond Member
Dec 28, 2000
8,657
20
76
I agree to a point, but I'm pretty sure all 1080 Tis can sustain 1900 MHz or more consistently with a slightly noisy fan setting. Both my Titan XPs and my 1080 Ti were/are able to run consistently at 1950-1980 MHz ('FE' all of them). I do really hope Vega kicks all their a$$es though.

Also to your point, I wouldn't call 1900 MHz 'overclocked' either. 2 GHz sure, but 1900 MHz is a pretty standard boost clock right out of the box and the card doesn't need any tweaking to get there.

Separate but related question: I know the Linux driver log states a maximum clock speed of 1600 MHz, but is that realistic for what the card would actually ship with? Seems like only the very best 580s just barely get to 1500 MHz, so I would think that with a die that is nearly 3x larger and with 3x more transistors, they'd have trouble matching an OC'd Polaris 10, let alone beating it. Nvidia had to drop clocks a bit on GP102 vs GP104, so I am thinking something similar may need to happen for Polaris 10 and Vega. Maybe GloFo has another revision of their 14nm process for Vega?

Looking forward to the announcements and specs. Seems like AMD has been out of the picture for far too long.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Also to your point, I wouldn't call 1900 MHz 'overclocked' either. 2 GHz sure, but 1900 MHz is a pretty standard boost clock right out of the box and the card doesn't need any tweaking to get there.

Separate but related question: I know the Linux driver log states a maximum clock speed of 1600 MHz, but is that realistic for what the card would actually ship with? Seems like only the very best 580s just barely get to 1500 MHz, so I would think that with a die that is nearly 3x larger and with 3x more transistors, they'd have trouble matching an OC'd Polaris 10, let alone beating it. Nvidia had to drop clocks a bit on GP102 vs GP104, so I am thinking something similar may need to happen for Polaris 10 and Vega. Maybe GloFo has another revision of their 14nm process for Vega?

Looking forward to the announcements and specs. Seems like AMD has been out of the picture for far too long.
You're assuming Vega is Polaris. It isn't.
It was architected for higher clocks, think Kepler vs Maxwell.
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,667
136
It either means higher performance per core per clock due to various improvements like scheduling and cache improvements, or it's referring to the FP16 packed math, resulting in double the IPC.
The easiest way to achieve higher performance from the same number of cores is to feed them properly, or to increase the amount of resources, for example through an increased register file size.
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
The easiest way to achieve higher performance from the same number of cores is to feed them properly, or to increase the amount of resources, for example through an increased (doubled) register file size.
And clock speed

 

Actaeon

Diamond Member
Dec 28, 2000
8,657
20
76
You're assuming Vega is Polaris. It isn't.
It was architected for higher clocks, think Kepler vs Maxwell.

Hmmm, good point. Well, I know Vega and Polaris are different, but they are designed by the same company, on the same manufacturing process, and with what appears to be a pretty similar architecture to previous GCN cards. Though to your point, GCN 5.0 is new, so maybe we could see something like Kepler to Maxwell.
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,667
136
Hmmm, good point. Well, I know Vega and Polaris are different, but they are designed by the same company, on the same manufacturing process, and with what appears to be a pretty similar architecture to previous GCN cards. Though to your point, GCN 5.0 is new, so maybe we could see something like Kepler to Maxwell.
It is not the same architecture, to the degree that it requires an abstraction layer to be backwards compatible with previous versions of GCN.

Applications have to be rewritten to make use of the features that are present in Vega.
 

Magic Hate Ball

Senior member
Feb 2, 2017
290
250
96
It is not the same architecture, to the degree that it requires an abstraction layer to be backwards compatible with previous versions of GCN.

Applications have to be rewritten to make use of the features that are present in Vega.
Not to mention they have emphasized clock gating and Infinity Fabric with Vega, which is highly reminiscent of the talk regarding Zen.
 

OatisCampbell

Senior member
Jun 26, 2013
302
83
101
While the Steam Hardware Survey is hardly representative of forum users here, the vast majority even on this forum (although probably not the most active users) are using 1080p or 1440p monitors at 60Hz. 4K60 is gaining popularity, albeit slowly (the monitor upgrade cycle is 5-6 years if not more for most users), while high refresh rates at lower resolutions are only slightly ahead of this. And I'd say video cards that cost $400 and up are a niche market, not $800 and up, even if Nvidia probably sells 10x as many 1070s as 1080s (not to mention 1080Tis). And who here is talking about AMD having a "halo product" as fast as the 1080? I certainly don't expect that. Besides, CF and SLI scaling is bad enough that for it to make sense even in games that make use of it, your GPU budget needs to be significantly above the price of a 1080Ti - otherwise, a single card will perform better in the vast majority of cases. This wasn't the case just a few years ago.

This is also entirely disregarding all the idiots with more money than sense out there who buy a $1000 dual-GPU card without actually knowing what they're buying - and then whine about it when there's no SLI/CF profile and their $1000 card performs worse than something half the price.

The gist of it: dual GPU cards only make sense as a barely-advertised .0001% niche product for people who specifically know what they're getting into. In any other case, it's a recipe for customer dissatisfaction and bad press.

People with 1080p or the lower res 1440p at 60Hz have little reason to buy 1070-1080 level performance.

Presumably most people buying a $500+ video card are doing it for reasons other than "New! Shiny!"; they want some benefit.

The 4K, Eyefinity, lower 1440 res at high fps, and "maybe" the 3440 X 1440 at 60Hz (for sure at 100+Hz) folks are the only people that "need" more than 1070 level performance from what I've seen. (and maybe more than 1080)

You can say AMD doesn't need to compete with NV at the 1080Ti and up level, but I think that is a dangerous path for them to go down. When you become known as the "middling performance, good price" vendor in the gaming world it's about time to look for other markets. Even to be the fastest at half the games for less money serves AMD best in my opinion.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,243
136
Could you link me to that?

https://cgit.freedesktop.org/~agd5f...drm/amd/amdgpu/gmc_v9_0.c?h=drm-next-4.13-wip

Code:
/**
 * gmc_v9_0_mc_init - initialize the memory controller driver params
 *
 * @adev: amdgpu_device pointer
 *
 * Look up the amount of vram, vram width, and decide how to place
 * vram and gart within the GPU's physical address space.
 * Returns 0 for success.
 */
static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
{
    u32 tmp;
    int chansize, numchan;

    /* hbm memory channel size */
    chansize = 128;

    tmp = RREG32(SOC15_REG_OFFSET(DF, 0, mmDF_CS_AON0_DramBaseAddress0));
    tmp &= DF_CS_AON0_DramBaseAddress0__IntLvNumChan_MASK;
    tmp >>= DF_CS_AON0_DramBaseAddress0__IntLvNumChan__SHIFT;
    switch (tmp) {
    case 0:
    default:
        numchan = 1;
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Thanks!

Edit: Interesting that the HBM memory channel size is 128-bit. It should be 256 if memory serves (pun intended).
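For reference, a rough C sketch of how the total interface width would add up from that `chansize`. The excerpt is cut off before it shows how `numchan` gets used, so multiplying the two is an assumption here, and the function names and the per-pin data rates are hypothetical. JEDEC HBM stacks expose eight 128-bit channels each, which is consistent with the 128 in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Per-channel width in bits, matching `chansize = 128` in the driver excerpt. */
enum { HBM_CHANNEL_BITS = 128 };

/* Total memory interface width in bits: active channels times channel width. */
uint32_t vram_width_bits(uint32_t num_channels)
{
    assert(num_channels > 0);
    return num_channels * HBM_CHANNEL_BITS;
}

/* Peak bandwidth in GB/s for a given per-pin data rate in Mbit/s (assumed value). */
uint32_t peak_bandwidth_gbs(uint32_t width_bits, uint32_t pin_rate_mbps)
{
    /* bits * Mbit/s per pin = total Mbit/s; /8 gives MB/s, /1000 gives GB/s. */
    return (uint32_t)((uint64_t)width_bits * pin_rate_mbps / 8 / 1000);
}
```

One stack (8 channels) comes out at 1024 bits and two stacks (16 channels) at 2048 bits. Plugging in an assumed 2000 Mbit/s per pin, `peak_bandwidth_gbs(2048, 2000)` lands at 512 GB/s.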
 

Snarf Snarf

Senior member
Feb 19, 2015
399
327
136
50 minutes until we find out something.... There's no way they brought Raja to this meeting so he could not mention Vega. Even if all he says is they're doing a full announcement at another event at least we'll know something
 

Krteq

Senior member
May 22, 2015
993
672
136
Interesting that the HBM memory channel size is 128-bit. It should be 256 if memory serves (pun intended).
Isn't the I/O channel width the same for HBM1/2?

Only the prefetch is different, I think.

//Never mind, prefetch is the same, so the I/O has to be wider
 

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
"Succeed Miraculously".
Makes you think... could a miracle really happen for AMD?
Note that that line says "or fail spectacularly" directly below it (partially obscured by the Ryzen logo). I take it it's a quote.

I'm more psyched for the "gain marketshare" part of that wall. Of course, I wouldn't mind them succeeding miraculously either.
 