flopper
Senior member
- Dec 16, 2005
That's called hype, based on the excuse of Polaris's limited bandwidth instead of accepting that Vega will be the real deal.
clueless reasoning
You said the PS4 Neo won't have a doubling of the bandwidth, but in reality it will. Memory compression isn't hype, it's proven tech. It works on Maxwell and Tonga.
Polaris and Pascal improve it further.
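To make the compression point concrete, here is a toy sketch of the general idea behind delta-style framebuffer compression. This is NOT the actual scheme used by Maxwell, Tonga, Polaris or Pascal (those details aren't public in this thread); the function names, the 8-pixel block and the 4-bit delta width are all assumptions picked for illustration. The idea is only that smooth blocks can be stored as one base value plus small deltas, so fewer bits cross the memory bus.

```python
# Toy delta compression sketch. All names and sizes here are hypothetical,
# chosen for illustration; real GPU schemes are more elaborate.

def compress_block(pixels, delta_bits=4):
    """Store a block as base + small signed deltas if they all fit, else raw."""
    base = pixels[0]
    deltas = [p - base for p in pixels[1:]]
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    if all(lo <= d <= hi for d in deltas):
        return ("delta", base, deltas)
    return ("raw", list(pixels))

def decompress_block(block):
    """Losslessly rebuild the original pixel values."""
    if block[0] == "delta":
        _, base, deltas = block
        return [base] + [base + d for d in deltas]
    return list(block[1])

def compressed_bits(block, pixel_bits=8, delta_bits=4):
    """Bits that would be transferred for this block (ignoring any header)."""
    if block[0] == "delta":
        return pixel_bits + delta_bits * len(block[2])
    return pixel_bits * len(block[1])

smooth = [100, 101, 102, 103, 101, 100, 99, 98]   # gradient-like data
noisy = [0, 255, 3, 200, 17, 90, 250, 1]          # incompressible data
for name, px in (("smooth", smooth), ("noisy", noisy)):
    blk = compress_block(px)
    print(name, blk[0], compressed_bits(blk), "bits vs", 8 * len(px), "raw")
```

In this toy, the smooth block moves 36 bits instead of 64 while the noisy block falls back to raw, which is why the real-world gain depends heavily on content.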
For future consoles with HBM2, the APU/SoC will just go up in price, since they won't have to pay for VRAM separately.
AMD has no need to eat any costs, it's pure margins going forward since they've locked all the major console players into x86/GCN for backwards compatibility.
ie, when you don't have a market and want to be the status quo, you can sacrifice margins for it. When you're the status quo and dominating, it's time to milk.
AMD needs mainstream HBM adoption, they are in no position to milk it especially given that GDDR5X will also be available as an option.
We're talking about the next-next gen consoles here. Maybe 2018 or 2019 time-frame.
These Polaris APU + GDDR5 consoles in late 2016 won't have to deal with high HBM costs.
What else is going to use it in the meantime? Only extremely low volume, high margin stuff like P100 and Vega. Something higher volume needs to bite faster than that, like early 2017 time.
It needs to be a console for the volume, that's why I believe AMD will need to eat margins on one of them soonish, which likely only leaves an updated XBox.
Gonna be a hard sell otherwise when GDDR5X will do the job, and likely normal GDDR5 too.
How would they sell it though? The extra bandwidth of HBM2 will be useless for an APU in the Polaris performance class.
Would MS pay heaps extra for a Vega APU on 14nm FF with HBM2? Hmm..
HDL is already used for GPUs, so there's no magic there.
A 28nm APU with those specs would be huge and power hungry. That doesn't bode well for a slim console design.
Polaris is also designed for 14nm FF. It would cost AMD more to back port it to 28nm.
Without Polaris, they aren't going to get HDMI 2.0 or the newer 4K decoders that these consoles have focused on. More importantly, without Polaris they aren't going to get the big performance improvement, period: with limited GDDR5 bandwidth they will need the enhanced memory compression. Re-designing GCN 1.1 to add these blocks would also cost AMD more $.
There's no logical scenario where these new high performance console APUs are 28nm.
Rather, let's speculate about how much faster Polaris GCN will be vs GCN 1.1 or 1.2!
How? Polaris would be baked into the APU.
Suicidal? Maybe forward thinking is the correct response? Seems like at least on the PS4 side both the older and the newer architecture would be optimized for. Kind of sounds like a double whammy to me.
Using mature, stable, high yield tech with dwindling demand to secure market position was forward thinking. Using new, competitive dies to try and emulate that does not seem like a winning strategy, unless they have perfect yields and decent margins. There's only so much profit a company can sacrifice today for future returns.
When they first worked on the PS4/Xbone APU, it was an immature node, early days of 28nm.
The problem is Sony would never be able to afford such a huge and complex die, even on 28nm. It makes way more sense to utilize a Polaris 10 on 14nm and enjoy the small die size/power consumption/DX12 opportunities.
Pascal is already rumored to be more GCN like, I hope AMD can patent some of the elements of its design to make sure nvidia can't completely copy GCN going forward.
So AMD is creating an API designed for GCN, forcing Nvidia to make an architecture like GCN, but it turns out it's patented. Sounds like a lawsuit.
ofc they will. A unified ecosystem that essentially can kill the exclusives, and having 3 consoles almost at the same level? Just imagine the savings on development.
Will be interesting to see if MS will also do an update of their XB 1.
Maybe this plays into your question? I am not sure exactly what this patent describes; maybe you or someone else can better understand this.
Rather than 28nm talk... Discuss something more interesting...
Can the Primitive Discard Accelerator improve performance as well as reduce bandwidth requirements? How?
The provided method and storage medium have several beneficial attributes that promote increased performance of single program multiple thread code on SIMD hardware. For example, higher utilization of the SIMD hardware may be achieved. Furthermore, string comparison and other Standard Template Library (STL) like services within branchy code are improved and software prefetching performance in branchy code is improved. Furthermore, the impact of memory divergence on performance is reduced because workgroups are able to coordinate accesses instead of operating in separate logical execution streams. Additionally, permitting programmers to write more convergent code may improve power efficiency.
APD 104 may include compute units, such as one or more single instruction multiple data (SIMD) processing cores or SIMD arrays 121. In the example provided, the compute units are referred to collectively as shader core 122. In the embodiments described herein a SIMD is a pipeline or programming model where a kernel is executed concurrently on multiple processing elements. The processing elements have independent data and a shared program counter to execute an identical set of instructions. The use of predication enables work items to participate or not participate for each issued command. Each APD 104 compute unit may include one or more scalar and/or vector floating-point units, arithmetic and logic units (ALUs). In some embodiments, the ALUs are arranged into four SIMD arrays 121 in the shader core 122 that each include 16 processing elements, or lanes 123. Each SIMD array 121 executes a single instruction across the lanes 123 to a block of 16 work items, as illustrated in FIG. 1B. It should be appreciated that other configurations or grouping of ALUs, SIMD arrays, and lanes per array may be utilized. Each work item is mapped to a lane during execution. An execution mask indicates which of the lanes 123 are active and are to be executed. For example, the execution mask may include one bit per lane to indicate to the hardware that the lane is active and that the instructions are valid for that set of data.
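For anyone who finds the patent language dense, the execution mask part can be sketched in a few lines. This is a minimal software model, assuming the 16-lane SIMD array and one-bit-per-lane mask from the excerpt above; the function names and the example kernel are made up for illustration and are not from the patent or any AMD tool. It shows why branchy code hurts: both sides of an if/else are issued to the whole array, and each pass only does useful work on the lanes whose mask bit is set.

```python
# Minimal model of predicated SIMD execution. LANES = 16 follows the quoted
# text; everything else (names, the kernel) is a hypothetical illustration.

LANES = 16

def simd_execute(op, data, exec_mask):
    """Issue one instruction to all lanes; only lanes with their mask bit set apply it."""
    assert len(data) == LANES
    return [op(x) if (exec_mask >> lane) & 1 else x
            for lane, x in enumerate(data)]

def utilization(exec_mask):
    """Fraction of lanes doing useful work for one issued instruction."""
    return bin(exec_mask).count("1") / LANES

def branchy_kernel(data):
    """if x is even: x //= 2, else: x = 3*x + 1, run as two masked passes."""
    mask_then = 0
    for lane, x in enumerate(data):
        if x % 2 == 0:
            mask_then |= 1 << lane          # lane takes the 'if' side
    mask_else = ((1 << LANES) - 1) & ~mask_then
    out = simd_execute(lambda x: x // 2, data, mask_then)
    out = simd_execute(lambda x: 3 * x + 1, out, mask_else)
    return out, mask_then, mask_else

out, m_then, m_else = branchy_kernel(list(range(LANES)))
print(out)
print(f"then-pass utilization: {utilization(m_then):.2f}")
print(f"else-pass utilization: {utilization(m_else):.2f}")
```

With alternating even/odd inputs each pass keeps only half the lanes busy, which is the kind of divergence cost the patent text says it is trying to reduce.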
The APD compute unit may include special purpose processing units (not shown), such as inverse-square root units and sine/cosine units. In the example provided, the shader core 122 includes a local data store (LDS) memory. The LDS is a high-speed, low-latency memory private to each compute unit. The LDS is a full gather/scatter model so that a work-group can write anywhere in an allocated space.
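The "full gather/scatter" bit just means any work item in the group can write to, or read from, any slot of the group's LDS allocation. A rough sketch of that access pattern, using a plain Python list as a stand-in for the LDS (the function name and index arrays are hypothetical, and a real kernel would need a barrier between the two phases):

```python
# Stand-in model of a work-group exchanging data through the LDS.
# The list `lds` plays the role of the allocated local data store.

def lds_exchange(values, write_index, read_index):
    """Each work item scatters its value to lds[write_index[i]],
    then (after an implied barrier) gathers from lds[read_index[i]]."""
    lds = [None] * len(values)
    for item, v in enumerate(values):        # scatter phase
        lds[write_index[item]] = v
    # a work-group barrier would go here on real hardware
    return [lds[read_index[item]] for item in range(len(values))]  # gather phase

# Reverse a block: item i writes to the mirrored slot, then reads its own slot.
print(lds_exchange([10, 20, 30, 40], [3, 2, 1, 0], [0, 1, 2, 3]))
```

The same helper covers broadcasts too, since several items may read the same slot.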
They might, but then with the rumors of NX, PS4K (even Xbone^2) floating around they may not have enough dies to spare for it as such.
Starting to think we'll see a Radeon 495x2 (Rage PRO?) around July-August.
Why?
The more densely packed 240mm2 Polaris 10 is probably close to the 294mm2 GP104 and winning outright in DX12 (AMD usually has better performance per mm2, not to mention Pascal will be burdened with extra DP/compute). While nvidia will charge at least $500 for the 1080, the 480 will be absolutely no more than $350.
Vega is looking more and more like Q1 17, so why not take the crown with a 250 watt dual Polaris card? Price it at $799 and dominate the top end.