AMD post-Bulldozer x86 CPU architecture

Madpacket · May 19, 2014

AMD looks to have real momentum at the moment. If they can keep this going, I think they'll have a good chance of surprising Intel in the near future. Don't underestimate the hungry underdog: they have to use unorthodox means to move forward, and sometimes that is what's necessary for an industry to advance.

Idontcare · May 19, 2014

Cost correlates with mask set size: it takes more masks to build higher and higher performance xtors (be it with better isolation for leakage control or better drive currents for switching speed).

Generally speaking (I know of no exception, but leave room for their existence) the lower power transistors will have a much lower switching speed, lower drive current, lower Vnom target, and smaller minimum design rule spec (higher density) as a consequence of the trade-off in having a simpler mask set that drives production costs down as well as the IC design costs.

That's not marketing, that's the foundries building various sub-nodes by design based on price/performance feedback from the customer base.

The one bit that is marketing is that they won't label a "low power" node as being "low performance" (even though it is), nor will they label the "high performance" node as "high power" (even though it is).

But that's just common sense, and tradition. Low power versus high performance sub-nodes have been around for at least 20 yrs, maybe longer.

AtenRa · May 19, 2014

Although 20nm LPM can scale as high as 28nm HPP, I still believe GloFo will have 20nm HP or SHP for high-performance devices such as CPUs and GPUs.

raghu78 · May 19, 2014

AtenRa said:
Although 20nm LPM can scale as high as 28nm HPP, I still believe GloFo will have 20nm HP or SHP for high-performance devices such as CPUs and GPUs.

There is no high-performance node at 20nm bulk planar. All the foundries (TSMC, Samsung, GF) have only a low-power process at 20nm planar. 20LPM is not suitable for high-performance devices like high-frequency CPUs, which need to run at 4 GHz. Below 28nm the only way to get to high performance is FinFET or FD-SOI; bulk planar is just not going to be enough. Here I quote AMD product CTO Joe Macri:

http://www.theregister.co.uk/Print/2014/01/14/amd_unveils_kaveri_hsa_enabled_apu/

"What we found was with the CPU with planar transistors, when we went from 28 to 22, we actually started to slow down," he said, "because the pitch of the transistor had to become much finer, and basically we couldn't get as much oomph through the transistor."

The problem, he said, was that "our IDsat was unpleasant" at 22nm, referring to gate drain saturation current. In addition, the chip's metal system needed to be scaled down to fit within the 22nm process, which increased resistance.

"So what we saw was the frequency just fall off the cliff," he said. "This is why it's so important to get to FinFET."
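
A back-of-the-envelope sketch of the resistance point in the quote above. It assumes an idealized wire whose width and thickness both shrink linearly with the node name (28 to 22) while length and resistivity stay fixed. Real metal stacks do not scale this simply, and the function name is illustrative only.

```python
def wire_resistance_ratio(old_nm: float, new_nm: float) -> float:
    """Return R_new / R_old for a wire of fixed length and resistivity.

    Uses R = rho * L / (width * thickness) and assumes width and thickness
    both scale by new_nm / old_nm (an idealization, not foundry data).
    """
    scale = new_nm / old_nm
    return 1.0 / (scale * scale)


if __name__ == "__main__":
    # Idealized 28 -> 22 shrink of the wire cross-section.
    print(round(wire_resistance_ratio(28, 22), 2))
```

Under these assumptions the same length of wire ends up with roughly 1.6x the resistance, which is the direction of the effect Macri describes.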

So if Carrizo is designed for 4 GHz speeds, it will be built on GF 28SHP, as Samsung 14nm FinFET won't be ready for HVM until Q1 2015. Apple will get priority over others for the initial production ramp.

On the other hand, if AMD designs Carrizo for lower clocks like 3 GHz, then they could get it to run on 20LPM. AMD could double the integer execution pipes, increasing IPC significantly and offsetting the performance lost to lower clocks. AMD has a 4-way decoder per Steamroller core feeding just 2 integer execution pipes and 2 address generation units. That's a horrible design. Puma has 2 integer execution pipes plus 1 load and 1 store pipe fed by a 2-way decoder, and can get 90-95% of Steamroller IPC. AMD can easily double the number of integer execution pipes and load/store units and get a >50% improvement in IPC for a minimal increase in transistor circuitry. They can then clock the APU at 3 GHz and fit into the same or lower TDP as the 95 W Kaveri desktop parts.

Since AMD's roadmap calls for only 65 W and lower TDPs for Carrizo, that's the likely outcome: a wider execution core with much higher IPC running at much lower clocks. It would still be faster than Kaveri and more power efficient. Built on 20LPM, it would be a good fit for SFF/HTPC builds. If AMD can deliver this with HBM it would be a home run, but that would be expecting too much.
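
The trade-off in the post above can be checked with simple arithmetic, assuming single-thread throughput is modeled as IPC times clock. The 4 GHz baseline and the function names are assumptions for illustration; the 50% IPC gain and 3 GHz target are the figures from the post, not measured data.

```python
def relative_performance(ipc_gain: float, old_ghz: float, new_ghz: float) -> float:
    """New/old throughput, modeling performance as IPC x clock frequency."""
    return (1.0 + ipc_gain) * (new_ghz / old_ghz)


def break_even_ipc_gain(old_ghz: float, new_ghz: float) -> float:
    """Fractional IPC gain needed just to match the old design at new_ghz."""
    return old_ghz / new_ghz - 1.0


if __name__ == "__main__":
    # Hypothetical baseline: a 4 GHz design versus a wider 3 GHz design.
    print(relative_performance(0.50, 4.0, 3.0))    # 1.125
    print(round(break_even_ipc_gain(4.0, 3.0), 3))
```

Under this model a 50% IPC gain at 3 GHz nets about 12.5% over a 4 GHz baseline, and roughly a 33% IPC gain is needed just to break even.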

Ancalagon44 · May 19, 2014

That roadmap indicates that AMD has no plans to bring Mullins and Beema to the desktop, which is a real pity.

NTMBK · May 19, 2014

Ancalagon44 said:
That roadmap indicates that AMD has no plans to bring Mullins and Beema to the desktop, which is a real pity.

They were more about perf/W than raw performance anyway.

pTmdfx · May 19, 2014

mrmt said:
Maybe the engineers here could answer this, but if they are planning switchable cores, my guess is that the cores must be somehow close in width and throughput; otherwise they will end up with wasted resources, meaning a bigger die than it should be, or with a core choked by lack of resources.

I assume a compatible methodology for building the SoC with similarly sized blocks (i.e. standardizing at least one of the dimensions of the CPU blocks) is enough, if you really meant "swappable" in the SoC floorplan. What the internals of the cores look like seems irrelevant in this matter.

DrMrLordX · May 19, 2014

Ancalagon44 said:
That roadmap indicates that AMD has no plans to bring Mullins and Beema to the desktop, which is a real pity.

Question: did socketed Kabini (5350 et al) show up on AMD roadmaps? I would be surprised if AMD went to all that trouble to open up the very-low-end desktop/NUC/HTPC market to Kabini only to not update the lineup with Beema down the road.

parvadomus · May 19, 2014

Exophase said:
He thinks that Steamroller was originally going to have twice the IPC and half the clock speed, then at the last minute AMD disabled half the stuff in the module. This is all based on that leaked die shot that he somehow thinks is Steamroller for some reason.

In other words, he makes up crazy theories based on zero evidence and talks about them like they're fact. You should ignore him.

It looks like Excavator is, after all, an unlocked Steamroller. It will get unlocked for the mobile versions of Kaveri (or for some models).
http://wccftech.com/amd-mobile-kaveri-apu-confirmed-steamroller-excavator-hybrid-architectures/

Hitman928 · May 19, 2014

parvadomus said:
It looks like Excavator is, after all, an unlocked Steamroller. It will get unlocked for the mobile versions of Kaveri (or for some models).
http://wccftech.com/amd-mobile-kaveri-apu-confirmed-steamroller-excavator-hybrid-architectures/

Two things:

1) wccftech is not a source, they're a rumor mill.

2) Even setting 1) aside, nothing in that link supports the claim that Steamroller is half disabled or that Excavator is an unlocked Steamroller. They have 1 slide that has been publicly available for a while that states mobile Kaveri is a "revolutionary architecture", and then they spin off the rails from there.

NTMBK · May 19, 2014

Jesus, WCCF is really not a source. Where do you think they got their "hybrid Excavator" story from? Seronx's forum posts...

Phynaz · May 19, 2014

NTMBK said:
Jesus, WCCF is really not a source. Where do you think they got their "hybrid Excavator" story from? Seronx's forum posts...

I guess that makes Seronx a source

NTMBK · May 19, 2014

Phynaz said:
I guess that makes Seronx a source

D:

Exophase · May 19, 2014

raghu78 said:
AMD can easily double the number of integer execution pipes and load/store units and get a >50% improvement in IPC for a minimal increase in transistor circuitry.

The cores can already sustain 2 loads/cycle, and I think 1 load + 1 store, possibly 2 stores as of Steamroller; can't remember on this. The first two would put it on the same level as Ivy Bridge, which is otherwise substantially wider. I don't think they can easily increase load/store bandwidth further, and I don't think they'd gain very much by doing so.

I agree that the 2 EX pipe design isn't well balanced, not just because it has only two ALUs but because it has to use those pipes for branches, mul/div, stores, and so on: pretty much everything but address generation and loads, and I think the AGUs can handle moves and maybe some simple one-input ALU operations like inc/dec. They should have a dedicated port for branches and stores. But there's no way they're going to get a 50+% IPC improvement just by increasing execution ports. Branch throughput, branch prediction rate and misprediction cost, L1 latency, and L1 miss rate + L2 latency are going to continue to be big limiters. IMO, they need a bigger L1 dcache, a lower-latency L2 cache, and a lower branch misprediction penalty (especially if they end up not even clocking over 4GHz, they should revise the pipeline).
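
The argument above can be sketched as a simple cycles-per-instruction (CPI) stack: widening the execution ports shrinks only the execution-bound share of cycles, while branch and cache stalls stay put. The stack values below are made-up placeholders chosen to sum to 1.0, not measurements of Steamroller or any other core, and the function name is hypothetical.

```python
def ipc_gain_from_wider_execution(cpi_stack: dict, speedup: float) -> float:
    """Fractional IPC gain when only the 'execute' CPI component shrinks.

    cpi_stack maps a stall category to its cycles-per-instruction share.
    speedup is the factor by which the execute component is divided.
    """
    new_stack = dict(cpi_stack)
    new_stack["execute"] = cpi_stack["execute"] / speedup
    old_cpi = sum(cpi_stack.values())
    new_cpi = sum(new_stack.values())
    return old_cpi / new_cpi - 1.0


# Hypothetical CPI breakdown, for illustration only.
HYPOTHETICAL_STACK = {
    "execute": 0.40,
    "branch_mispredict": 0.25,
    "cache_miss": 0.35,
}

if __name__ == "__main__":
    # Perfectly doubling execution throughput in this made-up stack.
    print(round(ipc_gain_from_wider_execution(HYPOTHETICAL_STACK, 2.0), 2))
```

With these placeholder numbers, doubling execution throughput yields about a 25% IPC gain. Getting past 50% would require the execution share to dominate, or the branch and cache components to shrink as well.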

Genx87 · May 19, 2014

jpiniero said:
Except they have not much going on in mobile... and that's where the money is going. Maybe the K12 will change that, but it's still a stretch. I would not expect much out of this x86 processor.

And how many times in the last decade have we heard that this will finally make AMD relevant? And with each iteration of x86 chip from AMD they become less and less relevant.

AtenRa · May 19, 2014

From APU14 China

http://news.mydrivers.com/1/305/305092.htm

Automatic translation

May 15, AMD held a APU14 Technology Innovation Conference theme is "enlightening the future is today" in Beijing. This is the second AMD Desktop business headquarters to Beijing, China, AMD Greater China attaches importance to China and even another manifestation.

Conference, AMD executives over center stage, a comprehensive introduction and demonstrated its latest progress almost every product line and look forward to the future development strategy, technology and product planning.

And after the meeting, the general manager of client and Bernd Lienhard, vice president, AMD Products Group Global, global vice president and general manager of AMD desktop products division Liu Shiwei (Steven Liu), AMD vice president and global general manager Matt Skynner also GPU continue to accept a joint interview with several foreign media, answering a lot of hot issues of concern to everyone.

Interview a lot, here we do not see them here, just pick some of these priorities and to share.

1, AMD is expected to target this year, the mainland market in the world and what?

Lienhard / Liu: No matter now or in the future, the Chinese market is very important for AMD. We came to China, is to become No.1. The whole global market, we must first become the first in the Chinese market, and will be developed into a global business in China headquarters.

2, AMD's forty-six how to treat the Chinese market?

Liu: Chinese market is indeed one of the largest markets in the world, means a saturated market in second-tier cities OEM brand on volume purchasing and procurement will increase, DIY market will dive to four hundred fifty-six market. AMD Desktop Product headquarters moved to China, is to better understand the Chinese market. We will against China four hundred fifty-six cities, publish customized products.

Although the Government of home appliances activity has ended, but do not want brand computer manufacturers to promote lose power, AMD is the same. We will continue to work with vendors to do something similar. AMD now has nearly two thousand channels, we will have to do a lot of strategic business units based on feedback, targeted promotion.

3, Kaveri APU mainstream and low-end products where?

Lienhard: AMD currently no plans to introduce a dual-core or low version Kaveri, or in high-end oriented. Kaveri goal is not to replace Richland, but coexist and develop high-end. The low-end is still to Richland.

4, Kabini desktop version why not Chinese retail market?

Lienhard: The product is to be launched in the OEM requirements, and not for the DIY market. OEM and DIY is not the same, consumers still want to see something higher frequency, Kabini relatively low frequencies.

5, after the Windows XP delisting, AMD see what kind of role in stimulating PC market?

Liu: From AMD's perspective, we think it is still for the entire market has played a positive role.

6, AMD dual architecture strategy x86/ARM future whether the impact of existing low-end products?

Liu: x86, ARM facing market is different. AMD even with ARM, it will not impact the current APU, the coverage will be more of the market, not replace each other.

7, AMD x86/ARM two pin-compatible architecture development process, the most difficult thing to come across what is?

Lienhard: We are the experts in terms of CPU, nothing particularly difficult.

8, AMD FX series processors is relatively less active, what planning for the future?

Lienhard: Last year we introduced a Piledriver (hammers) architecture, and achieved good market performance, there are about 30% growth, Kaveri also used Steamroller updates (excavator) architecture. FX Series of high-performance market positioning in the future will, within two years you will definitely see an update. - The so-called new x86 core really is a high performance to fight another day.

Two years, FinFETs??

NostaSeronx · May 19, 2014

AtenRa said:
Two years, FinFETs??

FDSOI... AMD is saying FinFETs but they are hiring a lot of mask layout engineers with early exposure to IBM/CEA-Leti's 14nm FDSOI.

pw257008 · May 19, 2014

"we are THE experts"

i love google/etc translate

Idontcare · May 20, 2014

AtenRa said:
Although 20nm LPM can scale as high as 28nm HPP, I still believe GloFo will have 20nm HP or SHP for high-performance devices such as CPUs and GPUs.

They have to, unless they want to cede that business to TSMC.

That next year's low-power node shrink rivals last year's high-performance node is not the exception, it is the rule. At TI that was one of the metrics by which we set our node-on-node IDsat parametric targets.

If you couldn't build an N+1 low-power node that had better parametrics than those of the high-performance N node then you were not trying.

FinFET is definitely the future of CMOS beyond 14nm, with its natural extension to omega-gates and eventually "all-around" gates.

Note I am merely referring to xtor geometry here, materials choice (be it III-V, graphene, MoS2, etc) is a matter of evolving economics and manufacturability.

Substrate choice is going to become less and less of a value-add decision maker, which has Soitec and a fair number of vested interests quite concerned at the moment.

And rightfully so. Horse-buggy manufacturers did not go quietly into the night on the eve of the horseless-carriage revolution, they fought tooth-and-nail to convince lawmakers and regulators to keep those frightful machines off the road.

I don't expect traditional substrate manufacturers to go without a fight either, but the writing (or rather the physics) is plain as day for all to see. Substrates will soon no longer factor into the formation of the active channel, and once that value-add vector evaporates the foundries and IDMs will be free to pursue a healthy $200-$300 reduction in substrate costs at 450mm.

mrmt · May 20, 2014

Idontcare said:
They have to, unless they want to cede that business to TSMC.

Having an HP node or not is largely irrelevant to GlobalFoundries' fortunes, because in this market they have only AMD's share, and AMD is phasing out its high-performance line. That business is already TSMC's.

Ajay · May 20, 2014

mrmt said:
Having an HP node or not is largely irrelevant to GlobalFoundries' fortunes, because in this market they have only AMD's share, and AMD is phasing out its high-performance line. That business is already TSMC's.

I thought AMD was considering moving some GFX over to GF; is that off the table?

Ajay · May 20, 2014

Idontcare said:
I don't expect traditional substrate manufacturers to go without a fight either, but the writing (or rather the physics) is plain as day for all to see. Substrates will soon no longer factor into the formation of the active channel, and once that value-add vector evaporates the foundries and IDMs will be free to pursue a healthy $200-$300 reduction in substrate costs at 450mm.

Fantastic observation! That's why we pay you the big bucks

mrmt · May 20, 2014

Ajay said:
I thought AMD was considering moving some GFX over to GF; is that off the table?

That's the catch. In the past AMD had to tap-dance around the legal clauses and pay WSA charges, but even under that kind of pressure they didn't offer GF their entire GPU business to manufacture. They should be moving the lagging-edge and cheap stuff to GlobalFoundries, while keeping the bleeding-edge/high-performance stuff at TSMC.

Fjodor2001 · May 20, 2014

AtenRa said:
8, AMD FX series processors is relatively less active, what planning for the future?

Lienhard: [...] FX Series of high-performance market positioning in the future will, within two years you will definitely see an update. - The so-called new x86 core really is a high performance to fight another day.

Idontcare said:
SlowSpyder said:

I wonder how high they'll aim with any new cores.

Not very high, for the simple fact that they are stuck using foundries that aren't developing high-clockspeed process nodes.

Seems like those two quotes are contradictory? Or am I missing something here?

How can AMD be delivering a new high performance FX CPU based on the next non-Bulldozer uArch generation within two years if they will not have access to a high clockspeed process node?

mrmt · May 20, 2014

Fjodor2001 said:
How can AMD be delivering a new high performance FX CPU based on the next non-Bulldozer uArch generation within two years if they will not have access to a high clockspeed process node?

Is ARM A15 high performance? What about Jaguar, is it high performance or low performance? What about Atom? What about desktop Core?

In all of these cases you could say that these chips are high performance or low performance, depending on what you use as a baseline for comparison. Without AMD stating what *they* consider to be the baseline for "high performance", we have just marketing at work here.
