Future to Bulldozer architecture?


amd6502

Senior member
Apr 21, 2017
971
360
136
Banded Kestrel and River Hawk have been Yonah'd by Stoney Ridge Refresh and Successor.
A 22nm Seronx style big.little Stoney+Carrizo-L would really be nice.

Stoney is 125mm2 and cat quads around 86mm2 (a little under 3/8" squared) of which less than 40mm2 is cpu related. If it shrinks down to like 110mm2 it'd likely be close (within 35%) to the minimum size of a die for AM4 (minimum width and pad ring requirements for the IO might limit how small a die can be, as it seems to have done for Polaris 12 https://www.reddit.com/r/Amd/comments/6aaiqw/what_is_purpose_of_the_polaris_12_eg_rx550/dhd63j9/).

AM1 has 721 contacts, FM2+ has 906, AM4 has 1331.

So 22FDX has a probable cost advantage over 14LPP for lower end products that don't need so many resistors.
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
A 22nm Seronx style big.little Stoney+Carrizo-L would really be nice.

Stoney is 125mm2 and cat quads around 86mm2 (a little under 3/8" squared) of which less than 40mm2 is cpu related. If it shrinks down to like 110mm2 it'd likely be close to the minimum size of a die for AM4 (pad ring requirements for the IO might limit how small a die can be, as it seems to have done for Polaris 12 https://www.reddit.com/r/Amd/comments/6aaiqw/what_is_purpose_of_the_polaris_12_eg_rx550/dhd63j9/).

So 22FDX has a definite cost advantage over 14LPP for lower end products that don't need so many resistors.

Stoney Ridge also has 50% more GPU shaders than the Jaguar quads, don't forget. Given the amount of die area on Mullins/Kabini which is used for GPU, I suspect that makes up most of the size difference.



To be honest, I don't see the benefit of having 2 mediocre XV cores and 4 terrible cat cores, instead of 2 multithreaded Zen cores. You would get equivalent multithread performance, and significantly better single threaded performance.

You also need excellent OS-level support for migrating threads between the "big" and "little" cores in order to make it work. If it is too eager to put tasks on the "big" cores then efficiency suffers, but if it is too eager to put them on the "little" cores then performance suffers. And you can't keep hopping back and forth between the high and low power clusters, because there is a significant overhead to doing so. It's a very tricky load balancing problem. Android still struggles with it after years of mainstream big.LITTLE phone SoCs- I doubt Microsoft would come up with a better solution, especially for a single low-margin AMD product.
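A minimal sketch of the balancing problem described above, assuming a toy scheduler with two utilization thresholds and a minimum residency time after each move. The thresholds, the `Task` class and the function names are all hypothetical and chosen for illustration; this is not how Android or Windows actually schedule.

```python
from dataclasses import dataclass

# Hypothetical tuning knobs, chosen only for illustration.
UP_THRESHOLD = 0.80    # move to the big cluster above this utilization
DOWN_THRESHOLD = 0.30  # move back to the little cluster below this
MIN_RESIDENCY = 3      # ticks a task must stay put after a migration


@dataclass
class Task:
    cluster: str = "little"
    ticks_since_move: int = MIN_RESIDENCY
    migrations: int = 0


def place(task: Task, utilization: float) -> str:
    """Pick a cluster for the next tick, using hysteresis to avoid ping-ponging."""
    task.ticks_since_move += 1
    if task.ticks_since_move <= MIN_RESIDENCY:
        return task.cluster  # too soon after the last move; migrating again is costly
    target = task.cluster
    if task.cluster == "little" and utilization > UP_THRESHOLD:
        target = "big"
    elif task.cluster == "big" and utilization < DOWN_THRESHOLD:
        target = "little"
    if target != task.cluster:
        task.cluster = target
        task.ticks_since_move = 0
        task.migrations += 1
    return task.cluster


def simulate(loads):
    """Run one task through a sequence of per-tick utilizations."""
    task = Task()
    placements = [place(task, u) for u in loads]
    return placements, task.migrations
```

Lowering `UP_THRESHOLD` makes the sketch too eager to use the big cluster, raising it makes it too eager to stay on the little one, and shrinking `MIN_RESIDENCY` brings back the hopping overhead. That is the tuning problem in miniature.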

I think Banded Kestrel is the right approach, and hopefully it will actually come to market.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
The 2x 15h + 4x 16h SoC would require the lowest budget for the low end. If it was on 22FDSOI it would be largely identical in design to 20LPM/28A/28HPA, while also giving maximum performance at the lowest power. (Same or higher clock rates with longer battery life than FinFET.)

imho, it would be best to take 15h and renew it top-down with a 16h/17h + more advanced CMT design, while also targeting ultra low power and ultra high frequencies (example: 0.6 Vdd @ 3 GHz to 1.0 V @ 5 GHz, etc.). Going in linear order, and if there are no other architectures in the bushes, it would thus be titled the 18h Family architecture.

Back to the 2x 15h + 4x 16h, it would largely be built upon the fused ambidextrous design on 20LPM;
...AMD's first Ambidextrous APU product family.
...protocol and transport layer of an AMD-ambidextrous SOC interconnect fabric.
...implemented an ambidextrous X86/ARM architecture
... Ambidextrous System Architecture design focused on onchip interconnect
also known as the Heterogeneous System Architecture 2.0. x86 <-> ARM Global Task Scheduling being harder than x86 <-> x86 GTS.

Also, researching ULP points to Ryzen having a feature that it would never use...
32KB, (stock)Vmin=0.5V, (stock)Fmax=4.5 GHz, L3 Macros. // (7nm maybe)
(A whole lot of reuse down-porting and up-porting, yiss pls)
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
The 2x 15h + 4x 16h SoC would require the lowest budget for the low end. If it was on 22FDSOI it would be largely identical in design to 20LPM/28A/28HPA.

Except AMD would have to re-engineer the entire uncore to handle both core types on one chip. And then do a massive load of software engineering in co-operation with Microsoft in order to get the scheduling working properly.

While, also giving maximum performance at the lowest power. (Same or higher clock rates with longer battery life than FinFET.)

Look, Seronx- if 22FDX was really that amazing, then it would have lots of customers using it for their mobile SoCs. I'm sure Mediatek, HiSilicon etc would love a competitive advantage over Qualcomm, if they could really make their chips both cheaper and more power efficient.

Back to the 2x 15h + 4x 16h, it would largely be built upon the fused ambidextrous design on 20LPM;

"Ambidextrous" was about using the same fabric with different CPUs, but only with 1 CPU type at a time. It was never about mixed CPU types.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
Except AMD would have to re-engineer the entire uncore to handle both core types on one chip.
Not exactly, the Northbridge is the only thing that is required to change. With Carrizo-L/Bristol Ridge-L, both designs would share the same iFCH w/ Stoney Ridge, etc.
And then do a massive load of software engineering in co-operation with Microsoft in order to get the scheduling working properly.
Already happened with Intel/Xscale/Marvell/Larrabee. Look up Helios. It was a lot more complex than what would be required for a homogeneous GTS. I know there was a successor for it... not going to search it up.
Look, Seronx- if 22FDX was really that amazing, then it would have lots of customers using it for their mobile SoCs. I'm sure Mediatek, HiSilicon etc would love a competitive advantage over Qualcomm, if they could really make their chips both cheaper and more power efficient.
Bad news, Qualcomm is using 22FDX as well. AMD and Qualcomm are TIER ONE, the highest support provided by GloFo. GlobalFoundries produced GF28A and GF20A for AMD, and produced GF28Q and GF20Q for Qualcomm. With the cancellation of 20LPM so went GF20A/GF20Q, but came with GF22FDX.
"Ambidextrous" was about using the same fabric with different CPUs, but only with 1 CPU type at a time. It was never about mixed CPU types.
Like how the Infinity Fabric doesn't use Coherent Hypertransport? (you know what I'm talking about... ;P Ambidextrous Ring-Torus using Hypertransport 4.0 @ Semiaccurate)
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
Not exactly, the Northbridge is the only thing that is required to change. With Carrizo-L/Bristol Ridge-L, both designs would share the same iFCH w/ Stoney Ridge, etc.

True, the Stoney Ridge southbridge would probably be reusable. But redesigning the Northbridge is still a lot of work. Especially compared to, say, reusing their existing Zen 14nm design and just shipping a 2C part.

Already happened with Intel/Xscale/Marvell/Larrabee. Look up Helios. It was a lot more complex than what would be required for a homogeneous GTS.

That was a specialised research-only operating system, which worked well enough to provide results for a research paper and was never seen again. Very different from actually trying to hack it into mainstream Windows, with all of its long heritage and quirks.

Bad news, Qualcomm is using 22FDX as well. AMD and Qualcomm are TIER ONE, the highest support provided by GloFo. GlobalFoundries produced GF28A and GF20A for AMD, and produced GF28Q and GF20Q for Qualcomm. With the cancellation of 20LPM so went GF20A/GF20Q, but came with GF22FDX.

I saw that press release, but it made it sound like 22FDX was more useful for IoT chips, due to capabilities like embedded MRAM. I haven't seen any indication that it will be used for mainstream SoCs, let alone high performance PC parts.

Like how the Infinity Fabric doesn't use Coherent Hypertransport? (you know what I'm talking about... ;P Ambidextrous Ring-Torus using Hypertransport 4.0 @ Semiaccurate)

Yeah, you got me on that one
 

scannall

Golden Member
Jan 1, 2012
1,948
1,640
136
Honestly, the construction cores don't really have a future. The laptop APUs were a decent and compelling product. But to spend money on construction cores at this point would be a waste of monetary and engineering resources, both of which are in short supply. Future APUs will use Zen and Vega. And that's where their focus needs to be. On the future.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
True, the Stoney Ridge southbridge would probably be reusable. But redesigning the Northbridge is still a lot of work. Especially compared to, say, reusing their existing Zen 14nm design and just shipping a 2C part.
Releasing Banded Kestrel would be a complete redesign of everything. On FinFETs, on a new socket, new platform, etc.
That was a specialised research-only operating system, which worked well enough to provide results for a research paper and was never seen again. Very different from actually trying to hack it into mainstream Windows, with all of its long heritage and quirks.
www.cs.utexas.edu/users/mckinley/papers/heterogeneousProcessors-disc-2014.pdf [Low-power first, then to High-perf][[We achieved substantial performance improvements in simulation and on real systems by configuring Simultaneous Multi-Threading (SMT) hardware as a dynamic heterogeneous multicore]]
pages.cs.wisc.edu/~swift/papers/pact16-rinnegan.pdf [Swift has a significant financial interest in Microsoft. <== second dude]
I saw that press release, but it made it sound like 22FDX was more useful for IoT chips, due to capabilities like embedded MRAM. I haven't seen any indication that it will be used for mainstream SoCs, let alone high performance PC parts.
22FDX markets are Mid-to-Low End Mobility AP (Laptops, Tablets, Phones), RF-related SoCs, ADAS/Vision/ISP and Radar/mmWave, and finally High-to-Low End IoT processors. Those are just recommendations though, since that list ignores NUC-like Mini-PCs, AIO (Desktop Monitor with integrated SoC), etc.
And look at that:
the midrange AMD Embedded G-Series J Family SOCs
Stoney Ridge is considered a mid-range application processor for: thin client, digital signage, digital gaming, retail POS, industrial/automation, military/aerospace, smart camera, set-top box and networking/communications applications.
 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
Releasing Banded Kestrel would be a complete redesign of everything. On FinFETs, on a new socket, new platform, etc.

Banded Kestrel would be the CPU, GPU, memory controller and fabric already designed for Raven Ridge, on the same process as Raven Ridge. Just with fewer cores, fewer shaders, and only one memory controller.
 

DrMrLordX

Lifer
Apr 27, 2000
21,813
11,168
136
Honestly, the construction cores don't really have a future. The laptop APUs were a decent and compelling product. But to spend money on construction cores at this point would be a waste of monetary and engineering resources, both of which are in short supply. Future APUs will use Zen and Vega. And that's where their focus needs to be. On the future.

Despite what some others say in here, what you are saying is basically correct.

R.I.P. Construction cores.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
Oh snap. I think I stumbled upon a Bristol & Stoney refresh...
Stoney Ridge 2017 models:
A9-9430 = 3.2 -> 3.5 GHz replacing: 2.9 -> 3.5 GHz
A9-9420 = 3.0 -> 3.6 GHz replacing: 2.4 -> 3.2 GHz
A6-9220 = 2.5 -> 2.9 GHz replacing: 2.4 -> 2.8 GHz

Bristol Ridge 2017 models:
FX-9840P* = 3.3 -> 4.0 GHz replacing: 3.0 -> 3.7 GHz
FX-9820P* = 3.0 -> 3.7 GHz replacing: 2.7 -> 3.6 GHz
A12-9720P => 2.7 -> 3.6 GHz replacing: 2.5 -> 3.4 GHz
A10-9620P => 2.5 -> 3.4 GHz replacing: 2.4 -> 3.3 GHz
*Not conclusive, aka false positives?
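To put the clock bumps listed above in relative terms, here is a small sketch that computes the percentage uplift per model. The clocks are copied from the list; the function names and the one-decimal rounding are just choices made for this example.

```python
# (model, new base, new turbo, old base, old turbo), all in GHz, copied from the list above.
# The two FX entries are the ones flagged as not conclusive.
MODELS = [
    ("A9-9430", 3.2, 3.5, 2.9, 3.5),
    ("A9-9420", 3.0, 3.6, 2.4, 3.2),
    ("A6-9220", 2.5, 2.9, 2.4, 2.8),
    ("FX-9840P", 3.3, 4.0, 3.0, 3.7),
    ("FX-9820P", 3.0, 3.7, 2.7, 3.6),
    ("A12-9720P", 2.7, 3.6, 2.5, 3.4),
    ("A10-9620P", 2.5, 3.4, 2.4, 3.3),
]


def uplift_pct(new_ghz, old_ghz):
    """Relative clock increase in percent, rounded to one decimal place."""
    return round((new_ghz - old_ghz) / old_ghz * 100, 1)


def summarize(models=MODELS):
    """Map each model name to its (base uplift %, turbo uplift %)."""
    return {
        name: (uplift_pct(new_base, old_base), uplift_pct(new_turbo, old_turbo))
        for name, new_base, new_turbo, old_base, old_turbo in models
    }


if __name__ == "__main__":
    for name, (base, turbo) in summarize().items():
        print(f"{name}: base +{base}%, turbo +{turbo}%")
```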

While doing that above, I stumbled upon some evil info for naysayers. From what I just digested, the Excavator cores will essentially always be here. They will be used to prop up Ryzen prices throughout the lifespan of Ryzen.

So, 2018 models will be ported to 22FDX, every single Bulldozer-related product is getting the port. Wait...EVERY SINGLE ONE?! Yep.
CPU SoCs(8/6/4 cores) => $179 and lower
APU SoCs(4/2 cores) => $149 and lower

They'll all be strategically placed in supporting roles with Ryzen, Ryzen Mobile, Ryzen Lite. So, AMD can keep and protect those high margins. So, awkwardly, the answer to the title is that they are Atom cores. Also, they are used to ensure wafer usage at Dresden & Chengdu. (These two fabs will never touch FinFETs)

XV-FD CPUs => 4.4 GHz - 5.3 GHz (Lowest Stock -> Highest Turbo) // Nostacalc -> January 2018
XV-FD APUs => <4.4 GHz Turbo // Nostacalc -> March 2018

Centurion - Bristol - Stoney => Onwards. (Also, development is locked. So, basically Cheetah and Tiger like shrinks. New power optimizations, no performance increases other than Frequency.)
 
Last edited:

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
In answer to this thread's main topic, AMD's roadmaps seem pretty clear. Bristol Ridge and Stoney Ridge will remain only until AMD fleshes out its laptop and OEM desktop lineups, and by mid 2018, the Banded Kestrel die on 14nm should be more than cheap enough to take the mid to low end. AMD's probably more than happy to leave the Atom market and its non-existent margins to Intel.
 
Reactions: amd6502

SarahKerrigan

Senior member
Oct 12, 2014
611
1,491
136
Oh snap. I think I stumbled upon a Bristol & Stoney refresh...
Stoney Ridge 2017 models:
A9-9430 = 3.2 -> 3.5 GHz replacing: 2.9 -> 3.5 GHz
A9-9420 = 3.0 -> 3.6 GHz replacing: 2.4 -> 3.2 GHz
A6-9220 = 2.5 -> 2.9 GHz replacing: 2.4 -> 2.8 GHz

Bristol Ridge 2017 models:
FX-9840P* = 3.3 -> 4.0 GHz replacing: 3.0 -> 3.7 GHz
FX-9820P* = 3.0 -> 3.7 GHz replacing: 2.7 -> 3.6 GHz
A12-9720P => 2.7 -> 3.6 GHz replacing: 2.5 -> 3.4 GHz
A10-9620P => 2.5 -> 3.4 GHz replacing: 2.4 -> 3.3 GHz
*Not conclusive, aka false positives?

While doing that above, I stumbled upon some evil info for naysayers. From what I just digested, the Excavator cores will essentially always be here. They will be used to prop up Ryzen prices throughout the lifespan of Ryzen.

So, 2018 models will be ported to 22FDX, every single Bulldozer-related product is getting the port. Wait...EVERY SINGLE ONE?! Yep.
CPU SoCs(8/6/4 cores) => $179 and lower
APU SoCs(4/2 cores) => $149 and lower

They'll all be strategically placed in supporting roles with Ryzen, Ryzen Mobile, Ryzen Lite. So, AMD can keep and protect those high margins. So, awkwardly, the answer to the title is that they are Atom cores. Also, they are used to ensure wafer usage at Dresden & Chengdu. (These two fabs will never touch FinFETs)

XV-FD CPUs => 4.4 GHz - 5.3 GHz (Lowest Stock -> Highest Turbo) // Nostacalc -> January 2018
XV-FD APUs => <4.4 GHz Turbo // Nostacalc -> March 2018

Centurion - Bristol - Stoney => Onwards. (Also, development is locked. So, basically Cheetah and Tiger like shrinks. New power optimizations, no performance increases other than Frequency.)

Care to put a friendly wager on 22FDX XV or PD actually shipping?
 

amd6502

Senior member
Apr 21, 2017
971
360
136
Care to put a friendly wager on 22FDX XV or PD actually shipping?
I think the Nostacalc is not functioning correctly. 5+ ghz seems way off for a process meant for ultra low power mobile devices.

If it were cheap and simple port then I think a Bristol Ridge variant with reduced iGPU (to 3CU like Stoney) would be worth it.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
5+ ghz seems way off for a process meant for ultra low power mobile devices.
FDSOI is a "wide application" process. It isn't built for any one market; instead it does all of them at "excellent" capabilities.

I recommend looking at 28nm FDSOI 7T (Low-Cost or Ultra High Density)
http://i.imgur.com/QSuO5dy.png


5+ GHz would be fully capable... since taking the A9-9420(~10W@3 GHz) * 1.7x = 5.1 GHz, but realistic first port calc puts it around ~1.5x = 4.5 GHz. 10W x 4 => 40W + <15W(UNB-iFCH+8 MB L3) => 8 cores @ 4.5 GHz ~ 5.1 GHz at <65W. Add XFR to XV++ then well 90W/128W range for upper 5 GHz.
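The arithmetic above can be reproduced in a few lines. This only restates the post's numbers: it assumes, as the post does, that a dual-core module still draws about 10 W at the scaled clock and that the uncore stays under 15 W. Neither is a measured figure, and the constant and function names are made up for the example.

```python
# Inputs taken from the post; treat them as the poster's assumptions, not measurements.
BASE_CLOCK_GHZ = 3.0   # A9-9420 base clock
MODULE_POWER_W = 10.0  # claimed power of one dual-core module
MODULES = 4            # 4 modules -> 8 cores
UNCORE_POWER_W = 15.0  # upper bound given for UNB + iFCH + 8 MB L3
TDP_TARGET_W = 65.0


def scaled_clock(factor):
    """Clock after applying the claimed process speed-up factor."""
    return round(BASE_CLOCK_GHZ * factor, 2)


def package_power():
    """Upper-bound package power if each module stays at MODULE_POWER_W."""
    return MODULES * MODULE_POWER_W + UNCORE_POWER_W


if __name__ == "__main__":
    print("optimistic clock:", scaled_clock(1.7), "GHz")
    print("first-port clock:", scaled_clock(1.5), "GHz")
    print("package power bound:", package_power(), "W")
    print("fits the TDP target:", package_power() < TDP_TARGET_W)
```

The whole estimate hinges on the premise that power per module does not rise with the clock, which is the claim being disputed in the thread.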
If it were cheap and simple port then I think a Bristol Ridge variant with reduced iGPU (to 3CU like Stoney) would be worth it.
That is ignoring the die shrink to Stoney Ridge, which would make it competitive to Apollo Lake/Gemini Lake. ~120 mm² to ~70mm² and the massive clock boosts to everything as well. Well, gg.
 
Last edited:
Reactions: amd6502

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
Just an update... Take this with a bit of salt.

October 2016 is the date when the thing was conceived. With 22FDX soon and 12FDX in the works. So, where Zen is mostly a high performance core. The thing will replace the low power cores Bobcat/Jaguar/Puma(+)/Excavator(+).

So, if you read AMD's slides Ryzen is Premium and anything else is Mainstream. Mainstream is thus going to consolidate the architectures of the previous gen cores into one next-gen core.

Highly optimized ULP core. Next-gen Bulldozer skeleton with Next-gen Jaguar design. It is optimized to operate at ultra low power to essential high performance.

Very much speculation:
SMT4 NN Branch Predictor -> 2x 4-wide Fetch-Decode (2 cores per FET/DEC) -> 4x Dispatch(Trace/Op/L0i cache)
4x Retire/Rename(2x FP Retire/Rename) -> 4x Cores(2 ALU/2 AGU for every core) + 2x FPU(2x FMAC+1 MMX per dual-core)
2x L0d(per core) -> 1x L1d(shared between two cores) =(Module)> 2x L1d(per dual-core cluster) -> 1x L2(shared per module) =(CPU Complex)> 2x L2(per module) -> 1x L3(shared between CPU Complex).

Power and density is priority, with speed being free. (Initially, Stoney Ridge clock rate + two more cores)

My guesses on the speculation of caches are:
L0d = 4 KB, 2-cycle. (5KB if 1 KB is Stack)
L0i = 1024 ops
L1d = 32 KB, 6-cycle.
L1i = 128 KB
L2 = 1 MB, 18-cycle.
L3 = 4 MB, sub-54 cycles.
(Caches can either be write-through or write-back, with an extra cycle for selection, depending on access type, etc. There is also a special hybrid mode: 2 MB L2/2 MB L3, where L3 gives 1 MB to L2, which makes it go from mostly-inclusive to mostly-exclusive.)
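To see what latency guesses like these would mean for an average load, here is a rough sketch. The cycle counts come from the list above (54 is used as the upper bound for "sub-54"); the hit rates and the DRAM latency are placeholders invented for the example, and each latency is treated as the total load-to-use time at that level.

```python
def average_load_latency(levels, memory_cycles):
    """Expected load latency in cycles.

    levels: list of (latency_cycles, hit_rate), ordered from closest to farthest.
    memory_cycles: latency charged to loads that miss every cache level.
    """
    reach = 1.0  # fraction of loads that get as far as the current level
    total = 0.0
    for latency, hit_rate in levels:
        total += reach * hit_rate * latency
        reach *= 1.0 - hit_rate
    return total + reach * memory_cycles


# Latencies from the guesses above; hit rates are hypothetical placeholders.
GUESSED_HIERARCHY = [
    (2, 0.60),   # L0d
    (6, 0.80),   # L1d
    (18, 0.80),  # L2
    (54, 0.80),  # L3
]
HYPOTHETICAL_DRAM_CYCLES = 200

if __name__ == "__main__":
    cycles = average_load_latency(GUESSED_HIERARCHY, HYPOTHETICAL_DRAM_CYCLES)
    print(f"average load latency: {cycles:.2f} cycles")
```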

Also, most defining rule is that it is MINIMAL design change over Excavator-Zen. In large part is that the mainstream team @ AMD is extremely low budget currently.

FX-series, A-series, E-series, Opteron-series, etc will be using this core for next-gen SKUs. With low-end(4C/4T) first, and high-end(64C/64T) last. Everything is budget. (Zen-lite became CMT?!)
 
Last edited:
Reactions: amd6502

amd6502

Senior member
Apr 21, 2017
971
360
136
Highly optimized ULP core. Next-gen Bulldozer skeleton with Next-gen Jaguar design. It is optimized to operate at ultra low power to essential high performance.

Very much speculation:
SMT4 NN Branch Predictor -> 2x 4-wide Fetch-Decode (2 cores per FET/DEC) -> 4x Dispatch(Trace/Op/L0i cache)
4x Retire/Rename(2x FP Retire/Rename) -> 4x Cores(2 ALU/2 AGU for every core) + 2x FPU(2x FMAC+1 MMX per dual-core)
2x L0d(per core) -> 1x L1d(shared between two cores) =(Module)> 2x L1d(per dual-core cluster) -> 1x L2(shared per module) =(CPU Complex)> 2x L2(per module) -> 1x L3(shared between CPU Complex).

Power and density is priority, with speed being free. (Initially, Stoney Ridge clock rate + two more cores)

[....]

FX-series, A-series, E-series, Opteron-series, etc will be using this core for next-gen SKUs. With low-end(4C/4T) first, and high-end(64C/64T) last. Everything is budget. (Zen-lite became CMT?!)

CMT4 not SMT4, unless it can somehow do reverse multithreading and utilize two cores (or more) for one thread.

That sounds like merging two modules into one, and reverting back to Piledriver with the shared CMT front end.

Even without cache changes above, it sounds like a big project and almost as much work / cost as steamroller to excavator change.
 

DrMrLordX

Lifer
Apr 27, 2000
21,813
11,168
136
AMD still doesn't have the R&D budget for such an undertaking.

Chalk all this speculation up to more uarch designs discussed by Nosta that will never see the light of day, even in a lab.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
CMT4 not SMT4, unless it can somehow do reverse multithreading and utilize two cores (or more) for one thread.
It is SMT4, as the front-end of any CMT IP is shared. The use of SMT in the front-end is nowhere near as bad as the use of SMT in a core.

From CMT research:
In this section, we examine the effect of sharing the front-end in clustered multi-threaded processors given that the back-ends are privately assigned to the threads.

Partitioning the front-end into four clusters clearly degrades performance. For four threads, the performance loss is more pronounced as each front-end is occupied by a single thread, which greatly reduces its utilization. With four threads, moving from four to two front-ends has the greatest impact on the mix workloads (a 20%-30% improvement) due to their orthogonal machine resource requirements.

...a single, unified, front-end is the best performing option, and eliminates a crossbar connection to the back-ends.
That sounds like merging two modules into one, and reverting back to Piledriver with the shared CMT front end.

Even without cache changes above, it sounds like a big project and almost as much work / cost as steamroller to excavator change.
It is a much smaller project than producing a full new core from scratch. Rather, one that is mostly recycled to produce a new core. Majority of the changes have similar possible wire placement to previous designs.

Zen -> Zen+ -> Zen-Lite -> Zen 2 -> Zen-Lite 2 -> Zen 3, etc. With Zen-Lite being chosen to use CMT for the fact that all CMT designs use less power than SMT designs.

CMT4 with 4 ALUs per core will consume half the power of a SMT8 with 8 ALUs per core.
So, take that... do some weird voodoo math and stuff.
CMT4 with 2 ALUs per core should consume half the power of a SMT4 core with 4 ALUs per core // or should consume slightly less than half of a CMP2_SMT2 processor with 4 ALUs per core.
 
Last edited:

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
Nosta, really, we appreciate how imaginative this all is, but you should clarify that it's all just made up. I see some people in this thread and others actually taking these claims seriously, and goodness knows how many lurkers there are that don't know any better either.

So let's make this abundantly clear. AMD's given zero indication that they are ever going to use CMT again. Nor is there even the slightest bit of evidence to indicate that anyone's using FD-SOI for a mainstream consumer product. 14nm is cheap enough for an SD450, and thus cheap enough for anything AMD would use it for. There's no need at all to use such a different process.

Moreover, the claim that CMT would be better for power usage is pretty laughable in light of Bulldozer's failure. That lineage is dead; let it rest in peace.

The answer to this thread's question is really quite simple. The Bulldozer architecture has no future. End of story.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
Yes their public roadmap seems to show that excavator looks like it's at end of life.

Still, imho, this doesn't rule out a small low profile project. They don't announce all of their GPU projects either. A totally new post-excavator generation of dozers seems pretty unlikely though. If the new 22FDX with body biasing is as good (cheap and efficient) as Seronx thinks, I think a small shoestring 28nm port project would potentially make good sense. One die could fill a lot of niche areas that could economically be covered, and there would also be research to be gained.

Seronx, Piledriver efficiency was not too terrible under 3ghz, and Excavator efficiency was quite good---good enough for servers, thanks to the extensive gating and power savings techniques. I don't think it's generally more efficient than SMT though.

SMT is complex but increases utilization.

The other route to high efficiency is to go small core and low frequency. But even for small cores, Atom has added MT for their Phi server core; I believe SMT. For a single-threaded core, the number of cycles the core spends waiting for system memory is too significant, even at low frequencies like 1.5-2 GHz.

CMT4 with 4 ALUs per core will consume half the power of a SMT8 with 8 ALUs per core.
So, take that... do some weird voodoo math and stuff.
CMT4 with 2 ALUs per core should consume half the power of a SMT4 core with 4 ALUs per core // or should consume slightly less than half of a CMP2_SMT2 processor with 4 ALUs per core.

I have no idea what you mean above and don't know what is CMT4 with 2 ALUs per core, and CMP2_SMT2. I wouldn't expect that "CMT4 with 2 ALUs per core should consume half the power of a SMT4 core with 4 ALUs per core" since the former would have a total of 8 ALUs and the latter only 4 ALUs (plus more complex scheduler and more SMT tagging + circuitry in the core).
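The ALU count behind that objection can be spelled out. This sketch assumes one reading of the labels: "CMT4" as four cores with one thread each, "SMT4" as one core running four threads, and "CMP2_SMT2" as two cores with two threads each. That reading and the `Config` class are assumptions for illustration, not definitions given in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    cores: int
    threads_per_core: int
    alus_per_core: int

    @property
    def threads(self):
        return self.cores * self.threads_per_core

    @property
    def total_alus(self):
        return self.cores * self.alus_per_core


# Assumed interpretation of the three configurations being compared.
CMT4 = Config("CMT4, 2 ALUs per core", cores=4, threads_per_core=1, alus_per_core=2)
SMT4 = Config("SMT4, 4 ALUs per core", cores=1, threads_per_core=4, alus_per_core=4)
CMP2_SMT2 = Config("CMP2_SMT2, 4 ALUs per core", cores=2, threads_per_core=2, alus_per_core=4)

if __name__ == "__main__":
    for cfg in (CMT4, SMT4, CMP2_SMT2):
        print(f"{cfg.name}: {cfg.threads} threads, {cfg.total_alus} ALUs in total")
```

Under this reading all three run four threads, but the CMT4 layout carries twice the ALUs of the SMT4 core, which is why "half the power" does not follow from the ALU count alone.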
 