Intel Ocean Cove Thread (next gen core design)

TheELF · Apr 30, 2018

FIVR said:
I think the key to Intel's next architecture (and the key to AMD's current one) is the ability to implement MCM. It is obvious that process tech improvements are slowing and becoming more expensive, so the necessity to leverage die area of multiple packages will become important for performance in the next decade.

Anything that more cores would give a big performance boost to already gets a big performance boost from GPUs, which is why both companies came to the conclusion back in 2005 that they would have to incorporate GPUs into their CPUs as co-processors. The main thing is that Intel is trying to get some good GPU IP going; that's my guess.
For very specialized things that can't be run on a GPU, people already use many-core ARM chips or stuff like Xeon Phi: a lot of cores, each running very slowly. A mainstream CPU doesn't need such a co-processor and won't get one for many, many years.

TheF34RChannel said:
I'm opting for Moonbogg's alien technology. I think that something completely new and modern is the way forward, rather than redesigning the old and then sleepwalking with that for another decade. Unless they'd enjoy being overtaken by their sole competitor in the near future.

They can't be overtaken. They could, in theory, be matched, since once you are maximized there is no beyond maximized, and if they did get matched, Intel would still be the first who got there.

fleshconsumed · Apr 30, 2018

Interesting news. I just hope AMD stays competitive. Last 10 years before Ryzen were pretty painful. At least Intel doesn't have the process advantage anymore, so both Intel and AMD will be playing on a level field.

Thunder 57 · Apr 30, 2018

PingSpike said:
Didn't JK want to make a new kind of ARM core with AMD before he left? I thought I read that somewhere.

Rumor was he was more interested in K12 than Zen, and when K12 got shelved, he started looking elsewhere.

Donts00tmesanta · Apr 30, 2018

LTC8K6 said:
They already said, several times, when those would be fixed.

Ok, when...

IntelUser2000 · Apr 30, 2018

ksec said:
I like to think this time it is different (I hope).

Haven't you heard of the term "The more things change, the more they stay the same"?

Revolutionary means it can apply ideas that are radically different from what has been done before. The problem with revolution is that it tends to fail at striking the right balance. Intel's EPIC ISA failed because it was off balance and tried to push too much work to compilers. Pentium 4's Trace Cache didn't work because the design relied on it exclusively. Once Intel re-adopted the idea for Sandy Bridge, not as a replacement for existing ideas but as a complement to them, it worked very well.

I hope they find the right balance, but do not expect a real, total ground-up redesign like the architectures that failed. Those don't work.

Regarding the alien architecture Moonbogg was talking about. Sure it might be fantastic but I doubt you'll be able to run anything you currently have on it.

Thunder 57 · Apr 30, 2018

IntelUser2000 said:
Pentium 4's Trace Cache didn't work because the design relied on it exclusively. Once Intel re-adopted the idea for Sandy Bridge, not as a replacement for existing ideas but as a complement to them, it worked very well.

Indeed. Too bad AMD didn't figure that out until Zen. I'm really curious as to what more can be done. Two things that come to mind lately are the uop cache and micro-op fusion. Is there anything interesting out there that is being worked on? I'm sure there is, but we won't find out until it comes out.

Regarding the alien architecture Moonbogg was talking about. Sure it might be fantastic but I doubt you'll be able to run anything you currently have on it.

No, I'm pretty sure they've got that all figured out. Haven't you seen Independence Day?

IntelUser2000 · May 1, 2018

Thunder 57 said:
No, I'm pretty sure they've got that all figured out. Haven't you seen Independence Day?

Yea I did. The alien mothership got hacked by a PowerBook 5300 that didn't even have WiFi. I'm pretty sure if they invaded us now we'd totally own them with our Coffee Lake Ultrabooks.

uop cache and micro-op fusion.

Yea, however the uop cache is based on the Trace Cache, and uop fusion may be a feature particular to x86, so neither is entirely new. The technologies that are being used today often come from ideas that were thought of in the '70s.

Thunder 57 · May 1, 2018

IntelUser2000 said:
Yea I did. The alien mothership got hacked by a PowerBook 5300 that didn't even have WiFi. I'm pretty sure if they invaded us now we'd totally own them with our Coffee Lake Ultrabooks.

Well there was clearly a PowerPC to Alien translation/emulation layer going on.

Yea, however the uop cache is based on the Trace Cache, and uop fusion may be a feature particular to x86, so neither is entirely new. The technologies that are being used today often come from ideas that were thought of in the '70s.

I never did look into the trace cache all that much. It sounds very much like a uop cache. Are there differences? Micro-op fusion has been around since Conroe I want to say, so certainly not new. I'm just wondering if there is anything else out there that we may see in a new architecture. I've heard about Speculative Multithreading from David Kanter among a few others. By the sounds of it, it just wasn't all that feasible. Sort of like the compilers that were supposed to make Itanium awesome.

IntelUser2000 · May 1, 2018

Thunder 57 said:
I never did look into the trace cache all that much. It sounds very much like a uop cache. Are there differences?

The biggest difference is in the way the CPU relies on them. Pentium 4 had a single decoder and no L1 instruction cache. The Trace Cache was supposed to replace both. If it had a high hit rate it would have been great, because it would have cut out several pipeline stages. But despite its large physical size it wasn't big enough. They could have opted for 3-wide decoders and an L1 I-cache, but this was back in the early 2000s, when architecture was limited by transistor counts and die sizes, and it would have simply made the core too large. Plus it might have impacted frequency somewhat.

The idea for the Trace Cache was that by minimizing decoders and cutting down the L1 cache you'd end up with a core that's small and scales well in frequency, while the Trace Cache would make up for the lack of decoders and I-cache. The core was still bloated and way too large, and it also used lots of power.

Sandy Bridge's uop cache merely helps out when the 4-wide decoders and the L1 I-cache are not enough. So the worst-case scenario isn't as absolutely horrific as it was with Netburst chips.

Micro Op fusion has been used since Pentium M back in 2003.

Thunder 57 · May 1, 2018

IntelUser2000 said:
The biggest difference is in the way the CPU relies on them. Pentium 4 had a single decoder and no L1 instruction cache. The Trace Cache was supposed to replace both. If it had a high hit rate it would have been great, because it would have cut out several pipeline stages. But despite its large physical size it wasn't big enough. They could have opted for 3-wide decoders and an L1 I-cache, but this was back in the early 2000s, when architecture was limited by transistor counts and die sizes, and it would have simply made the core too large. Plus it might have impacted frequency somewhat.

Yes, as it was, the P4 was already significantly larger in die size than contemporary Athlons. So to clarify, the trace cache is basically a uop cache, except that in modern designs, there is still an L1 instruction cache?

Micro Op fusion has been used since Pentium M back in 2003.

Wow, that is a bit further than I had thought. I'm not surprised it came from the Pentium M team though.

IntelUser2000 · May 1, 2018

Thunder 57 said:
Yes, as it was, the P4 was already significantly larger in die size than contemporary Athlons. So to clarify, the trace cache is basically a uop cache, except that in modern designs, there is still an L1 instruction cache?

And the decoders, the lack of which may be more important than L1 I-cache. Remember, the Netburst designs having only 1 decoder was a big deal, because going from 1 to 2 decoders results in a relatively easy, large improvement, in some cases nearly 2x, and Pentium III design had 3 of them.

x86 designs use decoders to convert more complicated x86 instructions into internal micro-ops. The Trace Cache stores decoded instructions, so if the instructions you need are already in the Trace Cache (called a cache hit), you get to skip the decoding stages, which also cuts down on pipeline stages. You get the benefit of not needing to decode, and the benefit of a shorter pipeline, which cuts down the penalty of a branch misprediction, which is costly.

The Trace Cache had to do more, so it's more complicated and bigger than a uop cache.
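To make the hit-versus-decode idea concrete, here is a toy sketch in Python. It is not how any real Intel front end works: the `DECODE_TABLE` entries, the `FrontEnd` class, the FIFO eviction and the capacities are all made up for illustration. It only shows that a cache of decoded micro-ops lets a hot loop skip the decoder, and that a cache too small for the loop ends up decoding every time.

```python
from dataclasses import dataclass, field

# Hypothetical decode table: one x86-like instruction -> internal micro-ops.
# The micro-op names are invented for this sketch.
DECODE_TABLE = {
    "add eax, [mem]": ["load tmp, [mem]", "add eax, tmp"],
    "inc ecx": ["add ecx, 1"],
    "jnz loop": ["branch_if_not_zero loop"],
}


@dataclass
class FrontEnd:
    """Toy front end with a small cache of already-decoded micro-ops."""
    capacity: int = 4                       # entries the cache can hold (assumed)
    uop_cache: dict = field(default_factory=dict)
    decodes: int = 0                        # times the decoder had to run
    hits: int = 0                           # times decoding was skipped

    def fetch(self, address, instruction):
        # Cache hit: the decoded micro-ops are reused, no decode needed.
        if address in self.uop_cache:
            self.hits += 1
            return self.uop_cache[address]
        # Cache miss: run the (slow) decoder and remember the result.
        self.decodes += 1
        uops = DECODE_TABLE[instruction]
        if len(self.uop_cache) >= self.capacity:
            # Simple FIFO eviction: drop the oldest inserted entry.
            self.uop_cache.pop(next(iter(self.uop_cache)))
        self.uop_cache[address] = uops
        return uops


def run_loop(front_end, program, iterations):
    """Execute the same loop body repeatedly and report (decodes, hits)."""
    for _ in range(iterations):
        for address, instruction in program:
            front_end.fetch(address, instruction)
    return front_end.decodes, front_end.hits


# A three-instruction loop body at made-up addresses.
PROGRAM = [(0x10, "add eax, [mem]"), (0x13, "inc ecx"), (0x14, "jnz loop")]
```

Running `run_loop(FrontEnd(capacity=4), PROGRAM, 10)` decodes each instruction once and hits on the other 27 fetches. With `capacity=2` the loop no longer fits, the FIFO thrashes, and all 30 fetches go through the decoder, which is the "not big enough" case in miniature.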

I'm not surprised it came from the Pentium M team though.

Don't make the mistake of thinking of them as a better team. Internally I thought they might have different goals. The original Pentium M team, from Haifa, was more core-focused, while the Oregon team was more platform and I/O focused. You get the beefier core from Haifa, and things like Hyperthreading and the integrated memory controller + QPI from Oregon. I am not sure if the distinction exists anymore. If it does, Skylake is Haifa's.

Thunder 57 · May 1, 2018

IntelUser2000 said:
The Trace Cache stores decoded instructions, so if the instructions you need are already in the Trace Cache (called a cache hit), you get to skip the decoding stages, which also cuts down on pipeline stages. You get the benefit of not needing to decode, and the benefit of a shorter pipeline, which cuts down the penalty of a branch misprediction, which is costly.

Yea, I am familiar with the decoding. I just thought that a uop cache consisted of decoded instructions. Hence, "micro-op cache". If you get a hit there you save the decoding stage and some power. I'm just trying to understand how the trace cache was different.

Thunder 57 · May 1, 2018

IntelUser2000 said:
Don't make the mistake of thinking of them as a better team. Internally I thought they might have different goals. The original Pentium M team, from Haifa, was more core-focused, while the Oregon team was more platform and I/O focused. You get the beefier core from Haifa, and things like Hyperthreading and the integrated memory controller + QPI from Oregon. I am not sure if the distinction exists anymore. If it does, Skylake is Haifa's.

I don't mean to say that the P4 was total crap. It must've had excellent branch prediction for its time to keep that pipeline filled. I am really curious as to what Intel did to keep Prescott IPC right around Northwood despite the significantly longer pipeline. Branch prediction, increased L1 and L2. Beyond that I have no idea.

naukkis · May 1, 2018

IntelUser2000 said:
And the decoders, the lack of which may be more important than L1 I-cache. Remember, the Netburst designs having only 1 decoder was a big deal, because going from 1 to 2 decoders results in a relatively easy, large improvement, in some cases nearly 2x, and Pentium III design had 3 of them.

One decoder was never a problem with Netburst, as instructions are decoded only if L1i misses. Of course, having more decoders and a real L1i is beneficial if there's die area to spend on them and a way to power down the decoders when the uOP cache hits, to prevent them from wasting power. Both options need transistors, which weren't free in the Netburst era.

Nowadays Intel CPUs power down the decode stage when loops fit in the uOP cache, so there is no power penalty from having a more complicated front end. That front end is most beneficial at context switches, when the uOP cache is empty (the uOP cache has to be virtual).

But nowadays it is pretty much guaranteed that x86 will lose the efficiency race against ARM because of complex instruction decode wasting power, and getting even with ARM means that hardware compatibility with x86 will be lost.

stuff_me_good · May 1, 2018

I love proper competition and all new technology, and I like the idea of Intel getting Jim Keller, but at the same time I'm worried that in a few years' time AMD is going to fall off a cliff like in the Phenom days, because there is no one to steer the CPU design, and in their infinite wisdom they do something horribly disastrous.

Yes, Zen still has a lot to give, but if Intel has a new design in 4 years that is faster out of the box than its current one at that time, it will trounce the latest and greatest Zen design, which has barely been able to catch up to Intel's latest *lake design.

And then we get to the point where the monopoly continues and people are bitching about not having competition, yet doing nothing about it, like voting with their wallets.

NTMBK · May 1, 2018

stuff_me_good said:
I love proper competition and all new technology, and I like the idea of Intel getting Jim Keller, but at the same time I'm worried that in a few years' time AMD is going to fall off a cliff like in the Phenom days, because there is no one to steer the CPU design, and in their infinite wisdom they do something horribly disastrous.

Yes, Zen still has a lot to give, but if Intel has a new design in 4 years that is faster out of the box than its current one at that time, it will trounce the latest and greatest Zen design, which has barely been able to catch up to Intel's latest *lake design.

And then we get to the point where the monopoly continues and people are bitching about not having competition, yet doing nothing about it, like voting with their wallets.

A modern CPU design team is way bigger than one person. Hopefully the culture and processes that created Zen are still in place at AMD, and will lead to more great designs in the future.

IntelUser2000 · May 1, 2018

naukkis said:
One decoder was never a problem with Netburst, as instructions are decoded only if L1i misses. Of course, having more decoders and a real L1i is beneficial if there's die area to spend on them and a way to power down the decoders when the uOP cache hits, to prevent them from wasting power. Both options need transistors, which weren't free in the Netburst era.

Please read previous posts.

Pentium 4 did not have an L1 I-cache. The Trace Cache was it. And I bet that while the overall impact may not have been as big as having just 1 decoder suggests, there would still have been a non-negligible impact, and in some scenarios the impact was huge: when there was nothing in the Trace Cache, it was indeed a 1-wide chip.

I don't mean to say that the P4 was total crap. It must've had excellent branch prediction for its time to keep that pipeline filled.

Yea, I just thought to mention that because people tend to put the likes of Jim Keller, Steve Jobs, Elon Musk, and even teams like Haifa on a pedestal. Yea, they do good work, sure. But remember that Jobs got pushed out of Apple before he came back to shape Apple into what it is today. We need mistakes to shape us for the better.

About the Trace Cache, yeah, there are low-level technical differences, but the concept is similar. They both store decoded instructions.

jpiniero · May 1, 2018

Kind of assuming this is the server-focused product that comes after the Rapids. Given Intel's interest in accelerators, maybe this is where they really get serious about enabling non-CPU tiles.

DisEnchantment · May 1, 2018

Most likely it will be Zen 5 vs Ocean Cove. Zen 2 would have taped out already, and Zen 3 would have its key design goals frozen.
Mike Clark already started defining what the future of Zen 5 would look like and incoming challenges would definitely charge them up which is a very good thing for innovation.
Hopefully AMD can generate enough profit in the next 18 months to secure resources and talent for Zen 5. The outcome after 2021-22 would be an epic showdown again after so many sloppy years.
Core vs Core design, no more process handicap.

LTC8K6 · May 1, 2018

kyubi said:
Ok, when...

AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.

Donts00tmesanta · May 1, 2018

LTC8K6 said:
AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.

So, the architecture in question WILL have the fixes?

scannall · May 2, 2018

LTC8K6 said:
AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.

AMD's 470 chipsets already have the Spectre 1 and 2 mitigations in place. Meltdown is Intel only. Intel has said sometime this year.

LTC8K6 · May 2, 2018

scannall said:
AMD's 470 chipsets already have the Spectre 1 and 2 mitigations in place. Meltdown is Intel only. Intel has said sometime this year.

I'm not aware of any hardware fixes by either Intel or AMD.

LTC8K6 · May 2, 2018

kyubi said:
So, the architecture in question WILL have the fixes?

This is unknown at the moment.

LTC8K6 · May 2, 2018

Intel and AMD announcements of hardware fixes. Intel by end of 2018, AMD in 2019 with Zen version 2.

https://www.tomshardware.com/news/intel-in-silicon-fix-meltdown-spectre,36405.html

https://www.networkworld.com/articl...ns-silicon-fix-for-spectre-vulnerability.html

Remember, for most home PC users, these exploits are not really anything to worry about.
No examples of them have been found in the wild, either.
