Question Intel Mont thread


DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
Skymont is a killer core:
-9-wide decode, up from 6-wide on Crestmont, using a 3x3 cluster configuration, up from 2x3
-New innovation: "Nanocode", which adds further microcode parallelism within each decode cluster
-Increased uop queue capacity from 64 to 96
-8-wide allocate, up from 6
-16-wide retire, from 8
-Dependency breaking to reduce instruction latency
-416 ROB, up from 256
-Bigger PRF, reservation stations, and L/S buffers
-26 dispatch ports, up from 17 on Gracemont
-Literal doubling of FP capability with 4x128-bit FMA units, versus 2 before.
-Doubled L2 bandwidth to 128 bytes per cycle
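
To put the list above in relative terms, here is a small sketch that turns the before/after figures into percentage increases. The numbers are copied from the list as posted; the dictionary keys and function names are just illustrative.

```python
# Before/after figures copied from the list above (Crestmont -> Skymont,
# except dispatch ports, which the post compares against Gracemont).
SPECS = {
    "decode width": (6, 9),
    "uop queue entries": (64, 96),
    "allocate width": (6, 8),
    "retire width": (8, 16),
    "ROB entries": (256, 416),
    "dispatch ports": (17, 26),
    "128-bit FMA units": (2, 4),
}


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


def summarize(specs):
    """Return {name: percent increase}, rounded to one decimal place."""
    return {
        name: round(percent_increase(old, new), 1)
        for name, (old, new) in specs.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(SPECS).items():
        old, new = SPECS[name]
        print(f"{name:20s} {old:>4} -> {new:<4} (+{pct}%)")
```

Most of the front-end and back-end structures grow by roughly half, while retire width and FP units double outright.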

All while offering a 3:1 area advantage and some power advantage over Lion Cove, the latest P core.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
I wonder why Intel didn't use Skymont for SRF. Its release window was very close to Skymont's release window.
It would have delayed the release further, and that's a risk they shouldn't take since they want to make up the time lag against competitors. It would also delay their 5N4Y plans, which would be bad.

Servers also need extra validation. Worry not though, Clearwater Forest uses the latest E core, the Skymont-based Darkmont.
 
Reactions: H433x0n

Mahboi

Golden Member
Apr 4, 2024
1,035
1,900
96
All while offering a 3:1 area advantage and some power advantage over Lion Cove, the latest P core.
Have we started having returns on the power draw of Skymont?
We say E-core, but considering the extra things, it could be a lot less E than it used to be, or very close. I'm curious to see details. If Intel hasn't mentioned power draw yet, I expect it's quite a bit worse.
 

NTMBK

Lifer
Nov 14, 2011
10,338
5,406
136
Have we started having returns on the power draw of Skymont?
We say E-core, but considering the extra things, it could be a lot less E than it used to be, or very close. I'm curious to see details. If Intel hasn't mentioned power draw yet, I expect it's quite a bit worse.
Depends on whether the E stands for power efficiency or area efficiency. Intel seem keen on reducing die size for some reason, now they have to pay TSMC for the wafers...
 

Khato

Golden Member
Jul 15, 2001
1,249
321
136
You can also think of SRF as somewhat of a proof of concept product. It's leveraging as much as possible from GNR and keeping core changes to a minimum. Which would be part of the reason why it ended up leap-frogging GNR and coming to market first. All the better for Intel as it's probably the better product for the majority of server volume.

Regarding Skymont efficiency - slide decks have plenty of performance versus power curves. It's better than Crestmont.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
Have we started having returns on the power draw of Skymont?
We say E-core, but considering the extra things, it could be a lot less E than it used to be, or very close. I'm curious to see details. If Intel hasn't mentioned power draw yet, I expect it's quite a bit worse.
The process and the design of the silicon their chips are built on are optimized for frequency, and thus are not completely optimal for the E core. The power curve is flatter for designs built to clock faster, so the designs capable of high frequencies take less power to reach them. However, the E cores stop scaling before the P cores do, and voltage requirements increase dramatically towards the end.

In the optimized range, they are indeed much more efficient. Also, just being 12% off their P core performance per clock means it won't take a lot for Skymont to be more power efficient than Lion Cove.
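
A rough way to see why a 12% per-clock deficit leaves so much room: at equal clocks, the E core only needs to draw less than 88% of the P core's power to come out ahead on performance per watt. The sketch below works that out. The 12% figure is from the post; the function names and the 0.5 power ratio in the example are purely hypothetical, not measured values.

```python
def breakeven_power_ratio(ipc_deficit):
    """Largest E-core/P-core power ratio at which the E core still
    matches the P core on performance per watt at equal clocks."""
    return 1.0 - ipc_deficit


def perf_per_watt_ratio(ipc_deficit, power_ratio):
    """E-core perf/W divided by P-core perf/W at equal clocks.

    ipc_deficit: fraction of per-clock performance the E core gives up.
    power_ratio: E-core power divided by P-core power (an assumption
                 supplied by the caller, not a published number).
    """
    return (1.0 - ipc_deficit) / power_ratio


if __name__ == "__main__":
    deficit = 0.12  # "12% off their P core performance per clock"
    print(f"break-even power ratio: {breakeven_power_ratio(deficit):.2f}")
    # Hypothetical: if the E core drew half the P core's power.
    print(f"perf/W advantage at 0.5x power: "
          f"{perf_per_watt_ratio(deficit, 0.5):.2f}x")
```

This ignores clock-speed differences and the shape of the voltage curve, so it only applies inside the range where both cores can run at the same frequency.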

Never mind the humongous area advantage.
 
Reactions: Tlh97 and Mahboi

Mahboi

Golden Member
Apr 4, 2024
1,035
1,900
96
The process and the design of the silicon their chips are built on are optimized for frequency, and thus are not completely optimal for the E core. The power curve is flatter for designs built to clock faster, so the designs capable of high frequencies take less power to reach them. However, the E cores stop scaling before the P cores do, and voltage requirements increase dramatically towards the end.

In the optimized range, they are indeed much more efficient. Also, just being 12% off their P core performance per clock means it won't take a lot for Skymont to be more power efficient than Lion Cove.
I never had any doubt about that just from the area.
But I'm curious if this is an E-core in the traditional sense anymore.
Are we going to see a switcheroo where AMD makes a Zen 5 LP core while riding the success of mainline Zen 5, while Intel is taking their E-cores to become their performance cores and their LP cores replace them as E-cores?
(and will Pat burn the Israel Design Center?)
 

trivik12

Senior member
Jan 26, 2006
343
317
136
Sierra Forest is showing huge improvements in efficiency, and that is with the Crestmont-plus core. Clearwater Forest will be a serious product that could target even more data center segments. On top of that, Panther Lake will house enhanced Skymont (Darkmont) on Intel 18A with RibbonFET/PowerVia tech. That would make comparisons between Lunar Lake and Panther Lake very interesting. How will Intel 18A compare to TSMC N3B? Of course, we will also see Arrow Lake Refresh on N3P?
 

TwistedAndy

Member
May 23, 2024
159
150
76
Have we started having returns on the power draw of Skymont?
We say E-core, but considering the extra things, it could be a lot less E than it used to be, or very close. I'm curious to see details. If Intel hasn't mentioned power draw yet, I expect it's quite a bit worse.

Yes, we have a slide covering the performance and power curves. I think Skymont will work great on lower power levels (1-5W per core), but Intel may decide to push it to 4-5GHz in the desktop Arrow Lake. Unfortunately, the right part of Skymont was not shown on the chart.

In general, Skymont can be renamed to Conroemont
 

Attachments

  • 1000008907.png
    165.8 KB · Views: 23
Jun 4, 2024
116
146
71
Yes, we have a slide covering the performance and power curves. I think Skymont will work great on lower power levels (1-5W per core), but Intel may decide to push it to 4-5GHz in the desktop Arrow Lake. Unfortunately, the right part of Skymont was not shown on the chart.

In general, Skymont can be renamed to Conroemont
Yeah, leaks state 4.6GHz for Conroemont. I like the name.
 
Reactions: DavidC1

TwistedAndy

Member
May 23, 2024
159
150
76
It may sound funny, but Conroemont (aka Skymont) is not that different from Apple Silicon P-cores in terms of internal structure. 9-wide decode, 8-wide allocation, 3 load units, 4 STA, 416 ROB...

Yes, it's not correct to compare CPUs this way, and a lot of details are simply missing, but still, it's not that different.
 

Attachments

  • GPPNrzKWEAAQgUj.jpg
    576.2 KB · Views: 19
  • GPPNr1XXIAAp-rD.jpg
    522.3 KB · Views: 17

TwistedAndy

Member
May 23, 2024
159
150
76
But with 25% lower IPC?

The problem with IPC is that it heavily depends on the test suite. And Geekbench 6.3 is not the best one for comparing different platforms. Actually, in some cases, you can't compare results within one platform (SME).

The only viable metric is the actual performance of the apps you're using at a fixed power limit.
 
Reactions: igor_kavinski

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
The M4 also has the advantage of a 9-stage pipeline compared to 14 stages for Intel's E cores. That alone is responsible for anywhere from a 10 to 20% difference in performance, assuming everything else is the same.
But Skymont doesn't clock much higher with those extra stages. What are they doing in the extra stages? Decode?
I'm not finding good information anywhere.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
But Skymont doesn't clock much higher with those extra stages. What are they doing in the extra stages? Decode?
I'm not finding good information anywhere.
There's obviously more to it than just pipeline stages. Itanium went from 800MHz with 10 stages to 1GHz with 8 stages on the same process technology, because the circuit design improved drastically thanks to the coordinated and experienced HP engineers.

Or take the fact that Athlon mostly matched Pentium III's clock speed despite some deficiency in its pipeline, while also facing Intel's much superior process.

Another example is when Nvidia greatly increased clock speeds with the Pascal generation and mentioned improved circuit design. Circuit design also affects many other details, such as the instruction latency of the execution units.

Additional unknowns are things like the branch predictor and other undisclosed features in Apple's chip, which come from hard work and inspiration rather than from copying what has been done previously.

Why do some companies make better cars than others? Why are some people smarter? Same question.

I'd love to see how they do with Darkmont and Arctic Wolf, and whether they can produce a class-leading core that rivals the Apple chips of that time in every area. After all, they started from the absolutely anemic Bonnell Atom.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,933
96
The M4 also has the advantage of a 9-stage pipeline compared to 14 stages for Intel's E cores. That alone is responsible for anywhere from a 10 to 20% difference in performance, assuming everything else is the same.
By the way, this is the number Intel quoted back in the Netburst days. "Each additional pipeline stage is responsible for roughly 2-4% impact in performance".
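
For anyone who wants to check the arithmetic, here is a quick sketch of how 2-4% per stage turns into the quoted 10 to 20% for a 9-stage versus 14-stage pipeline. The stage counts and the per-stage range are the ones from the posts above; the function names are illustrative, and the compounding variant is my own assumption about how the per-stage figure might stack, not something Intel stated.

```python
def linear_penalty(short_stages, long_stages, per_stage):
    """Total penalty if each extra stage simply adds per_stage."""
    return (long_stages - short_stages) * per_stage


def compounded_penalty(short_stages, long_stages, per_stage):
    """Total penalty if each extra stage multiplies performance
    by (1 - per_stage). This stacking model is an assumption."""
    extra = long_stages - short_stages
    return 1.0 - (1.0 - per_stage) ** extra


if __name__ == "__main__":
    for per_stage in (0.02, 0.04):
        lin = linear_penalty(9, 14, per_stage)
        comp = compounded_penalty(9, 14, per_stage)
        print(f"{per_stage:.0%} per stage: "
              f"linear {lin:.1%}, compounded {comp:.1%}")
```

Five extra stages at 2-4% each gives 10-20% when added linearly, which is where the range in the quote comes from. Compounding lands slightly lower, at roughly 9.6% to 18.5%.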

We know from deeply pipelined CPUs that the cost of more pipeline stages is greater than the branch misprediction penalty itself. Pentium 4 required far more transistors. Prescott had to balloon again. The extra transistors cost die size and power. There are likely other unknowns.

This is because prediction isn't perfect (it never will be). So while adding capabilities for better prediction is difficult and costly, and the benefit only shows up some of the time, adding extra stages is a guaranteed loss of performance.
 
Reactions: igor_kavinski