News Intel GPUs - Battlemage officially announced, evidently not cancelled


moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Do we have any solid info on how their packaging tech is failing them in this instance?
I don't think their packaging tech as is is failing them. Regardless of how good your packaging tech is, putting together 63 parts/47 chiplets can't be cheap nor have an excellent yield. They essentially made this design because they could and likely needed to hit a target not achievable otherwise, not because it's an economical and smart design (similarly to how I consider the mirrored chiplets of SPR as inane, not smart).
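To put rough numbers on the yield point, here is a minimal sketch. It assumes every part is placed independently with the same success rate, which is a simplification. The per-part yields are made-up illustrative values, not Intel figures, and `assembled_yield` is a hypothetical helper.

```python
def assembled_yield(per_part_yield: float, parts: int) -> float:
    """Chance that all `parts` independent placements succeed.

    Assumption: every part has the same yield and failures are independent.
    """
    return per_part_yield ** parts


if __name__ == "__main__":
    # Illustrative per-part yields only; the thread gives no real figures.
    for y in (0.999, 0.99, 0.98):
        print(
            f"per-part yield {y:.3f}: "
            f"47 chiplets -> {assembled_yield(y, 47):.1%}, "
            f"63 parts -> {assembled_yield(y, 63):.1%}"
        )
```

Under these assumptions even a 99% per-part yield leaves only a bit over half of the 63-part assemblies fully working, which is why the part count alone makes this expensive regardless of how good the packaging is.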
 
Reactions: xpea

jpiniero

Lifer
Oct 1, 2010
14,841
5,456
136
I don't think their packaging tech as is is failing them. Regardless of how good your packaging tech is, putting together 63 parts/47 chiplets can't be cheap nor have an excellent yield. They essentially made this design because they could and likely needed to hit a target not achievable otherwise

I could see it being done as an R&D exercise to feel out the challenges of doing something like that. Now it turned into a disaster because of 7 nm but the packaging shouldn't have been an issue.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
I don't think their packaging tech as is is failing them. Regardless of how good your packaging tech is, putting together 63 parts/47 chiplets can't be cheap nor have an excellent yield. They essentially made this design because they could and likely needed to hit a target not achievable otherwise, not because it's an economical and smart design (similarly to how I consider the mirrored chiplets of SPR as inane, not smart).
Well, you know the old saying, "if a little is good then more must be better". Right, right?
 
Reactions: moinmoin

DrMrLordX

Lifer
Apr 27, 2000
21,807
11,161
136
I don't think their packaging tech as is is failing them.

Hmm. Well I was sort of thinking, if you look at how much silicon is expended on EMIB on Sapphire Rapids, that's one instance of packaging tech underperforming vs. expectations. Or at least my expectations. I didn't think Intel would need that much area just to bind four (admittedly large) chiplets together. My ideas about what those silicon bridges would be like were completely off, at least with respect to Sapphire Rapids.

Not sure if Ponte Vecchio has forced Intel to make similar compromises.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Sapphire Rapids packaging allows monolithic-like latencies while going the tile route, so it's not all lost. Yes, it took them a long time to get there, but you have to start somewhere. Successive generations will solve issues like the mirrored die problem.

A350M is better performing than I expected. Yes it's late but it should allow them to get a foothold in the market.

@mikk I am saying that regardless of whether the die/transistor difference is 10% or 40%, Intel is behind in uarch. We can also see it from the top-end die being as large as Ampere's while being on a noticeably better process. They are close enough to be a threat but not cutting edge. I assume they need advancements in all areas, from fully optimizing for the process to a better uarch.

I don't know why it's so hard for you to admit that they are behind. You can be a fan of a company without giving excuses for them. They don't owe you anything - zero.

3DCenter reports Intel may have leaked specs of their desktop card.


In the ARC Control video, it shows a VRAM clock of 1.1GHz and a GPU clock of 2.25GHz. GPU power is 175W, so if we assume that is core power only, the whole board may be 220-230W as speculated by 3DCenter.

18TFlops of compute, while the A350M even at 1.5GHz is 2.3TFlops. Power vs compute scaling is pretty linear.
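As a sanity check on those figures, a rough sketch that backs out how many FP32 lanes each number implies. It assumes the usual convention of 2 FLOPs per lane per clock (one fused multiply-add); that factor and the helper name `implied_fp32_lanes` are my assumptions, not something from the leak.

```python
def implied_fp32_lanes(tflops: float, clock_ghz: float,
                       flops_per_lane_per_clock: int = 2) -> float:
    """Back out the FP32 lane count implied by a TFLOPS figure and a clock.

    Assumption: peak TFLOPS = lanes * clock * 2 (one FMA per lane per clock).
    """
    return tflops * 1e12 / (clock_ghz * 1e9 * flops_per_lane_per_clock)


if __name__ == "__main__":
    desktop = implied_fp32_lanes(18.0, 2.25)  # leaked desktop card figures
    mobile = implied_fp32_lanes(2.3, 1.5)     # A350M figures from the post
    print(f"desktop card: ~{desktop:.0f} lanes")
    print(f"A350M:        ~{mobile:.0f} lanes")
    print(f"ratio:        ~{desktop / mobile:.1f}x")
```

Both TFLOPS numbers are rounded, so the lane counts this prints are approximate.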
 

jpiniero

Lifer
Oct 1, 2010
14,841
5,456
136
A350M is better performing than I expected. Yes it's late but it should allow them to get a foothold in the market.

It's slower than Rembrandt and any of the 4 GB cards. Maybe it'd work as an MX alternative if it were somehow profitable to go that cheap.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Hmm. Well I was sort of thinking, if you look at how much silicon is expended on EMIB on Sapphire Rapids, that's one instance of packaging tech underperforming vs. expectations. Or at least my expectations. I didn't think Intel would need that much area just to bind four (admittedly large) chiplets together. My ideas about what those silicon bridges would be like were completely off, at least with respect to Sapphire Rapids.
But that's the thing: That's not related to packaging tech "underperforming" or some such, that's down to imo nutty design decisions. As @IntelUser2000 said they wanted monolith like latencies at all cost and that's how they paid for it, with plenty silicon area (and for whatever reason with little investment into area optimization as there seems to be plenty dark silicon).
 

DrMrLordX

Lifer
Apr 27, 2000
21,807
11,161
136
But that's the thing: That's not related to packaging tech "underperforming" or some such, that's down to imo nutty design decisions. As @IntelUser2000 said they wanted monolith like latencies at all cost and that's how they paid for it, with plenty silicon area (and for whatever reason with little investment into area optimization as there seems to be plenty dark silicon).

Yeah I just never sorted through my head how EMIB was going to affect interconnect performance. Maybe I was sipping a little too much Intel Kool-Aid in assuming they could get those latencies and speeds with smaller bridges.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
moinmoin said:
said they wanted monolith like latencies at all cost and that's how they paid for it, with plenty silicon area (and for whatever reason with little investment into area optimization as there seems to be plenty dark silicon).

For enterprise workloads such as TPC-C, the optimizations that they put in will be a great benefit. The only issue is that Sapphire Rapids is late, very late.

I liked it in the Xeon E5 days, when the lineup was split between the enterprise- and scalability-focused E7 and the smaller-server E5. Now they have combined the two, and that's part of the problem. The E7 chips need much longer to validate, and the performance optimizations specific to them take longer as well. With Skylake-SP they decided they wanted the benefits of both, but any sane person would know you don't get anything for free.

@jpiniero Not according to the Time Spy results. They are roughly even on the Default mode, and A350M is quite a bit faster on Performance.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
The only issue is that Sapphire Rapids is late, very late.
Makes one wonder what they did in all the time while they waited for the roadblock to be removed. Optimizing designs to not require mirrored dies and remove all unnecessary areas wasted by dark silicon could have been done while development on the process node held them back. As is many Intel designs lately seem very half baked and oddly rushed, despite being late.
 
Jul 27, 2020
17,967
11,709
116
As is many Intel designs lately seem very half baked and oddly rushed, despite being late.
They have to be scrapping some ambitious design late in the game and turning to plan B. What else could explain these delays, other than sheer laziness and/or incompetence?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
They have to be scrapping some ambitious design late in the game and turning to plan B. What else could explain these delays, other than sheer laziness and/or incompetence?

The problems date back to incompetent management in the Krzanich days, simple as that, and because development cycles are pretty long in total, they are still suffering from the aftereffects.

As is many Intel designs lately seem very half baked and oddly rushed, despite being late.

That's what being late means though. You don't delay to make it better, it's delayed because it has issues that needs to be resolved. It's so basic that you can see it in the projects that grade school kids make. The good ones aren't late and are done well. The bad ones are late and are bad, because the kids either didn't spend time on them or didn't care.

Look at NV30, Prescott, Bulldozer. Delayed and bad. Core 2 was stellar, and it was actually a month early.

Even when you take "exceptions" such as Blizzard in the glory days or Valve, it still applies, and creative projects are somewhat different from technical ones where being late directly means you are at a disadvantage. Good art is good art, while an old computer is bad.
 
Reactions: xpea

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
That's what being late means though. You don't delay to make it better, it's delayed because it has issues that needs to be resolved.
So far we assumed state of the process node was/is Intel's only issue. Design should be independent from that. Apparently design continuously had/has serious issues as well, and unlike with improving node yields designs stayed the rushed way until the end?
 

jpiniero

Lifer
Oct 1, 2010
14,841
5,456
136
So far we assumed state of the process node was/is Intel's only issue. Design should be independent from that. Apparently design continuously had/has serious issues as well, and unlike with improving node yields designs stayed the rushed way until the end?

Guessing a lot of it was that they were unsure if 10 nm would ever get good enough yield for something that big to be even feasible. So it's like why bother. The alternative would have been what happened with the gaming GPUs and fab them at TSMC. Of course that has a lot of downsides...
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
So far we assumed state of the process node was/is Intel's only issue. Design should be independent from that. Apparently design continuously had/has serious issues as well, and unlike with improving node yields designs stayed the rushed way until the end?

I don't think you can separate designs completely from process. If you look at all performance metrics(performance and performance per watt), a new process design is just better all the way around.

So design would be affected by process delays, because now the design team is unsure of what's happening. Look at why the Tick/Tock methodology came about. The design team needs to know the parameters of the process. The process is the foundation.

It would have rather turned out better if they just said "14nm will be cutting edge for 4 years" rather than promising "10nm will be next year" every year.

That's why designs like Goldmont Plus turned out really good since Atoms were purposely on an older process anyway, so they were much less subject to process delays.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
I don't think you can separate designs completely from process. If you look at all performance metrics(performance and performance per watt), a new process design is just better all the way around.

So design would be affected by process delays, because now the design team is unsure of what's happening. Look at why the Tick/Tock methodology came about. The design team needs to know the parameters of the process. The process is the foundation.

It would have rather turned out better if they just said "14nm will be cutting edge for 4 years" rather than promising "10nm will be next year" every year.

That's why designs like Goldmont Plus turned out really good since Atoms were purposely on an older process anyway, so they were much less subject to process delays.
While I agree with you in general, note that we are talking about basic stuff done early while planning the design:
  • Lay out the chiplet so that it's symmetric on two sides, instead of the current asymmetric way that necessitated creating a mirrored die.
  • Organize the layout in a way that doesn't introduce plenty of dark silicon next to the areas needed for EMIB.
Neither of those should even touch performance metrics (which is done when a design is realized for a specific node).

The only way I can see the current product as the result of a halfway sane development process is that the decision to split up SPR into four chiplets itself was done late in development, essentially splitting up an existing huge design that through some existing symmetry allowed to reuse two of the four cuts, one of them achievable through mirroring, and simply stick to it the areas required for EMIB to put the cuts back together. But that'd be a whole other category of a rush job.
 
Reactions: xpea

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
The only way I can see the current product as the result of a halfway sane development process is that the decision to split up SPR into four chiplets itself was done late in development, essentially splitting up an existing huge design that through some existing symmetry allowed to reuse two of the four cuts, one of them achievable through mirroring, and simply stick to it the areas required for EMIB to put the cuts back together. But that'd be a whole other category of a rush job.

The dark areas of silicon are likely there because they had different plans, since this was originally a 2020 product. Yes, it was supposed to be that early.

As you get delays into 2021, then into 2022, the competitive landscape changes and obviously they are aware of that, so designs were changed. Instead, Icelake became the 2021 product, and based on clocks they have parametric yield issues on the first commercial 10nm.

So that is still very likely due to the fallout of 10nm. Remember the entire management was in denial about the state of their 10nm process. SPR in 2022 may as well be a completely new chip compared to the original 2020 one.

Back to topic: ARC benchmarked against GTX 1650 Mobile. The 1650 is about 20% faster.

3dmark, etc isn't a true reflection of actual gaming performance. AMD's slide is more realistic.

3DMark is quite relevant if compared with the right settings. The integrated GPUs run the tests at unplayable frame rates, thus overly stressing the GPU. The majority of game comparisons are done at playable frame rates. Intel iGPUs tend to give a bit too much power to the CPU, so in games they tend to perform worse than in 3DMark. However, when the devices compared are in the 30-40 fps range, the latest 3DMark tests reflect it quite well.

Also, if you say 3DMark is inflated for Intel, then that is also true for the 1650 Mobile, as it gets slightly under 4K GPU points in Time Spy and compares well with the 3.1K the top A350M got. I am betting the 1650 in that YouTube comparison is exceeding the top score of 3700 achieved by NBC, since it is paired with the latest Ryzen mobile rather than a Comet Lake that isn't top of the line.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Performance in Cyberpunk 2077 and Horizon Zero Dawn is respectable. I hate that reviewer for not testing raytracing.

Probably a stretch since they are both in the 30-40 fps range.

Another comparison to prove my point that 3DMark is underrated and is actually pretty accurate.


In Fire Strike the gap between those two is small. In Time Spy it is greater than 50%.

-In a few games the Radeon 680M is faster than the 1650 Mobile at the lowest settings, but the 1650 catches up and oftentimes exceeds the 680M.
-In many games the 680M shows a smaller gap of about 20%, but at higher settings the gap stretches to 30, 40, 50%.

I bet you Fire Strike is a less demanding benchmark and thus is a better representation for lower settings and higher frame rates. But at higher settings and demanding games at lower frame rates, Time Spy is reasonably accurate.

In most titles, especially esports ones, ARC 3 might be only about 20% faster than the 680M. But at higher settings and lower frame rates the gap is going to grow, up to the 40% gap Time Spy shows. Of course this is also dependent on the TDP settings of ARC. The Default mode for the A350M got 2400 points, which is only 5-10% better than the 680M. At the higher-power Performance mode it gets 3100.
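A quick sketch of the arithmetic behind those percentages, using only the Time Spy scores quoted above. The 680M score is not stated in this thread, so it is backed out from the "5-10% better" claim and should be treated as an assumption; the helper names are hypothetical.

```python
def percent_faster(score_a: float, score_b: float) -> float:
    """How much faster A is than B, as a fraction (0.25 means 25% faster)."""
    return score_a / score_b - 1.0


A350M_DEFAULT = 2400      # Time Spy GPU score, Default mode (from the thread)
A350M_PERFORMANCE = 3100  # Time Spy GPU score, Performance mode (from the thread)

# Assumption: the 680M score range implied by "Default is 5-10% better".
r680m_high = A350M_DEFAULT / 1.05
r680m_low = A350M_DEFAULT / 1.10

if __name__ == "__main__":
    print(f"Performance vs Default mode: {percent_faster(A350M_PERFORMANCE, A350M_DEFAULT):.1%}")
    print(f"Implied 680M score range: {r680m_low:.0f} to {r680m_high:.0f}")
    print(
        "Performance mode vs 680M: "
        f"{percent_faster(A350M_PERFORMANCE, r680m_high):.1%} to "
        f"{percent_faster(A350M_PERFORMANCE, r680m_low):.1%}"
    )
```

Under that assumption the Performance-mode gap over the 680M comes out in the mid-30s to low-40s percent, which is consistent with the roughly 40% figure above.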

This doesn't take into account another factor I mentioned in the above post (and a few times before elsewhere). At least for their iGPUs, their power management algorithm prioritizes the CPU cores too much, robbing the GPU. I remember a YouTube video where the Intel CPU was running at 3GHz+ while Cezanne was only doing about 1.5GHz. The dGPU is much less likely to suffer from the issue, even with Deep Link.

Most of us are aware that Zen 2/3 mobile CPUs are quite power efficient. Sunny Cove/Willow Cove are not, and can reach 17W with just a single thread active.
 

Asterox

Golden Member
May 15, 2012
1,028
1,786
136
Lol, GPU-Z default GPU clock 2200 MHz.


Intel claims, hm, 1150 MHz is for the lowest TDP settings. Perhaps we should still ask the Klingons for the exact GPU operating frequency range.


 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Source for the A350M scores?

It was posted just a few pages ago. Time Spy GPU for Samsung's ARC A350M gets 2400 points in Default mode and 3100 in Performance mode.

@Asterox While clocks are a bit different, I wouldn't rely on GPU-Z too much for Intel info. Things like ROPs and TMUs are entered manually. I had to correct them for earlier Intel GPUs.
 

mikk

Diamond Member
May 15, 2012
4,173
2,211
136
Back to topic: ARC benchmarked against GTX 1650 Mobile. The 1650 is about 20% faster.


This is a fake channel, there are many of these on YouTube. Unless they can show the notebook, you can be sure it's another fake benchmark. No reputable notebook tester in the world has got an Arc-powered notebook yet. I think the faker made a mistake: the Samsung Galaxy Book 2 with a 12700H is not real. It seems to be P models only.
 