Intel processors crashing Unreal engine games (and others)

Page 33

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,062
15,200
136
Since it's for pro use, I would return it now and just move to an AMD 7950X system. Better to have a working work PC.
I totally agree. Return all the hardware and build a 7950X. Better performance for your uses, better reliability, and cooler running.
 
Reactions: carancho and Ranulf

Ranulf

Platinum Member
Jul 18, 2001
2,510
1,571
136
From the GN interview @18m34s


"One of the game devs straight up said its gonna cost me over $100k in lost players."

The part 30s later about the out-of-VRAM error is interesting. One of the games can't use more than 20GB of VRAM, but they're getting player reports of bugs with that occurring (I'm assuming on cards with 20+GB of VRAM).
 

MangoX

Senior member
Feb 13, 2001
591
108
106
Seems odd to me that they would not use workstation CPUs in workstation boards. And they did not say workstation, they said server, which is sure confusing to me. Desktop, workstation and server are all different CPUs at Intel as well as AMD. Even AMD has ECC-rated server chips in some lower-end workstation server boards. Never seen ANY desktop chip used or talked about as a server chip, but I guess there are always exceptions, and that would make sense in that respect, as even at pure stock that chip appears to degrade. That would also negate what some users say, that at stock they are fine.
Any chip can be used in a server, even desktop chips. A server is just a machine that runs 24/7 serving users. If you look up dedicated servers, you'll find lower-end machines running client CPUs. They're extremely popular and cost effective. Intel 13th/14th generation chips are extremely popular due to their very high single-thread performance. Web front ends and especially game servers love these processors. These days the line between server and desktop is slowly getting blurred. AMD is partly to blame for this, because their client and enterprise products pretty much share the same CCDs, with only the socket and IOD changing the target market.

Today, I wouldn't touch any Raptor Lake processor with a 10 foot pole. It's a terrible moment in time for Intel. They could do a recall, but that would be admitting there is a problem with their product. ARL/LNL can't come fast enough. They have nothing else competitive within their portfolio they can fall back on.

In any case, Intel needs to ride out the storm until these new products are released. It is just a few months until their new gen arrives. But how long can they weather the storm if news keeps cropping up regarding these CPUs? The more time that passes and the more headlines that are made, the more people will demand answers.
 

Jan Olšan

Senior member
Jan 12, 2017
400
685
136
I just purchased a new build with a Core™ i9-14900KF and the processor arrived today. Someone from another thread pointed me to this thread, and now I'm really concerned I'm going to have instability problems. Has there been a consensus in this thread about what's wrong? Are there any mitigations? And ultimately, should I return my CPU?

If you can return it and get a refund, I would do it.

Disclaimer: I'm more fond of AMD than Intel, but this issue is looking extremely bad.

There is a growing likelihood that the issue is unfixable, and there is also a growing likelihood that what is happening is all those CPUs slowly dying and it's just a question of how long before it manifests. There have been 2 cases of Intel products degrading like that, or actually more - the first case was Sandy Bridge chipsets, then it repeatedly happened to more than one generation of Atom SoCs. Based on that, Intel may have insufficient aging testing/simulation checks... I mean, it's not 100% sure they won't find a fix. I'll be conservative and say it's 20-50% likely that the issue is unfixable - but I think even a 10-20% chance is not worth the risk. To me, having to change computers is a headache (you may be more willing to do it if you handle that routinely).

In the past, solutions to such crashing bugs, when they appeared (Skylake had one, Zen had one - off the top of my head), were usually found and published within 2 months. Companies usually made more public statements early on, probably when they were confident they would fix it and thus didn't need to try to sweep it under the carpet.
Intel has kept silent and came up with a few PARTIAL mitigations, but the public statement is that they still have not found the root cause. That is after 5 months of publicity, and they have likely known about the crashes for longer than that. So that can conversely mean they are afraid they won't fix this.

This has huge vibes of stuff being seriously wrong and unfixable. I mean, there is a chance you will hit the issue and get a refund then, problem fixed. If you are less lucky, you could get a replacement that will develop issues again later, but outside the warranty period.

If you can just swap the CPU, it's probably a better idea to do it, to be safe rather than sorry later. It's not clear you are in a guaranteed doom scenario with 13900K, but the signs pointing towards serious issues are getting stronger and stronger.

If you don't want to swap your other hardware (board...), you could get a lower-tier CPU like the 13500 or 14400, or a 12th gen CPU like the 12700K or 12900K. Those don't have issues reported - though I would be worried they still suffer from the underlying problem, only with a slower progression that has kept the issue hidden, and that they too will start showing signs of degradation after 5-6 years, for example.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,437
5,415
136
The fact that 13900K/F/S and 14900K/F/S chips are initially passing all tests and then sporadically failing after just months of use in a hosted environment suggests the problem is likely to be fundamentally physical. I am skeptical it is possible to be fixed with microcode or BIOS updates.

1) These aren't consumer boards - they are using W680 class workstation boards with ECC RAM (3 separate datacenter type operations, per GN interview with Wendell)
2) They have overkill cooling in a datacenter environment - Wendell mentioning he sees peak hotspot temps in high 60s to low 70s.
3) #2 supports the presumption they likely never exceeded Intel specs unlike some consumer boards/chips - operating conditions most likely didn't exceed the recently released Intel "baseline" profile given the very cool temps

Bonus: It's not a consistent failure mode. Sometimes a P-core or two drop. Sometimes an E-core drops. Sometimes it's a memory error. They were resorting to downclocking to a max of 5.3GHz and DDR5-4200 (not a typo) to try to obtain stability and sometimes even that doesn't work.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,484
24,218
146
The fact that 13900K/F/S and 14900K/F/S chips are initially passing all tests and then sporadically failing after just months of use in a hosted environment suggests the problem is likely to be fundamentally physical. I am skeptical it is possible to be fixed with microcode or BIOS updates.

1) These aren't consumer boards - they are using W680 class workstation boards with ECC RAM (3 separate datacenter type operations, per GN interview with Wendell)
2) They have overkill cooling in a datacenter environment - Wendell mentioning he sees peak hotspot temps in high 60s to low 70s.
3) #2 supports the presumption they likely never exceeded Intel specs unlike some consumer boards/chips - operating conditions most likely didn't exceed the recently released Intel "baseline" profile given the very cool temps

Bonus: It's not a consistent failure mode. Sometimes a P-core or two drop. Sometimes an E-core drops. Sometimes it's a memory error. They were resorting to downclocking to a max of 5.3GHz and DDR5-4200 (not a typo) to try to obtain stability and sometimes even that doesn't work.
Wendell tipped us off with that last comment that if you have one of those CPUs, "so sorry and good luck." Steve says a source he considers reliable indicates millions of CPUs are affected.

The fact they are saying different things to OEMs and "smaller" customers that buy 1000s of CPUs is also telling. The hubris is real.

"downclocking to a max of 5.3GHz and DDR5-4200" Imagine owners seeing the bigger bar better benchmarks at those settings. ☠️ Meanwhile our sister site Tom's posted this line earlier this month - "Intel launched its 14th-Gen Raptor Lake Refresh processors, with the Core i9-14900K, Core i7-14700K, and Core i5-14600K, all based on the tried-and-true Raptor Lake architecture" LOLZ you can't make this stuff up!
 

poke01

Golden Member
Mar 8, 2022
1,998
2,535
106
I cannot stress how important efficiency is. Intel has a long history of pushing it too far. The Core i9s in laptops in 2018-2019 were an example of this. The 14nm node couldn't handle the 8-core Skylake+++++++ in MacBook Pros and the Dell XPS.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,485
2,407
136
Wendell tipped us off with that last comment that if you have one of those CPUs, "so sorry and good luck." Steve says a source he considers reliable indicates millions of CPUs are affected.

The fact they are saying different things to OEMs and "smaller" customers that buy 1000s of CPUs is also telling. The hubris is real.

"downclocking to a max of 5.3GHz and DDR5-4200" Imagine owners seeing the bigger bar better benchmarks at those settings. ☠️ Meanwhile our sister site Tom's posted this line earlier this month - "Intel launched its 14th-Gen Raptor Lake Refresh processors, with the Core i9-14900K, Core i7-14700K, and Core i5-14600K, all based on the tried-and-true Raptor Lake architecture" LOLZ you can't make this stuff up!
5.3 GHz is Alder Lake clock territory. Perhaps their node/designs really are just not capable of clocking higher reliably and Intel pushed it anyway.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,492
5,055
96
The fact that 13900K/F/S and 14900K/F/S chips are initially passing all tests and then sporadically failing after just months of use in a hosted environment suggests the problem is likely to be fundamentally physical
Well yeah, it's a repeat of 1.13 GHz Coppermine.
It's also kinda impossible to miss given modern Si reliability sims.
This reeks of product-level YOLO.
There have been 2 cases of Intel products degrading like that
More.
1.13G Coppermine was just canned in a more timely manner.
 

RnR_au

Platinum Member
Jun 6, 2021
2,013
4,904
106
Have there been changes in the silicon process between 12th gen and 13/14th gens? I'm not familiar with Intel's silicon processes beyond lame meme pics...
 

Hitman928

Diamond Member
Apr 15, 2012
6,051
10,381
136
Have there been changes in the silicon process between 12th gen and 13/14th gens? I'm not familiar with Intel's silicon processes beyond lame meme pics...

They are all on the Intel 7 process, but there may have been some minor tweaks between 12th and 13th gen.
 

Futuremotion

Junior Member
Jun 23, 2024
23
23
41
What’s your motherboard? If it’s an Asus motherboard make sure to run BIOS 2301 or newer and select the “performance” profile, not the “extreme” profile.

GIGABYTE Z790 AORUS PRO X WIFI7 LGA 1700 Intel Z790 X. Any recommendations for Gigabyte boards?
 

H433x0n

Golden Member
Mar 15, 2023
1,166
1,510
96
GIGABYTE Z790 AORUS PRO X WIFI7 LGA 1700 Intel Z790 X. Any recommendations for Gigabyte boards?
Try their latest BIOS with the Intel baseline profile. If it's a really crappy implementation, then set the following manually:

PL1=PL2=241W
IccMax=307A
IA CEP= enabled
SA CEP = enabled

Bonus points:

AC/DC load line = 1.0/1.0 mΩ

The most important settings are IccMax=307A and CEP (Current Excursion Protection) being enabled. If it were my system, I would limit it to 5.5 GHz on P-cores and 4.3 GHz on E-cores until Intel makes a public statement.

If you want to be super safe, then just input the same clocks as the 12900KS as a temporary measure, which is 5.3 GHz on P-cores and 4.0 GHz on E-cores.
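
For anyone who wants to sanity-check their own numbers against that list, here is a minimal sketch in Python. The limits are the ones given in this post; the dictionary keys (`pl1_w`, `iccmax_a`, and so on) are made-up names for illustration and do not correspond to any real BIOS or vendor tool, so you would have to type in the values you read off your BIOS screen yourself. It also assumes that power, current and clock values below the listed ones are acceptable, which the post does not state explicitly.

```python
# Limits taken from the post above. The key names are hypothetical labels
# invented for this sketch; they are not real BIOS or tool identifiers.
RECOMMENDED = {
    "pl1_w": 241,           # PL1 in watts
    "pl2_w": 241,           # PL2 in watts
    "iccmax_a": 307,        # IccMax in amps
    "ia_cep": True,         # IA CEP enabled
    "sa_cep": True,         # SA CEP enabled
    "p_core_max_ghz": 5.5,  # suggested temporary P-core clock cap
    "e_core_max_ghz": 4.3,  # suggested temporary E-core clock cap
}

# Assumption: for these settings the value is an upper bound,
# so anything at or below it passes the check.
UPPER_LIMITS = {"pl1_w", "pl2_w", "iccmax_a", "p_core_max_ghz", "e_core_max_ghz"}


def check_settings(current):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for name, wanted in RECOMMENDED.items():
        if name not in current:
            problems.append(f"{name}: not set (expected {wanted})")
            continue
        value = current[name]
        if name in UPPER_LIMITS:
            if value > wanted:
                problems.append(f"{name}: {value} exceeds {wanted}")
        elif value != wanted:
            problems.append(f"{name}: {value} should be {wanted}")
    return problems


if __name__ == "__main__":
    # Example values as someone might copy them from a BIOS screen (made up).
    my_bios = dict(RECOMMENDED, pl2_w=253, iccmax_a=400)
    for problem in check_settings(my_bios):
        print(problem)
```

This only compares numbers you enter by hand; it does not read anything from the hardware.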
 
Reactions: Grazick

coercitiv

Diamond Member
Jan 24, 2014
6,598
13,939
136
GIGABYTE Z790 AORUS PRO X WIFI7 LGA 1700 Intel Z790 X. Any recommendations for Gigabyte boards?
My first recommendation would be to create a separate thread in the CPU forum about your situation. I know it feels like your decision is directly linked to this topic, but you're not going to get better advice here while people are posting a myriad of other things related to 13th/14th gen CPUs.

My second suggestion is to seriously consider returning the parts if they are still in transit or have not been opened yet. However, if your system is already built, consider returning only the 14900KF and buying a 12900K or 12900KS instead. I know it sounds like a considerable downgrade but if the system instability sets in, you would end up downclocking your CPU anyway. (and maybe more)

The other option is to go with the new build as planned, and RMA only the CPU if you ever start having issues. For this you need to have a backup system or a backup CPU at the very least.
 

DrMrLordX

Lifer
Apr 27, 2000
22,000
11,562
136
It is technically possible that Intel was hit by both things at once: motherboard vendors' aggressive settings made factory-overclocked processors degrade faster than they otherwise would have, but the degradation rate was above zero anyway, so it was going to happen in a year or two regardless.

What's worse is that these crashes tend to get worse over time, and as mentioned by many above, they can still occur even at clockspeeds of 5.3 GHz. People who go "conservative" and dial PL2 back to 180W or lower may still suffer degradation over time. There doesn't seem to be any "safe" way to run these things. It's just a crapshoot.

1.13G Coppermine was just canned in a more timely manner.
Ah, memories. When Tom's Hardware teamed up with Anand and Kyle Bennett to figure out why Linux compilation benchmarks were failing repeatedly on the 1.13 GHz P3:

 
Jul 27, 2020
19,613
13,479
146
My second suggestion is to seriously consider returning the parts if they are still in transit or have not been opened yet. However, if your system is already built, consider returning the 14900KF and buying a 12900K or 12900KS instead. I know it sounds like a considerable downgrade but if the system instability sets in, you would end up downclocking your CPU anyway. (and maybe more)
My counter suggestion would be going with 13900KS. That way he doesn't lose a lot of performance and still gets a decently premium CPU for his workloads (seen reports of people punishing the 13900KS and the CPU holding its own against the abuse). Of course, the best course of action would be to return the entire thing, wait about 3 weeks and get a 9950X for absolute Intel domination in almost all workloads.
 

moinmoin

Diamond Member
Jun 1, 2017
5,063
8,025
136
I mean, it's not 100% sure they won't find a fix.
And even if Intel finds a fix, they may actually be better off selling it as a clean "15th gen" or some such. After all, without a fix, all existing 13th and 14th gen chips may have been exposed to broken usage for too long not to be degraded in some way that will affect their usage, if not right now then at some point down the road. We already know that the resulting degradation is unfixable.
 

KompuKare

Golden Member
Jul 28, 2009
1,164
1,426
136
And you're willing to bet other people's money to find out. Makes sense.
Ha! I similarly always support those willing to be beta testers!
Or almost always: the beta testers for higher prices - and I'm mostly thinking of Titan buyers of yore but buying an i9 KS is similar - those I do not support!

Back on topic, while millions of CPUs may be affected, I am more interested in that not being 100% of what Intel shipped.

Tons of variables - and really, Intel must have a very good idea internally by now, otherwise they are in the wrong business - but one speculation I haven't seen yet is about different Intel sites?

I assume they have multiple foundry sites doing Intel 7 by now, and while their modus operandi must be to copy the whole process, maybe one site is responsible for the bad chips?

I'm sure Intel internally knows by now if there is any pattern in terms of manufacturing weeks, sites, packaging sites, etc.
 
Jul 27, 2020
19,613
13,479
146
I assume they have multiple foundry sites doing Intel 7 by now, and while their modus operandi must be to copy the whole process, maybe one site is responsible for the bad chips?

I'm sure Intel internally knows by now if there is any pattern in terms of manufacturing weeks, sites, packaging sites, etc.
Good speculation. And now I'm forced to think that the 13900KS was made at some top secret underground site where gremlins can't infiltrate and sneeze on the chips.
 
Reactions: Thibsie