Intel processors crashing Unreal engine games (and others)

Jul 27, 2020
19,186
13,142
146
Which setting do you mean by "unlimited"?
You wouldn't be running anything with that setting, other than booting Windows and opening Intel XTU. It's just to see what settings your mobo is using for the Unlimited profile and what are the "optimized" settings decided by Intel's Speed Optimizer. Then just try to run TFD with the optimized OC settings and see what happens.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,277
23,623
146
On The Full Nerd podcast, Wendell discusses how many of the servers were running 5600 MT/s, often with a single stick, which he says Intel officially states should work. He also says that either dropping the multiplier or downclocking to 5200 MT/s would make the errors go away.

He also says that Linux would just report that the core is malfunctioning and is being turned off, and then it would limp along. But sometimes it'll kernel panic, and sometimes it's so bad the BIOS resets. The Asus W680 deals with it by dropping the speed to 3600 MT/s, and since it's in a data center they have to go somewhere else to handle the problem that board has, which is frustrating. This goes back to January, and he did not have that access, so he'd sometimes have to wait days for others to deal with it. Asus, ASRock, and Supermicro all handle things differently, but it's a coin toss which board will have failures with Raptor. Within his 168-hour window, half would error, out of a sample size of 2800.

The host says his first CPU had a bad IMC; the replacement was the one that gave him random crashes in games, including the EAC ban in Fortnite. He'd get black-screen crashes with nothing in the Fortnite log. It was also a permaban which, had he not known people there, would have been the end of his account. Wendell said another dev he shared data with told him they would be rolling back a bunch of bans on Raptor users. The host said even with an x52 multiplier the system was janky, with stuff like WHEA errors.

Wendell said with games it would often point to memory issues, but in Linux he wrote a program that ran in cache and never touched memory and he was still seeing huge failure rates.

His conclusion was that turning off E-cores rarely helped, and the solution was limiting the multiplier and running memory speed "dog slow". At x53 and DDR4-level speeds, the 12900K with 6400 MT/s should ROFLstomp Raptor. That damn certainly isn't what gamers paid for.
 
Jul 27, 2020
19,186
13,142
146
but I think that XTU's Watchdog hard cold boot crash "reset" - that sometimes even triggers BIOS errors/resets - may also trigger the (likely PCIe) instability.
I get that Watchdog error too sometimes, especially when I'm trying some crazy memory overclock. But it hasn't resulted in any serious problem.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,277
23,623
146
I almost forgot the part where Wendell alluded to the potential "bombshell" that he won't say officially on the record. "If 2 DIMMs is the source of the problem, even at like 4400 MT/s, that's unsettling."

The hell with it, I'll keep going, as I know at least one member is deaf and can't listen to the podcast.

After getting rid of the red herring, which was Samsung NVMe issues, and with no GPU involved, they were still getting PCIe errors, and PCIe is a completely different clock domain. So even at 4.8 GHz and 3600 MT/s they were still getting random PCIe errors; "this CPU is broken."
 

TheELF

Diamond Member
Dec 22, 2012
4,026
753
126
He was under clocking though. He didn’t set an all core frequency, he lowered the max boost frequency to 5.3 GHz to try and stabilize the processors.

Edit: the boost algorithm is still on, he just limits it from exceeding 5.3 GHz because going past that caused instability on a large percentage of the tested systems.
On the TechPowerUp review the highest overclock they managed was 5.5 GHz all core... so 5.3 all core is pretty up there, and that's the "fix", let's not forget.
Although there is no official turbo table from Intel, so we are all just guessing anyway.
Our maximum all-core OC is 5.5 GHz on the P-Cores, plus 4.4 GHz on the E-Cores, 100% stable. This still isn't enough to beat the stock configuration in lighter applications and most games, because here the CPU will boost two cores up to 6.0 GHz.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,432
5,397
136
@igor_kavinski
After the Clear CMOS I played through the tutorial once with Intel 307 A specs and once with my usual UV + OC (resetting all game data), both worked without a crash.

The reason I went straight to the Clear CMOS is that I have seen this kind of BSOD before and it seems to be PCIe-connected somehow. I don't fully trust the mainboard anymore, and the whole thing may have killed my PCIe Creative X-Fi. When the defective X-Fi was inserted I could reproduce problems, but I think that XTU's Watchdog hard cold boot crash "reset" - that sometimes even triggers BIOS errors/resets - may also trigger the (likely PCIe) instability. So the Clear CMOS may have fixed it for now, with no connection to my CPU or memory UV/OC at all.
That PCIe error may actually be CPU-related...
 

coercitiv

Diamond Member
Jan 24, 2014
6,576
13,833
136
Since we like throwing things at the wall to see if they stick, I'd love to see a breakdown of failures between systems with DDR4 vs. DDR5. We could probably get this for ADL, but with RPL it did not make much sense.
 

Hitman928

Diamond Member
Apr 15, 2012
5,989
10,249
136
On the TechPowerUp review the highest overclock they managed was 5.5 GHz all core... so 5.3 all core is pretty up there, and that's the "fix", let's not forget.
Although there is no official turbo table from Intel, so we are all just guessing anyway.

I’m not sure why you keep going back to this all core at 5.3 GHz claim. That is not what is happening. Wendell did not increase the all core clocks at all, he just limited the max frequency any core can achieve to 5.3 GHz, that’s it. The only under clocking happens during lightly threaded loads. For all core or heavily threaded loads, the clocks will be lower than this according to Intel’s default boosting behavior.
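
To make the difference between "cap the max boost" and "set an all core clock" concrete, here is a minimal Python sketch. The boost table is made up purely for illustration (there is no official turbo table from Intel, as noted in the quote above); only the 5.3 GHz cap comes from the discussion. The function names are hypothetical.

```python
# Hypothetical stock boost clocks in GHz, keyed by the largest number of
# active P-cores each entry covers. Illustrative values only, NOT Intel's
# real turbo ratios.
HYPOTHETICAL_BOOST_GHZ = {2: 6.0, 4: 5.6, 8: 5.1}


def default_boost(active_cores: int) -> float:
    """Return the hypothetical stock boost clock for this many active P-cores."""
    # Pick the first bucket that is large enough for the active core count.
    for max_cores in sorted(HYPOTHETICAL_BOOST_GHZ):
        if active_cores <= max_cores:
            return HYPOTHETICAL_BOOST_GHZ[max_cores]
    raise ValueError("active_cores exceeds the hypothetical table")


def capped_boost(active_cores: int, cap_ghz: float = 5.3) -> float:
    """Boost algorithm stays on; no core may exceed cap_ghz."""
    return min(default_boost(active_cores), cap_ghz)


def fixed_all_core(active_cores: int, clock_ghz: float = 5.3) -> float:
    """A true all core overclock: every load runs at clock_ghz."""
    return clock_ghz


if __name__ == "__main__":
    for n in (1, 4, 8):
        print(n, default_boost(n), capped_boost(n), fixed_all_core(n))
```

With these assumed numbers the cap only bites on lightly threaded loads; the heavily threaded case keeps whatever the stock boost behavior gives, which is below 5.3 GHz, while a real all core 5.3 GHz setting would raise it.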
 

GTracing

Member
Aug 6, 2021
73
187
76
Since we like throwing things at the wall to see if they stick, I'd love to see a breakdown of failures between systems with DDR4 vs. DDR5. We could probably get this for ADL, but with RPL it did not make much sense.
If we're speculating, my crackpot theory is that there was a fundamental error early in the design process (either for Raptor Lake or for their latest Intel 7 node revision), leading to numerous physical implementation errors spread across the whole chip.

It would explain why it took so long to find, and why so many different settings help to mitigate the crashing, and why Alder Lake and Meteor Lake aren't affected.
 
Jul 27, 2020
19,186
13,142
146
Since we like throwing things at the wall to see if they stick, I'd love to see a breakdown of failures between systems with DDR4 vs. DDR5.
That's a great idea. What if a CPU that fails in a DDR5 mobo works flawlessly in a DDR4 mobo? That would make pinpointing the cause a lot easier!
 
Jul 27, 2020
19,186
13,142
146
@coercitiv

Here's a crazy idea: https://www.kitguru.net/components/cpu/joao-silva/intel-has-a-core-i9-14900ks-on-the-way/

Remember how everyone thought that company was nuts for pairing the 14900KS with DDR4-3200?

And it LACKS a GPU meaning no PCIe traffic???

That company is Israeli. Raptor Cove is an IDF design.



WHAT IF??????

Intel insiders knew about these issues all along and went ahead anyway because they figured, it's gonna impact only maybe 50% of users??? They decided to take a gamble on that coz by the time the problem was expected to rear its ugly head, they would have shipped enough Raptor Lake SKUs into the worldwide markets to make some tidy profits! And by the time they got around to actually providing a real fix, Arrow Lake and Lunar Lake would be out to save their asses!

YOU HEARD THIS THEORY HERE FIRST!
 
Jul 27, 2020
19,186
13,142
146
It would explain why it took so long to find, and why so many different settings help to mitigate the crashing, and why Alder Lake and Meteor Lake aren't affected.
Well, as DAPUNISHER wondered in a previous post (can't remember which page), MTL-S WAS cancelled. Maybe it's all related...
 

H433x0n

Golden Member
Mar 15, 2023
1,156
1,499
96
@coercitiv

Here's a crazy idea: https://www.kitguru.net/components/cpu/joao-silva/intel-has-a-core-i9-14900ks-on-the-way/

Remember how everyone thought that company was nuts for pairing the 14900KS with DDR4-3200?

And it LACKS a GPU meaning no PCIe traffic???

That company is Israeli. Raptor Cove is an IDF design.



WHAT IF??????

Intel insiders knew about these issues all along and went ahead anyway because they figured, it's gonna impact only maybe 50% of users??? They decided to take a gamble on that coz by the time the problem was expected to rear its ugly head, they would have shipped enough Raptor Lake SKUs into the worldwide markets to make some tidy profits! And by the time they got around to actually providing a real fix, Arrow Lake and Lunar Lake would be out to save their asses!

YOU HEARD THIS THEORY HERE FIRST!
No, you never knowingly release a product that has a 50% failure rate. There is no profit in a product that fails that often, that type of product costs you money.

You might consider releasing a product with a 1-3% failure rate but a product that fails 50% of the time would never see the light of day.
 
Jul 27, 2020
19,186
13,142
146
You might consider releasing a product with a 1-3% failure rate but a product that fails 50% of the time would never see the light of day.
Not even to keep up appearances? What else could they do? Keep selling Alder Lake? Or promote their workstation CPUs with AVX-512 and only P-cores to lowly Core i9 status and lose thousands per CPU?
 

H433x0n

Golden Member
Mar 15, 2023
1,156
1,499
96
Not even to keep up appearances? What else could they do? Keep selling Alder Lake? Or promote their workstation CPUs with AVX-512 and only P-cores to lowly Core i9 status and lose thousands per CPU?
They would have just refreshed Alder Lake. They've done this type of thing in the past (see Meteor Lake being canned in favor of RPL refresh).

There's zero shot that they released Raptor Lake knowing this would be the result. Heck, they just refreshed it less than a year ago. Intel knows that part of what keeps them afloat was that they had a reputation of being reliable. The "just works" factor is what has kept me purchasing Intel throughout the years. They wouldn't knowingly jeopardize that.
 

Sgraffite

Member
Jul 4, 2001
101
37
101
No, you never knowingly release a product that has a 50% failure rate. There is no profit in a product that fails that often, that type of product costs you money.

You might consider releasing a product with a 1-3% failure rate but a product that fails 50% of the time would never see the light of day.
I'm not so sure anymore. My house came with a Samsung refrigerator where they purposely put the ice maker in the very top of the fridge section.
 
Jul 27, 2020
19,186
13,142
146
They would have just refreshed Alder Lake. They've done this type of thing in the past (see Meteor Lake being canned in favor of RPL refresh).
MTL-S got canned coz they got more performance out of RPL-R. Doubt they could've gotten more out of ADL refresh unless they ported it to Intel 3. They failed miserably with Rocket Lake in their pursuit to win benchmarks. IDF saved them with Alder Lake. Then they tried to coast along on that design by souping it up with more cache and higher clocks. That failed miserably again as we can now see. It seems Intel's desperate attempts at winning benchmarks are consistently blowing up in their faces. My advice to them would be to stop trying so hard and just do what they hired people to do in the first place. Design decent CPUs, instead of cooking up shortcuts to win at any cost.
 

zir_blazer

Golden Member
Jun 6, 2013
1,191
483
136
On The Full Nerd podcast, Wendell discusses how many of the servers were running 5600 MT/s, often with a single stick, which he says Intel officially states should work. He also says that either dropping the multiplier or downclocking to 5200 MT/s would make the errors go away.
Raptor Lake officially ONLY supports DDR5-5600 on 1 SPC (slot per channel) boards. 2 SPC boards with 1 DPC (DIMM per channel) populated take it down to 4400, just like Alder Lake, going as low as 3600 with 2 DPC populated. If they're running at 5600, they're out of spec. And I think all the non-ITX W680 boards I remember are 2 SPC, so anything above 4400 is likely to be out of spec.
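
A rough sketch of that rule as a lookup, in Python. The speeds are the ones stated in this post and have not been checked against Intel's datasheet; the function names are hypothetical, and 3600 is used for 2 DPC because the post gives it as the lowest case.

```python
# Officially supported DDR5 speed in MT/s, as described in the post above.
# Key: (slots per channel on the board, DIMMs per channel populated).
# Values are taken from the post, not verified against Intel's datasheet.
SUPPORTED_DDR5_MTS = {
    (1, 1): 5600,
    (2, 1): 4400,
    (2, 2): 3600,  # "as low as 3600": the worst case mentioned in the post
}


def max_supported_speed(spc: int, dpc: int) -> int:
    """Return the supported speed for a board/population combination."""
    try:
        return SUPPORTED_DDR5_MTS[(spc, dpc)]
    except KeyError:
        raise ValueError(f"no entry for {spc} SPC with {dpc} DPC") from None


def is_in_spec(spc: int, dpc: int, running_mts: int) -> bool:
    """True if the memory is running at or below the supported speed."""
    return running_mts <= max_supported_speed(spc, dpc)


if __name__ == "__main__":
    # A 2 SPC board with one stick per channel running at 5600 MT/s.
    print(is_in_spec(2, 1, 5600))
```

By this table, a single stick per channel at 5600 on a 2 SPC board would count as out of spec, which is the point being made about the W680 boards.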
 

H433x0n

Golden Member
Mar 15, 2023
1,156
1,499
96
MTL-S got canned coz they got more performance out of RPL-R. Doubt they could've gotten more out of ADL refresh unless they ported it to Intel 3.
They didn't get any more performance out of Raptor Lake refresh either. They could've easily just done ADL-S -> ADL-R -> RPL-S instead of ADL-S -> RPL-S -> RPL-R.
 