igor_kavinski
We do have a member here that reported their 1700 CPU is so warped it no longer POSTs. The degradation and crashes are starting to look like a sandwich made from layers of fail. Maybe the socket latch is the rancid meat.
Oh no, "Intel 7 Ultra" transistors getting squished
That is a reasonable hypothesis. It is just part of the crap sandwich.
I know a couple of people who have done multiple RMAs and have used a contact frame for the entire life of all of them. It could be orthogonally related or cause tangential issues, but it doesn't seem to me to be causal.
If true, this adds yet ANOTHER complication to the situation... Jesus, no wonder Intel is struggling to get to the bottom of this.
Half the W680 boards were from ASUS. So failures are not unique to Supermicro boards.
I suspect the Supermicro W680 boards, personally, for the "data center" game servers being unreliable.
Because I have one such board, running with two different 12600Ks at DDR5-3600 ECC. I've had issues to the point that I will submit an RMA for the board if it shuts off randomly again. I'm pretty sure the CPU isn't the problem because, like I said, I tried two different 12600Ks. And neither showed issues in a cheap ASRock Z690 DDR4 board, but they both shut down randomly when in the Supermicro W680 board.
Just my experience.
And I don't think it is related to the client crashes - where it does seem to be CPU issues.
What about the OS?
Roughly 1 out of every 150 CPUs is dead within 100 hours or so. The rate for "marginal" CPUs is even higher than this. The laptops having issues could just be the normal failure rate.
The doc speculating on why the socket latch could be part of the problem -
"I'm talking about torsional twist exacerbating the issue - i.e. finding the root cause, not the symptom."
How does that explain laptops with the issues?
On a humorous note/pure satire: The doc is obviously guerrilla marketing for Roman and going to get a nice kickback on all of the contact frames he is about to sell.
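For anyone wanting to sanity-check the "1 out of every 150" figure against a handful of failure reports, here is a minimal sketch. The only number taken from the thread is the 1-in-150 early-failure rate; the function names are made up for illustration, and treating failures as independent is an assumption.

```python
# Rough odds implied by a "1 out of every 150" early-failure rate.
# Assumption: each CPU fails independently at the same rate.
EARLY_FAILURE_RATE = 1 / 150


def expected_failures(n_cpus, rate=EARLY_FAILURE_RATE):
    """Average number of early failures expected among n_cpus units."""
    return n_cpus * rate


def prob_at_least_one_failure(n_cpus, rate=EARLY_FAILURE_RATE):
    """Chance that at least one of n_cpus units fails early."""
    return 1 - (1 - rate) ** n_cpus


if __name__ == "__main__":
    for n in (1, 10, 150, 1000):
        print(
            f"{n:>5} CPUs: expected failures = {expected_failures(n):.2f}, "
            f"P(at least one) = {prob_at_least_one_failure(n):.1%}"
        )
```

With 150 units the chance of seeing at least one early failure is already around 63%, so a few dead laptops on their own would not say much either way.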
I actually faced thermal issues with my ASRock mobo until I clamped the heatsink down hard on the CPU socket. Did you screw it down till it felt tight enough, or did you go all the way and turn the screws till your screwdriver couldn't budge the screw anymore and slipped off?
And neither showed issues in a cheap ASRock Z690 DDR4 board
LMAO - I actually chuckled because 1) it called out ASUS on some "feature" only their boards seem to have, probably something else with a custom name, and 2) setting it does what? And it's not safe? What?
Intel's been driving in the wrong direction since they refused the iPhone.
Pat Gelsinger talking about AMD being in the rear-view mirror: "are you sure it doesn't mean (Intel) is driving in the wrong direction?"
To be fair, everyone is now margin-obsessed. The other x86 vendor seems quite blasé about ignoring huge markets because margins would be lower, and seems to have no strategy to offer a budget line using older nodes/Samsung etc.
Yes, this seems to be the likeliest cause in my view. Buildzoid themselves summarized it as follows:
Buildzoid chimes in.
The AC loadline setting isn't being enforced (CEP) and vdroop isn't reported correctly, resulting in undervolting so low that it makes the CPU unstable. While not the cause, it does contribute.
He is speculating that the ring is killing itself when the CPU tries to crank up the voltage for 6 GHz on the 2 P-cores. The i9s are failing sooner because they are running higher voltages, but the i7s may just be damaging themselves more slowly and are still degrading.
The Warframe dev's pie chart shows the main determinant by CPU model is voltage, with the only outlier being the KS models, which have a different customer target.
Which of the UE games known to crash do you put a fair number of hours into?
Maybe CPU bins affect which CPUs are more or less affected by degradation and/or instability?! I always called my own 13900K "good enough", and in the full range of 13th/14th gen it is rather average, right in the middle of the VID spread published by Igor's Lab. Maybe that means my CPU asks for neither too much voltage (degradation) nor too little (instability) at stock settings and thus stays healthy and stable?!
I have been using my 13900K for over 1.5 years now and have sent it through various stress tests within mostly sane limits (but I know that it hit over 415 A at least a few times).
We won't know until we understand the nature of the failures. We don't know if it's voltage, current, temps, mechanical, factory of origin or a combination of all these factors. Depending on the root cause and possible secondary causes that may accelerate the degradation, a CPU being stable for more than 1 year is no indication that everything will remain fine in the future. (This applies to my 12700K too.)