Yotsugi
No. We may/probably will see it at some point though.
One day you'll manage to put even fewer words into a response.
"Put a fork in it, SMT4 is dead. It was never a thing. There was never any evidence to suggest it."
I don't care about SMT4 itself, I want a wide core with high IPC. As a customer, I demand good products. And I cannot be happy that Apple's mobile A11 CPU from 2017 is much stronger than Zen 3 will be in 2020. That's a shame for x86.
"AMD's design of Zen 3 is ready."
Better, they even taped out some stuff.
"How are people still parroting this meme?"
Why is this a meme? I agree that Apple cores are different, they sacrifice considerable density to achieve what they do, etc., but they are considerably wider and considerably faster in most general-purpose operations, even when you disregard full-stack optimizations.
So what are we waiting for?
What if Zen 3 has an interposer? How would that change the number of links?
"A11 isn't really stronger than Zen 2."
The A12 has 158% of Skylake's IPC in SPECint. The A11 is slower, but not by much, because it also has 6 ALUs. It is a nice example that a 6-ALU core needs a few evolutionary steps to reach its maximum performance (picking the low-hanging fruit first).
"We might see more gains. 40% is not too much to hope for."
Good point. If Zen 2 with SMT2 can gain +20% more performance, this means average ALU loading is 80%. With a 6-ALU core you have a base of 150%, so theoretically there might be a +70% gain (relative to Zen 2). However, according to Zen 3 ST it would be around 40%. That's a massive gain.
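The back-of-envelope numbers in that post can be restated in a few lines of Python. This is only a sketch of the post's own arithmetic, not a performance model. It assumes a 4-ALU Zen 2 core as the 100% baseline and takes the +20% SMT2 gain as given. The function name `alu_headroom` is made up for illustration.

```python
def alu_headroom(base_alus, wide_alus, smt_gain_pct):
    """Restate the post's estimate in whole percent of the base core's ALU capacity.

    Returns (single-thread ALU loading, wide-core ALU capacity, headroom).
    """
    # The post reads a +20% SMT2 gain as "one thread keeps the ALUs 80% busy".
    st_loading_pct = 100 - smt_gain_pct
    # A 6-ALU core has 150% of the ALU capacity of a 4-ALU core.
    capacity_pct = 100 * wide_alus // base_alus
    # The post's "+70%" is the gap between those two figures, in percentage points.
    return st_loading_pct, capacity_pct, capacity_pct - st_loading_pct


loading, capacity, headroom = alu_headroom(base_alus=4, wide_alus=6, smt_gain_pct=20)
print(f"ST loading {loading}%, wide-core capacity {capacity}%, headroom {headroom} points")
```

The 70 is in percentage points of Zen 2's ALU capacity. It is an upper bound on what extra ALUs could feed, and the roughly 40% single-thread figure mentioned in the post sits well below it.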
It should have no effect. The interconnect between L3 slices is implemented on the CPU chiplet.
An interposer should allow more complex, wider, faster and more efficient interconnect between chiplets, though, since a silicon interposer allows much finer metal layers and much lower energy-per-bit.
New AMD Patent Application
Prefetch data from RAM into L3 to reduce latency. With those big L3s this could mean something.
20190294546 - PREFETCHER BASED SPECULATIVE DYNAMIC RANDOM-ACCESS MEMORY READ REQUEST TECHNIQUE
A method includes monitoring a request rate of speculative memory read requests from a penultimate-level cache to a main memory. The speculative memory read requests correspond to data read requests that missed in the penultimate-level cache. A hit rate of searches of a last-level cache for data requested by the data read requests is monitored. Core demand speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding core demand data read request based on the request rate and the hit rate. Prefetch speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding prefetch data read request based on the request rate and the hit rate.
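To make the abstract a bit more concrete, here is a rough Python sketch of the kind of gating it describes. It tracks the speculative request rate and the last-level cache hit rate, then decides separately whether demand and prefetch reads may go to DRAM in parallel with the L3 lookup. The abstract gives no thresholds or policy, so everything specific here is an assumption: the class names, the threshold values, and the rule "speculate only while both rates are at or below a limit".

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    max_request_rate: float  # hypothetical: speculative reads per cycle tolerated
    max_hit_rate: float      # hypothetical: LLC hit rate above which speculation is wasteful


class SpeculativeReadGate:
    """Toy model: decide whether to start a DRAM read alongside the LLC lookup."""

    def __init__(self, demand: Thresholds, prefetch: Thresholds):
        self.demand = demand
        self.prefetch = prefetch
        self.request_rate = 0.0
        self.hit_rate = 0.0

    def update(self, cycles: int, spec_reads: int, llc_lookups: int, llc_hits: int) -> None:
        # Refresh both monitored rates from the counters of the last sampling window.
        self.request_rate = spec_reads / cycles if cycles else 0.0
        self.hit_rate = llc_hits / llc_lookups if llc_lookups else 0.0

    def _allowed(self, limits: Thresholds) -> bool:
        # Assumed policy: speculate only if DRAM is not already flooded with
        # speculative reads and the LLC misses often enough to make them useful.
        return (self.request_rate <= limits.max_request_rate
                and self.hit_rate <= limits.max_hit_rate)

    def demand_enabled(self) -> bool:
        return self._allowed(self.demand)

    def prefetch_enabled(self) -> bool:
        return self._allowed(self.prefetch)


# Prefetches get stricter (made-up) limits than core demand reads.
gate = SpeculativeReadGate(demand=Thresholds(0.5, 0.8), prefetch=Thresholds(0.25, 0.5))
gate.update(cycles=1000, spec_reads=200, llc_lookups=400, llc_hits=240)
print(gate.demand_enabled(), gate.prefetch_enabled())
```

In this window the L3 hit rate is 60%. That is low enough for demand reads to keep speculating, but too high for prefetches under the made-up limits.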
"A12 has 158% of Skylake IPC in SPECint. A11 is slower but not much because it has 6xALUs too. It is nice example that 6xALU core needs some evolution steps to get max performance (pick the lowest fruits)."
Ah, but at what clock frequency does power consumption jump through the roof due to the uArch's mobile optimisations?
"Of course, an active interposer could move some logic off the compute die and into the interposer, opening up all sorts of options for topology."
I thought the whole point of the interposer was interconnect. Surely integrating the IO would be the better use case then?
"Ah, but at what clock freq does power consumption jump through the roof due to uArch mobile optimisations?"
iOS is OSX, with a touch interface. Should be pretty easy to compare.
And we already know the intrinsic vector length limits of NEON SIMD are below that of AMD, let alone Intel with AVX512.
Of course this will change in the future with SVE2, but that is then, this is now.
There still seems to be a gulf between benchmarking the 2 platforms that respects all possible performance avenues, and vector/SIMD length is a big one in certain use cases.
"iOS is OSX, with a touch interface. Should be pretty easy to compare."
Should be, and yet we still have these strangely limited benchmarks that miss a crucial area of modern CPU performance in SIMD execution.
"What 'stuff'? Zen3 CCD? Zen3 IOD? Please elaborate, TIA."
Frankly, at this point they probably have full-on engineering samples back in the labs. They taped out a while ago.
"Should be, and yet we still have these strangely limited benchmarks that miss a crucial area of modern CPU performance in the SIMD execution."
Geekbench has had AVX512 for ages, why not use it?
Dunno how we would go about comparing them though - perhaps dav1d would suffice to at least test an AVX2 cpu vs a NEON cpu, but dav1d lacks AVX512 code at present to compare further.
"Geekbench has had AVX512 for ages, why not use it?"
Nowhere near ubiquitous enough.
"Apple SoC at such a low clockspeed that nobody's really going to care about that comparison anyway"
Don't be influenced by desktop CPUs. The 64-core server Epyc 7742 has a base frequency of 2.25 GHz (and may boost to 2.5 GHz within TDP). Apple's A13 runs at 2.66 GHz, so for servers the frequency is roughly identical, yet performance is around 50% higher for the fruit machine A13. Power consumption for the A13 is around 4 W. Subtract the GPU and the 5 other idling/sleeping cores and it can be 3.5 W per core, so 3.5 W x 64 cores = 224 W (Epyc has a TDP of 225 W). Pretty comparable consumption with a massive +50% performance gain.
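The power estimate in that post is a straight multiplication, sketched below in Python. The 3.5 W per core and the 225 W TDP are the post's figures. The function name `scaled_core_power` is made up for illustration. The linear scaling is the post's assumption and leaves out whatever a 64-core package spends on fabric, memory controllers and IO.

```python
def scaled_core_power(per_core_watts, cores):
    """Scale one core's power linearly to a many-core part (the post's assumption)."""
    return per_core_watts * cores


A13_CORE_WATTS = 3.5   # post's estimate for one big A13 core, GPU and idle cores subtracted
EPYC_7742_TDP = 225    # watts, as quoted in the post

estimate = scaled_core_power(A13_CORE_WATTS, 64)
print(f"64 x {A13_CORE_WATTS} W = {estimate} W vs {EPYC_7742_TDP} W TDP")
```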