Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

The Hardcard · Aug 1, 2024

poke01 said:
Strix Halo points to the future of Windows laptops. It will likely be expensive and it will be a niche product, but it’s the most innovative product out of all the client Zen 5 products.

The few issues I have with it are that the CCD is not N3E and it’s using RDNA 3.5 and not 4. But it builds a base, and I hope it’s successful, because if it is we won’t have to deal with crap like VRAM limitations on a laptop ever again, thanks to the unified memory. And who doesn’t like big APUs?

Did you see somewhere that it would have unified memory? I am extremely curious about that, but haven’t picked up on anything.

gdansk · Aug 1, 2024

The Hardcard said:
Did you see somewhere that it would have unified memory? I am extremely curious about that, but haven’t picked up on anything.

AMD APUs have unified memory.
CPU and GPU and NPU may share a unified per-process virtual address space...

adroc_thurston · Aug 1, 2024

The Hardcard said:
Did you see somewhere that it would have unified memory?

They've had it since Kaveri.

The Hardcard · Aug 1, 2024

adroc_thurston said:
They've had it since Kaveri.

It didn’t make it to Ryzen until the MI300A for the datacenter. None of the consumer Ryzen APUs have unified memory.

adroc_thurston · Aug 1, 2024

The Hardcard said:
It didn’t make it to Ryzen until the MI300A for the datacenter.

?

The Hardcard said:
None of the consumer Ryzen APUs have unified memory.

Yeah they do. Since Kaveri.

blackangus · Aug 1, 2024

misuspita said:
Oh no, I'm waiting to see it in something the size of this, or probably a bit bigger and throw my money on it

Yeah a Strix Halo in that form factor would be a buy for me as long as the price isn't egregious.

The Hardcard · Aug 1, 2024

gdansk said:
AMD APUs have unified memory.
CPU and GPU and NPU may share a unified per-process virtual address space...

Where have you seen that? AMD’s hUMA documents end with Carrizo from what I’ve seen. I even bought a Trinity laptop 9 or so years ago, thinking it had unified memory, and I still have it.

I’ve been watching for indications of unified memory since the first Ryzens. I am not seeing anything. Searching now, I still don’t see anything. Any links?

adroc_thurston · Aug 1, 2024

The Hardcard said:
AMD’s hUMA documents end with Carrizo from what I’ve seen

That's because HSA died with Carrizo.
But the hardware works the same since Kaveri.

The Hardcard · Aug 1, 2024

adroc_thurston said:
?

Yeah they do. Since Kaveri.

You got any links stating this? I have Googled my ass off for years and have been looking for it since the first Ryzen launched. The last reference to unified memory I’m finding in consumer APUs is Carrizo.

adroc_thurston · Aug 1, 2024

The Hardcard said:
You got any links stating this?

It's somewhere in Linux driver notes.

The Hardcard said:
The last reference to unified memory I’m finding in consumer APUs is Carrizo.

Of course, HSA died in 2016!
Unified memory is very not relevant outside of thin slice of GPGPU space.

The Hardcard · Aug 1, 2024

adroc_thurston said:
It's somewhere in Linux driver notes.

Of course, HSA died in 2016!
Unified memory is very not relevant outside of thin slice of GPGPU space.

I am still looking, but so far see zero mention of unified memory. To be clear, I am not talking about *shared* memory, as in both the CPU and the GPU separately using the same system RAM. In that scenario, the CPU gives the GPU a chunk of the system RAM that then becomes slow VRAM for the GPU. Once that happens, the CPU units are not allowed to operate on it directly; information is copied back and forth as if it were on a separate physical device, just as with a discrete GPU.

Unified memory is where the memory management units of the CPU and GPU are coherent and thus there’s no need for any copying. The CPU and GPU units can both directly operate on the same memory.

Every reference I have seen to Ryzen APUs indicates shared memory requiring copying to the GPU-allocated memory. Kaveri and Carrizo had coherent CPU/GPU memory management units and “zero copy”: the GPU units and CPU units could directly access the same memory at the same time.

Yes, that was part of HSA and I agree it died with Carrizo. That means unified memory died with Carrizo. Everything I have seen referencing Ryzen APUs indicates that they require copying memory between the CPU allocated system RAM and GPU allocated system RAM. Shared, not unified.
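For anyone following along, the shared vs. unified distinction above can be sketched as a toy model. This is only an analogy in plain Python: a `bytearray` stands in for system RAM, and the function names `gpu_double_shared` and `gpu_double_unified` are made up for illustration. It does not touch any real GPU or driver API.

```python
# Toy model only: a bytearray stands in for system RAM. No real GPU is involved.

def gpu_double_shared(cpu_buf):
    """'Shared' model: the GPU owns a separate carve-out, so data is copied in and out."""
    gpu_carveout = bytearray(cpu_buf)       # copy: CPU region -> GPU region
    for i in range(len(gpu_carveout)):
        gpu_carveout[i] *= 2                # "GPU" work on its private copy
    cpu_buf[:] = gpu_carveout               # copy: GPU region -> CPU region
    return 2 * len(cpu_buf)                 # bytes moved by the two copies

def gpu_double_unified(cpu_buf):
    """'Unified' model: the GPU works on the very same bytes, so nothing is copied."""
    view = memoryview(cpu_buf)              # another handle onto the same memory
    for i in range(len(view)):
        view[i] *= 2                        # "GPU" work directly in place
    view.release()
    return 0                                # no bytes copied

ram = bytearray([1, 2, 3, 4])
copied_shared = gpu_double_shared(ram)      # ram is now [2, 4, 6, 8]
copied_unified = gpu_double_unified(ram)    # ram is now [4, 8, 12, 16]
print(list(ram), copied_shared, copied_unified)  # [4, 8, 12, 16] 8 0
```

Both paths give the same result; the only difference is the copy traffic, which is the whole argument for zero copy on large buffers.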

gdansk · Aug 1, 2024

The Hardcard said:
Where have you seen that? AMD’s hUMA documents end with Carrizo from what I’ve seen. I even bought a Trinity laptop 9 or so years ago, thinking it had unified memory, and I still have it.

I’ve been watching for indications of unified memory since the first Ryzens. I am not seeing anything. Searching now, I still don’t see anything. Any links?

Here's a conversation from an AMD GPU driver developer:

AMD's AOMP 19.0-2 Compiler Brings Zero-Copy For CPU-GPU Unified Shared Memory - Phoronix Forums

Phoronix: AMD's AOMP 19.0-2 Compiler Brings Zero-Copy For CPU-GPU Unified Shared Memory AMD compiler engineers have released AOMP 19.0-2 as the newest version of their downstream LLVM/Clang compiler that carries all of their latest work around OpenMP/AOCC GPU device offloading to Radeon and...

www.phoronix.com

APUs have supported it, but no one runs ROCm on an APU except the big $10000 one. They can all share pointers to system memory across CPU/GPU/NPU. Where it needs more work is sharing VRAM to the CPU. But since Strix etc. don't have VRAM, that isn't an issue.

poke01 · Aug 1, 2024

adroc_thurston said:
Unified memory is very not relevant outside of thin slice of GPGPU space.

consoles, phones?

The Hardcard · Aug 1, 2024

gdansk said:
Here's a conversation from an AMD GPU driver developer:

AMD's AOMP 19.0-2 Compiler Brings Zero-Copy For CPU-GPU Unified Shared Memory - Phoronix Forums

Phoronix: AMD's AOMP 19.0-2 Compiler Brings Zero-Copy For CPU-GPU Unified Shared Memory AMD compiler engineers have released AOMP 19.0-2 as the newest version of their downstream LLVM/Clang compiler that carries all of their latest work around OpenMP/AOCC GPU device offloading to Radeon and...

www.phoronix.com

APUs have supported it, but no one runs ROCm on an APU except the big $10000 one. They can all share pointers to system memory across CPU/GPU/NPU. Where it needs more work is sharing VRAM to the CPU. But since Strix etc. don't have VRAM, that isn't an issue.

I greatly appreciate that link! From just last month, late June 2024. The first mention that Ryzen APUs can do zero copy that I know of since the launch in 2017. And I have been looking for it since then.

Interesting that at least that one AMD Linux driver developer (Bridgeman) also believed the capability was lost with Ryzen. So while the hardware is physically capable, there is no driver support in either Linux or Windows. It is also interesting because it appears that implementing it in MI300A was just a case of writing support into the drivers.

I was a little disappointed that the AMD guy made no mention of plans to include driver support for consumer Ryzens in the near future, though that compiler update gives me a little hope.

Strix Halo will be very interesting as a machine learning workstation if AMD is smart enough to expose unified memory in the drivers and allow that 40 CU GPU access to all 128 GB of RAM. It would have been so much better if they had gone with a 512-bit bus, but I suppose they are already sticking their necks out with 256 bits.

adroc_thurston · Aug 1, 2024

poke01 said:
consoles

That's a very separate ecosystem.

poke01 said:
phones

Phone GPU APIs are even more antiquated than what we have on big boy platforms.

The Hardcard said:
Strix Halo will be very interesting as a machine learning workstation if AMD is smart enough to expose unified memory in the drivers and allow that 40 CU GPU access to all 128 GB of RAM

Yeah that's the selling point.

The Hardcard said:
it would have been so much better if they had gone with a 512-bit bus

That's server levels of memory setups, not feasible.

gdansk · Aug 1, 2024

The Hardcard said:
Strix Halo will be very interesting as a machine learning workstation if AMD is smart enough to expose unified memory in the drivers and allow that 40 CU GPU access to all 128 GB of RAM.

I suspect that's their plan now. ROCm + Halo large memory space could finally give them a niche to get a few sales.

The Hardcard · Aug 1, 2024

adroc_thurston said:
Yeah that's the selling point.

That's server levels of memory setups, not feasible.

It’s feasible, just much more expensive. However, there is a growing community that would gladly pay the higher price, since LLM text generation is severely memory bandwidth constrained. Llama 3 70B 4-bit will get about 8 tokens per second on an enabled Strix Halo; 15 tok/s would have made it a slam dunk.
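A rough back-of-the-envelope check of those token rates, as a sketch only. It assumes LPDDR5X at 8533 MT/s (the thread does not state a memory speed, so that figure is an assumption), and it assumes each generated token has to stream every weight from memory once, so bandwidth divided by model size is an upper bound that ignores KV cache traffic and real-world efficiency. The function names are made up for illustration.

```python
def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s for a given bus width and transfer rate."""
    return bus_bits / 8 * mega_transfers_per_s / 1000

def max_tokens_per_s(bandwidth_gbs, params_billions, bits_per_weight):
    """Upper bound on tokens/s if every weight is read once per generated token."""
    model_gb = params_billions * bits_per_weight / 8
    return bandwidth_gbs / model_gb

ASSUMED_MTS = 8533  # assumption: LPDDR5X-8533, not stated in the thread

bw_256 = peak_bandwidth_gbs(256, ASSUMED_MTS)   # about 273 GB/s
bw_512 = peak_bandwidth_gbs(512, ASSUMED_MTS)   # about 546 GB/s

# Llama 3 70B at 4 bits per weight is roughly 35 GB of weights.
print(f"256-bit: {max_tokens_per_s(bw_256, 70, 4):.1f} tok/s")  # 256-bit: 7.8 tok/s
print(f"512-bit: {max_tokens_per_s(bw_512, 70, 4):.1f} tok/s")  # 512-bit: 15.6 tok/s
```

Under those assumptions the numbers land close to the roughly 8 and 15 tok/s quoted above; a slower memory speed would scale both down proportionally.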

There are people who hate Apple almost as much as Igor does who are buying Macs for that unified memory and the 512-bit and 1024-bit buses.

adroc_thurston · Aug 1, 2024

The Hardcard said:
it’s feasible

Technically, yes; market-wise, no, at least for now.

The Hardcard said:
there is a growing community that would gladly pay the higher price, since LLM text generation is severely memory bandwidth constrained

Bubble huffers don't have much time left to live.

The Hardcard said:
There are people who hate Apple almost as much as Igor does who are buying Macs for that unified memory and the 512-bit and 1024-bit buses.

Oh yes, but that's niche.
-halo exists to displace x106 dies and below out of existence.

poke01 · Aug 1, 2024

adroc_thurston said:
That's server levels of memory setups, not feasible.

It’s certainly possible; it will increase costs. It just requires more channels. If AMD wants to keep it around $2000 then 256-bit is the right choice.

poke01 · Aug 1, 2024

adroc_thurston said:
Bubble huffers don't have much time left to live.

I love this term😂

adroc_thurston · Aug 1, 2024

poke01 said:
It’s certainly possible

You're not routing 512b on laptop PCBs.

poke01 said:
just requires more channels.

How do I route all that stuff?

poke01 said:
If AMD wants to keep it around $2000

Less.
That's the idea!

poke01 · Aug 1, 2024

adroc_thurston said:
You're not routing 512b on laptop PCBs.

How does Apple do it? Does the on-package memory help?

adroc_thurston · Aug 1, 2024

poke01 said:
How does Apple do it?

MoP.

poke01 said:
Does the on-package memory help?

Yeah lmao.
It's the same reason why embedded Radeons have done on-carrier memory since time immemorial.

gdansk · Aug 1, 2024

Remember that it is Radeon not Instinct. Radeon balances memory for what gaming needs. Pairing 40 CU at even higher clocks on a 256-bit LPDDR5X bus seems balanced.

But it so happens that another possible market appeared after its conception, one where all they care about is memory bandwidth and memory size. Well... maybe we put AI in the name and try to have ROCm support ready.

The Hardcard · Aug 1, 2024

adroc_thurston said:
Technically, yes; market-wise, no, at least for now.

Bubble huffers don't have much time left to live.

Oh yes, but that's niche.
-halo exists to displace x106 dies and below out of existence.

I am a straight-up bubble huffer. I say that AI functions will be the main reason for sales of computing devices by 2040. I still find it bizarre that so many people on various tech site forums don’t see that AI is THE future of computing in society.

To be sure, new algorithms are needed. The current ones are just stopgaps, as impressive as they may be at these small simple tasks. But, just as current computers are many orders of magnitude more complex and capable than a Commodore 64, neural networks are going to follow the same trajectory, at about triple the speed.

Research is already underway to boost reasoning capacity by 100 to 1000 fold. Those algorithms will be here in a few short years, ones that will make large language models feel like stone age technology. Critically, those won’t be the last ones.

I believe the next 30 years will yield a bigger technological and social transformation than the period from the dawn of civilization to 2024.

Yeah, I’m huffing big time.
