Intel Execution Unit vs Nvidia CUDA Core vs AMD Stream Processors

Status
Not open for further replies.

bhvm

Member
Apr 1, 2009
47
1
71
Hello Friends,
I bought a new laptop, an HP p028TX, which comes with an Intel 4030U (HD 4400) and an Nvidia 830M (256 shaders).

I just had a nice session of BFBC2 and COD MW2. And here comes the surprise: I ended up using the onboard Intel 4400, and the games played well!

Yeah, this is the same crappy Intel graphics that used to be the subject of mockery and puns 4 years ago. I have to admit that Intel has really grown up, on CPUs as well as GPUs.

Now here's my question, Every GPU has some sort of "Cores".
Intel 4400 has 20
Nvidia 830m has 256
Nvidia 9800GT 112
AMD Rs 230m 320
AMD 8570m 384.

I see there's quite a difference in names, hence the confusion. In past years, I knew that AMD shaders were roughly 4~5 times weaker than Nvidia's; I mean, an Nvidia card with 64 or 96 shaders would deliver the same FPS as an AMD card with 320 shaders. Now Intel enters the game, with the fewest 'cores' yet still keeping up. So how do Intel's cores compare to AMD's or Nvidia's?

I am keen on understanding the technical and practical aspects of modern GPUs and figuring out how these cores differ among architectures.
 

hdecharn

Junior Member
Sep 12, 2014
9
0
0
To understand these numbers, you need an overview of their micro-architectures. Open the attached .pdf. (Note: only general-purpose units are shown.)

The basic elements of Intel HD Graphics, AMD GCN, and NVIDIA Maxwell are the Execution Unit, Compute Unit, and Streaming Multiprocessor respectively. They contain ALUs, grouped into one or more SIMDs. (ALUs belonging to the same SIMD execute the same instruction, i.e. there's one decoder per SIMD. They're shown glued together on my diagram.) These basic elements are multi-threaded, i.e. they support multiple concurrent threads (AMD and NVIDIA call them wavefronts and warps respectively), each consisting of multiple kernels (1 kernel = 1 shader instance) and executed in one or more cycles on one SIMD.

Then, the GPU is constructed as a hierarchy of these elements. Intel groups 10 EUs (in Gen 7.5; up to 8 in Gen 8), plus an L1/L2$, to form a Subslice; 2 Subslices plus an L3$ bank form a Slice. The GPU consists of 1 or 2 (IIRC) Slices.

AMD groups up to 16 CUs into a Shader Engine. Up to 4 SEs can be incorporated in the GPU; they share the L2$.

Finally, NVIDIA groups 4 SMs plus an L1/S$ ("S" stands for "shared") to form an SMM. Up to 4 SMMs form a GPC. Then, the GPU consists of one or more GPCs, sharing the L2$.

The term "core" isn't useful when comparing GPUs from different companies: Intel cores = EUs; AMD cores = ALUs; NVIDIA cores = ALUs. The real metric is the number of ALUs (also known as shader cores).
The Intel 4400 has 20 EUs × 2 SIMD × 4 ALUs = 160 ALUs.
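
As a rough sketch of that arithmetic (Python; the function name and its parameters are made up for illustration, and the per-EU figures are the Gen 7.5 ones given in this post), the ALU count falls out of multiplying down the hierarchy. 20 × 2 × 4 works out to 160:

```python
def intel_alu_count(slices, subslices_per_slice, eus_per_subslice,
                    simds_per_eu=2, alus_per_simd=4):
    """Multiply down the Slice -> Subslice -> EU -> SIMD -> ALU hierarchy.

    Defaults assume 2 SIMDs of 4 ALUs per EU, as stated in the post.
    Returns (number of EUs, number of ALUs).
    """
    eus = slices * subslices_per_slice * eus_per_subslice
    return eus, eus * simds_per_eu * alus_per_simd


# HD 4400 as described above: 1 Slice x 2 Subslices x 10 EUs each
eus, alus = intel_alu_count(slices=1, subslices_per_slice=2, eus_per_subslice=10)
print(eus, alus)  # 20 160
```

The EU count is what Intel advertises; the second number is the one to put next to an AMD or NVIDIA "core" count.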

Note: No, AMD shader cores are not really weaker than NVIDIA's.
 

hdecharn

Junior Member
Sep 12, 2014
9
0
0
To understand this, you need to understand the underlying microarchitectures. Open the following PDF.

The basic processing structure in Intel, AMD, and NVIDIA GPUs is the Execution Unit, the Compute Unit, and the Streaming Multiprocessor respectively: three terms equivalent to "core". They're composed of multiple ALUs, grouped into one or more SIMDs. ALUs belonging to the same SIMD (glued together on my diagram) execute the same instruction. Thus, you need only one instruction fetch/decode unit per SIMD (more efficient than one per ALU). Note: neither control units nor special graphics units are represented.

The GPU is a hierarchy of cores. Intel groups 10 EUs and an L1/L2$ to form a subslice; 2 subslices paired with an L3$ bank form a slice; the GPU is composed of 1 or 2 slices. AMD groups up to 16 CUs to form an SE (shader engine); the GPU is composed of up to 4 SEs, sharing the L2$. NVIDIA groups 4 SMs and an L1/S$ to form an SMM; up to 4 SMMs form a GPC; the GPU is composed of one or more GPCs.

Intel's metric is the EU (thus, the HD 4400 has 20 EUs); AMD's metric is the shader core (i.e. the number of ALUs); NVIDIA's metric is the CUDA core (i.e. the number of ALUs). To compare GPUs, the right metric is the number of ALUs.

PS: AMD shader cores aren't slower than NVIDIA's at the same frequency. There are, however, some microarchitectural choices, such as the number of raster and texture units, the internal and external bus widths, and so on, that can make one GPU faster than another.
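
To make the "at the same frequency" caveat concrete, here is a minimal sketch in Python. It assumes each ALU can retire one fused multiply-add (counted as 2 FLOPs) per cycle; the function name is made up, and the two parts below are hypothetical, not real products:

```python
def peak_gflops(alus, clock_ghz, flops_per_alu_per_cycle=2):
    """Theoretical peak throughput in GFLOPS.

    Assumes every ALU retires one FMA (2 FLOPs) each cycle, which is an
    upper bound that real shaders rarely reach.
    """
    return alus * flops_per_alu_per_cycle * clock_ghz


# Hypothetical GPUs: same ALU count, different clocks.
gpu_a = peak_gflops(alus=384, clock_ghz=1.0)
gpu_b = peak_gflops(alus=384, clock_ghz=0.5)
print(gpu_a, gpu_b)  # 768.0 384.0
```

This only captures ALU count and clock. Raster and texture units, bus width and the rest are not modelled, which is exactly why the ALU count alone doesn't settle which GPU is faster.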
 

mindless1

Diamond Member
Aug 11, 2001
8,201
1,500
126
The laptop's 1366 x 768 resolution is why it was playable. The practical approach is not to try to equate generations-old cores but rather to look at some benchmarks.
 

bhvm

Member
Apr 1, 2009
47
1
71
Hello all,
Thanks for providing such detailed know-how on the technology. I guess it's the manufacturers' first job to confuse consumers with technical jargon, deliberately proprietary standards and weird ways of defining the same things.

I was looking for a simple "X Nvidia cores = Y Intel cores = Z AMD cores" type of thing.
Or I would like to see which dedicated Nvidia or AMD cards perform close to the Intel 4400.

Anyway, after spending half a Sunday on Notebookcheck comparing various benchmarks and game FPS,
I made a (very) rough rule of thumb like:
Intel 20 EU = Nvidia 96 Cores = AMD 192 Cores.

http://www.notebookcheck.net/Intel-HD-Graphics-4400.91979.0.html
http://www.notebookcheck.net/NVIDIA-GeForce-9800M-GT.9906.0.html
http://www.notebookcheck.net/AMD-Radeon-HD-8570M.86985.0.html
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
I almost forgot to respond ...

What hdecharn covered was decent, but I would also like to add that you cannot make a direct comparison based on the SIMD units alone ...

You also need to consider other aspects such as occupancy and resources to get a clearer picture. With current Intel GPUs even at an extreme SIMD32 mode you still get 128 registers for each work group!

For AMD and Nvidia GPUs, if you want to achieve full occupancy you must use fewer than 26 registers on GCN or fewer than 33 registers on Maxwell for each work group!
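
A quick sketch of where limits like these can come from (Python). The hardware figures are assumptions not stated in this thread: 256 vector registers per SIMD lane and up to 10 resident wavefronts per SIMD on GCN, and 65536 registers with up to 2048 resident threads per multiprocessor on Maxwell. The function names are made up, and register allocation granularity is ignored:

```python
def max_regs_for_full_occupancy(regs_available, max_resident):
    """Largest per-thread register count at which max_resident still fit."""
    return regs_available // max_resident


def resident_at(regs_per_thread, regs_available, max_resident):
    """How many wavefronts/threads stay resident at a given register usage."""
    return min(max_resident, regs_available // regs_per_thread)


# Assumed figures, see the note above.
gcn = max_regs_for_full_occupancy(regs_available=256, max_resident=10)
maxwell = max_regs_for_full_occupancy(regs_available=65536, max_resident=2048)
print(gcn, maxwell)  # 25 32

# A register-hungry shader on the assumed GCN figures: 64 registers each
print(resident_at(64, regs_available=256, max_resident=10))  # 4
```

With these assumptions the budget comes out at 25 and 32 registers, which lines up with "fewer than 26" and "fewer than 33"; go above that and fewer wavefronts or warps fit at once.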

This discrepancy alone can somewhat explain why Intel's execution unit ALUs perform much better than the other IHVs' when the shader invocation is under high register pressure ...

Then there are other issues about occupancy such as multi-issuing different types of instructions for higher utilization ...
 

hdecharn

Junior Member
Sep 12, 2014
9
0
0
Intel 20 EU = Nvidia 96 Cores = Amd 192 Cores.
It's not that simple. You need to take frequency, bus width, cache and register file sizes, the number of special units, and more into account! Again, Intel, NVIDIA and AMD ALUs are equivalent, but it's the overall micro-architecture that determines performance, not just the ALU count. From one GPU generation to the next, even from the same company, ALU occupancy can change dramatically! Trying to establish such a rule of thumb isn't meaningful.
 

bhvm

Member
Apr 1, 2009
47
1
71
It's not that simple. You need to take frequency, bus width, cache and register file sizes, the number of special units, and more into account! Again, Intel, NVIDIA and AMD ALUs are equivalent, but it's the overall micro-architecture that determines performance, not just the ALU count. From one GPU generation to the next, even from the same company, ALU occupancy can change dramatically! Trying to establish such a rule of thumb isn't meaningful.

It's just like comparing car engines. No two V12s are going to be the same. However, a layman knows that a V12 at least performs better than a V6 or V8, for that matter. As I said, these are rough estimates, and direct comparisons are just impossible.
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
Intel 20 EU= Nvidia 96 Cores= Amd 192 Cores.

It changes a bit with each generation. The ratio was much closer to 3:4 Nvidia:AMD in the Kepler era, but it was around 1:4 in the Fermi era. With Maxwell it has come down to around 1:2.
 