Are the next gen consoles the realization of AMD's HSA dream?

Page 8

galego

Golden Member
Apr 10, 2013
1,091
0
0
You are really out on deep water.

How much do the driver+API account for in execution time in percentage in DX10+?

And since the 10x factor is well known, can you please document it? Or is it just made up?

Richard Huddy:

PCs aren't just a bit more powerful than PS3 and Xbox 360 - they're up to 10 times more powerful. So why aren't PC games 10 times their console equivalents? Because of Windows' meddling DirectX API (application programming interface), that's why.
http://www.eurogamer.net/articles/2011-03-21-pcs-have-10x-console-horsepower-amd

Timothy Lottes:

As a PC guy who knows hardware to the metal, I spend most of my days in frustration knowing damn well what I could do with the hardware, but what I cannot do because Microsoft and IHVs wont provide low-level GPU access in PC APIs. One simple example, drawcalls on PC have easily 10x to 100x the overhead of a console with a libGCM style API....
http://playstation-techzone.com/201...ance-potential-discussed-by-nvidia-dev-photos

J. Carmack:

It is extremely frustrating knowing that the hardware we've got on the PC is often ten times as powerful as the consoles but it has honestly been a struggle in many cases to get the game running at 60 frames per second on the PC like it does on a 360 [...] A lot of it's driver overhead issues, where there's so much that we do in the game, all of this dynamic texture updating where on the console we say 'alright, we've got a new page of data', we put that page in and update the page table that points to that.

On the console that may just be a matter of writing it to memory, it's like 'here's the texture, let's calculate exactly where this part of the page table is' and then we just poke it right in there [...] On the PC that turns into potentially a glTexSubImage2D and if you're a programmer and you start single-stepping through that you'll cry. You won't make it back out. It'll just take forever.

http://www.computerandvideogames.co...d-that-pc-is-10-times-as-powerful-as-ps3-360/

Ben Hardwidge:

On consoles, you can draw maybe 10,000 or 20,000 chunks of geometry in a frame, and you can do that at 30-60fps. On a PC, you can't typically draw more than 2-3,000 without getting into trouble with performance, and that's quite surprising - the PC can actually show you only a tenth of the performance if you need a separate batch for each draw call.

http://www.bit-tech.net/hardware/graphics/2011/03/16/farewell-to-directx/2
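
To put the batch counts quoted above in per-call terms, here is a rough sketch. It assumes, purely for illustration, that the whole CPU frame budget goes into submitting draw calls (no real game works like that), and the function name is made up for this example.

```python
def implied_cost_per_draw_us(draws_per_frame: int, fps: float) -> float:
    """Upper bound on CPU microseconds per draw call, assuming the entire
    frame time is spent issuing draws (an illustrative simplification)."""
    frame_time_us = 1_000_000 / fps
    return frame_time_us / draws_per_frame


# Batch counts taken from the quote: up to 20,000 on console, up to 3,000 on PC.
for label, draws in [("console", 20_000), ("PC", 3_000)]:
    cost = implied_cost_per_draw_us(draws, 30)
    print(f"{label}: at most {cost:.1f} us per draw at 30 fps")
```

Under that assumption the per-draw budget on PC is several times larger than on console for the same frame rate, which is the gap the quote is describing.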
 

beginner99

Diamond Member
Jun 2, 2009
5,223
1,598
136
...snipped...

So? If we consider that the PS3 and Xbox 360 mostly run at less than 720p and 30 FPS, there is already a very good reason why PCs require much more power: they run the same games at 1080p and 60 FPS. For that alone the PC needs about 4x more power. Then add on top that image quality is better, view distance is higher and so forth, which adds even more.
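
The "about 4x" figure can be checked with a quick calculation. This is only a sketch: it assumes the console renders at exactly 720p and 30 FPS and that rendering cost scales linearly with pixels per second, which ignores CPU-bound work. The function name is made up for the example.

```python
def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Raw pixel throughput needed for a given resolution and frame rate."""
    return width * height * fps


console = pixels_per_second(1280, 720, 30)   # 720p at 30 FPS
pc = pixels_per_second(1920, 1080, 60)       # 1080p at 60 FPS

print(f"PC needs {pc / console:.2f}x the pixel throughput")
```

So 1080p at 60 FPS works out to 4.5 times the pixels per second of 720p at 30 FPS, slightly more if the console renders below 720p.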

And then most PC games were just poor console ports. If developers spent just as much time optimizing for PC as for a specific console... It would be like saying my C++ program is much faster than your Java while half of the C++ program is in assembler and took 100x longer to develop.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
If the overhead really is 10x, how have PC's been running the same games much better (way higher res and framerate) than consoles for most of this console generation?

Whatever advantage consoles might have will probably already be made up when the 20nm GPU's come around.

The overhead is mostly on shaders and the CPU....
ROPs, TMUs and fixed-function hardware alike don't suffer from (much) overhead...and they are the main reason why consoles today can't reach 1080p.

20nm GPUs won't do much....but the stacked memory, for sure, will :thumbsup:
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
Do you really think PS4 is better than a PC with a Titan? If that's true I paid two times more just for the GPU for an inferior gaming platform.

The Titan has only 4.5 TFLOPS, lacks HSA, lacks a unified memory architecture, cannot be coded at the metal level, and runs on a much, much slower PCIe bus.

One of Nvidia's developers has said that the "PS4 could be years ahead of PC" after reading the specs.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
The Titan has only 4.5 TFLOPS, lacks HSA, lacks a unified memory architecture, cannot be coded at the metal level, and runs on a much, much slower PCIe bus.

The GPU is bolted on via a bus similar to PCIe. So there is no raw speed advantage, only latency.

HSA is still a pipedream.

And as said before, even a 540M card plays console ports at the same or better rate than consoles. And that's even with more and better visual options for the PC.

Not to mention, yet again, the API and draw calls have nothing to do with GPU performance. Only the CPU. But again, draw calls only account for so much. And if you need to make so many draw calls, you might be doing it wrong to start with.
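
One common way to avoid "so many draw calls" is batching: grouping objects that share a mesh and material so they can be submitted together. The sketch below only models the bookkeeping in Python; the `SceneObject` type, the mesh and material names and the scene contents are all hypothetical, and no real graphics API is called.

```python
from collections import defaultdict
from typing import NamedTuple


class SceneObject(NamedTuple):
    mesh: str        # hypothetical mesh identifier
    material: str    # hypothetical material identifier
    position: tuple  # (x, y, z) world position


def build_batches(objects):
    """Group objects by (mesh, material) so each group could be submitted
    as one instanced draw instead of one draw per object."""
    batches = defaultdict(list)
    for obj in objects:
        batches[(obj.mesh, obj.material)].append(obj.position)
    return dict(batches)


# Made-up scene: 1000 trees and 500 rocks.
scene = [SceneObject("tree", "bark", (float(i), 0.0, 0.0)) for i in range(1000)]
scene += [SceneObject("rock", "stone", (0.0, float(i), 0.0)) for i in range(500)]

batches = build_batches(scene)
print(f"{len(scene)} objects -> {len(batches)} draw calls after batching")
```

This only helps when many objects really do share state; a scene full of unique meshes and materials still needs one call each, which is where the per-call overhead bites.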

And if you wish to compare screenshots:

 

vltra

Junior Member
Apr 18, 2013
6
0
66
It's important to keep in mind what GCN is as an architecture (and its role in HSA). Its main advantage is compute. IMO, this explains the slower x86-based processor. They are betting on leveraging parallelism, which is basically what HSA is about, so yeah, it pretty much is what AMD envisioned, more or less. On Tom's there's a recent review of the Titan in professional applications. In OpenCL the 7970 GHz was twice as fast as the Titan. Imagine what kind of compute power is possible with low-level access; it beats anything an x86 processor could provide.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
Not sure about this but didn't DX 10 and 11 reduce problems with draw calls, etc?

No. DX11 reduced the overhead by a factor of less than two in some very specific scenarios. The rest of the calls remain at about 10x.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
So? If we consider that the PS3 and Xbox 360 mostly run at less than 720p and 30 FPS, there is already a very good reason why PCs require much more power: they run the same games at 1080p and 60 FPS. For that alone the PC needs about 4x more power. Then add on top that image quality is better, view distance is higher and so forth, which adds even more.

Snipping the quotes and ignoring them will not hide the facts. I reproduce again part of the Carmack quote:

It is extremely frustrating knowing that the hardware we've got on the PC is often ten times as powerful as the consoles but it has honestly been a struggle in many cases to get the game running at 60 frames per second on the PC like it does on a 360

And then most PC games were just poor console ports. If developers spent just as much time optimizing for PC as for a specific console... It would be like saying my C++ program is much faster than your Java while half of the C++ program is in assembler and took 100x longer to develop.

First, the overhead mentioned has nothing to do with optimization. You cannot program at the metal level on a Windows gaming PC because the OS does not allow you to directly access the hardware.

Second, a game developer cannot optimize for PC, because there is no one PC, but hundreds of millions of different PCs with different hardware, operating systems, drivers...
 

blckgrffn

Diamond Member
May 1, 2003
9,198
3,185
136
www.teamjuchems.com
Hey, come on. Don't play stupid. It's well known the earth is at the center of the universe!

I don't know much about this stuff but common sense tells me that this 10x number is BS. Maybe certain specific tasks have that overhead but in total the effect will be much smaller.

It's the same myth that Java is ultra slow and everything will magically be 10x faster in C++.

It's more like Java is what it is but if you hand tuned the same code in assembly language for a given platform it could run 10x faster.

I don't see why that is hard to fathom.

Not that this is free, but you get to spend the time optimizing for one platform vs all the testing required to make it work on the infinite number of combinations of PCs, etc.

The more I read, the more I think the PS4 is the console to get next generation.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
If the overhead really is 10x, how have PC's been running the same games much better (way higher res and framerate) than consoles for most of this console generation?
I agree with those who think the 10x factor is some corner case, e.g. drawing single triangles. But I'm no game coder (just 3D/demo effects in the past). I think there are some documents out there giving more color to this claim.

The higher res isn't affected by API call overhead as long as the game doesn't automatically adapt its LOD accordingly.
 

galego

Golden Member
Apr 10, 2013
1,091
0
0
DX11 didn't reduce anything. It was all DX10 that changed the execution API and lowered overhead.

No. I was considering the multi-threaded display lists, which were introduced in DX11 and help reduce the cost of some draws by the mentioned factor.
 

Sleepingforest

Platinum Member
Nov 18, 2012
2,375
0
76
Snipping the quotes and ignoring them will not hide the facts. I reproduce again part of the Carmack quote:

First, the overhead has nothing to do with optimization. You cannot program at the metal level on a gaming PC using Windows 7 because the OS does not allow you to directly access the hardware.

Second, a developer cannot optimize for PC, because there is no one PC, but hundreds of millions of different PCs with different operating systems, drivers...

But if you look at the actual games, you'll see that most have a cap at 30 frames per second with v-sync. For example, Bioshock Infinite easily runs at 60 FPS with v-sync on a 7970 without dropping to 30 frames a second. An Xbox can only get 30-50 with tearing and no v-sync, or 30 pretty constantly with v-sync, as seen in this discussion.

Secondly, a developer CAN optimize for PC: there are two common sets of drivers, and each governs a set of video cards which are essentially identical, differing only in the amount of hardware available to perform the calculations. Most games only release for Windows, for which there are at most 3 relevant versions: XP, 7 and possibly 8.

I mean, look in General Hardware. 90% of the recommendations are for an i5 plus 7970 build at $1000 with Windows 7 as the OS.
 

Pilum

Member
Aug 27, 2012
182
3
81
It's more like Java is what it is but if you hand tuned the same code in assembly language for a given platform it could run 10x faster.
Against a modern JIT? No chance, given programmers of equivalent skills. A factor of 2 is believable, maybe 3, but more is pretty much impossible. Of course many unskilled programmers will tend to use the "easy" languages, giving the impression of slow language performance, but what you observe in this case isn't the difference of language quality but programmer skill.
 

Spjut

Senior member
Apr 9, 2011
928
149
106
And as said before, even a 540M card plays console ports at the same or better rate than consoles. And that's even with more and better visual options for the PC.

Not to mention, yet again, the API and draw calls have nothing to do with GPU performance. Only the CPU. But again, draw calls only account for so much. And if you need to make so many draw calls, you might be doing it wrong to start with.

Read the quote from the Codemaster graphics programmer.

The PS3/360 quickly run into being GPU-bound, but with the next-gen consoles, the GPU limitations will be raised.
The PC's current APIs are mostly holding back things on the CPU side, not GPU.

The CPU GHz race has for the most part ended, so the improvements we get are mostly from increased IPC.

Today's (eight-year-old) consoles outperform modern PCs with DX11 in draw calls, so it's pretty clear that this will be even worse for the PC during next-gen.
Hopefully we'll get DX12 or OpenGL 5 to remedy that problem, but even then, it will take time before those APIs get used.
 

BD2003

Lifer
Oct 9, 1999
16,815
1
76
So then....seems to be general consensus that an APU/HSA has some advantage, the magnitude of which is debatable and dependent on the particular workload.

Then we have the traditional console advantage, of being able to write closer to the hardware.

So my next question.....how do these interact? Merely additive? Or does some other bottleneck then take precedence and we get diminishing returns? Or synergistic, where the sum of the advantages is greater than the individual?

If synergistic....doesn't this widen the gulf between PC and Console (given generally equivalent hardware) more than usual?
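
One way to reason about the additive vs. synergistic question is an Amdahl-style model, where each improvement only speeds up its own slice of the frame time. This is a sketch under made-up numbers: the 40% / 20% / 40% split and the 10x and 5x factors are hypothetical placeholders, not measurements.

```python
def combined_speedup(fractions, speedups):
    """Amdahl-style estimate: each fraction of the original time is
    divided by its own speedup; the fractions must sum to 1."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    new_time = sum(f / s for f, s in zip(fractions, speedups))
    return 1.0 / new_time


# Hypothetical split of frame time: API overhead, CPU<->GPU copies, everything else.
split = [0.4, 0.2, 0.4]

api_only = combined_speedup(split, [10, 1, 1])   # low-level API access alone
hsa_only = combined_speedup(split, [1, 5, 1])    # shared memory alone
both = combined_speedup(split, [10, 5, 1])       # both together

print(f"API only: {api_only:.2f}x, HSA only: {hsa_only:.2f}x, both: {both:.2f}x")
```

In this toy model the two gains together come out larger than the product of the individual gains, but the total can never exceed what the untouched 40% allows (2.5x here), so both "synergistic" and "diminishing returns" can be true at once.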
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
@BD2003

I think they're synergistic.

Because you're removing bottlenecks at the same time you're coding to the metal = sick returns.

doesn't this widen the gulf between PC and Console (given generally equivalent hardware) more than usual?
Yes but again... there are PCs out there that'll just brute force their disadvantages away by using more powerful hardware.


Jaguar is a nifty lil CPU, but it's no match IPC-wise for the latest Intel CPUs.
Also the "cores" inside the PS4 APU will probably only run at like 2.2 GHz or something low like that.

A PC enthusiast could probably have a CPU that's like ~50% faster IPC-wise,
running at double the speed, and thus have 3 times as much CPU capability.

The same is true for the GPU.
1.84 TFLOPS is decent for an APU, but throw in 4 x Titans in SLI (4.5 TFLOPS x 4 = 18 TFLOPS)

and you're basically 10 times as fast (GPU-wise).
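
The two ratios above can be worked out directly. This sketch takes the figures from the post at face value and assumes perfect multi-GPU scaling and peak TFLOPS as the measure, neither of which holds in real games; the function names are made up for the example.

```python
def relative_cpu(ipc_ratio: float, clock_ratio: float) -> float:
    """Per-core CPU throughput ratio: IPC ratio times clock ratio."""
    return ipc_ratio * clock_ratio


def relative_gpu(cards: int, tflops_per_card: float, baseline_tflops: float) -> float:
    """Peak-FLOPS ratio, assuming perfect scaling across cards."""
    return cards * tflops_per_card / baseline_tflops


cpu = relative_cpu(1.5, 2.0)        # ~50% better IPC at double the clock
gpu = relative_gpu(4, 4.5, 1.84)    # 4 x Titan vs the PS4's 1.84 TFLOPS

print(f"CPU: {cpu:.1f}x per core, GPU: {gpu:.1f}x peak")
```

That gives 3x on the CPU side and a bit under 10x on the GPU side, matching the rough numbers in the post.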

There are people out there willing to spend $8000+ on their PC.
They will have faster gaming machines, overhead or not.
And each year, the price of building such a machine will drop.


All that said... the $400 or so a PS4 will cost will probably be good "value" in terms of the gaming performance you get for it, compared to a PC.
 

BD2003

Lifer
Oct 9, 1999
16,815
1
76
I don't doubt for a nanosecond that brute force always wins....it was never my intention to imply that the PS4 will be some unbeatable super computer. I'm both a console and PC gamer, always have been, always will be. I'm only trying to understand the extent of the challenge traditional PCs are up against going into the next gen.
 

gorobei

Diamond Member
Jan 7, 2007
3,714
1,069
136
So then....seems to be general consensus that an APU/HSA has some advantage, the magnitude of which is debatable and dependent on the particular workload.

-console advantage traditionally (based on john carmack quote) is 2x equivalent pc hardware
-libgcm should be a larger factor.
-hsa shared memory and apu on die lower latency and lower power(sweeney/richards quote) 1/10 the power and much better latency, unknown performance advantage.
-single hardware profile/i.e. GCN (sweeney) unknown but maybe 1 order of magnitude or at least enough to be significant and worth the price in dev time/resources.
-lack of dx layer overhead is unknown(draw calls affect how many things you can have on screen, this affects world building in terms of mesh variety and complexity)

the bottlenecks in previous gen consoles were not enough ram, proprietary unique cpu, less power, different formats.
if the ps4 devs are to be believed, the adoption of x86 with gddr5, hsa memory sharing and libgcm metal calls eliminates most of those bottlenecks. the ability to use x86 dev tools is just productivity gravy and speeds up game release schedules.

chances are it will put consoles on par with a 680 or 7970 at 1080@60. according to lottes, if a game dev is willing to code exclusively for the ps4 (with code designed specifically for the apu structure), the performance could go beyond a 680/7970 card. he doesn't say by how much, so that is the area of debate.
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
I'm only trying to understand the extent of the challenge traditional PCs are up against going into the next gen.
In that case, I see the unified memory space and HSA as being a big thing.
Once developers start making use of it, you could probably see decent
additional performance squeezed out of the architecture.

AMD has a program that used "Haar Face Detect",
a commonly used multi-stage video analysis algorithm.

They optimised it to run on CPU+GPU with OpenCL, and also for HSA.
The HSA implementation gave a 2.3x performance increase at a 2.4x reduction in power.

OpenCL (with a GPU) already does some things 6-10 times as fast as a CPU would.
If HSA can more than double that amount (efficient usage of both), it's gonna do good things for GPGPU.
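
Taken together, those two HSA figures imply a performance-per-watt gain, which is just their product. A quick sketch, assuming both numbers come from the same run of the same workload (the post does not say); the function name is made up for the example.

```python
def perf_per_watt_gain(speedup: float, power_reduction: float) -> float:
    """New perf/watt relative to old: (speedup * P) / (W / power_reduction),
    divided by the old P / W, which simplifies to speedup * power_reduction."""
    return speedup * power_reduction


gain = perf_per_watt_gain(2.3, 2.4)
print(f"perf per watt: about {gain:.2f}x")
```

So if both figures hold for the same run, the HSA version does roughly five and a half times the work per watt of the plain OpenCL version.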

I guess it depends on how much these GPGPU abilities get used in the future of gaming.

If it really takes off, you'll be seeing the PS4 doing some GPGPU stuff that'll bring much bigger GPUs (in PCs) to their knees.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
It's important to keep in mind what GCN is as an architecture (and its role in HSA). Its main advantage is compute. IMO, this explains the slower x86-based processor. They are betting on leveraging parallelism, which is basically what HSA is about, so yeah, it pretty much is what AMD envisioned, more or less. On Tom's there's a recent review of the Titan in professional applications. In OpenCL the 7970 GHz was twice as fast as the Titan. Imagine what kind of compute power is possible with low-level access; it beats anything an x86 processor could provide.

http://www.tomshardware.com/reviews/firepro-w8000-w9000-benchmark,3265-16.html

Titan is using the wrong drivers. Between the Quadro 6000 and W9000 it's pretty much a mixed bag (and this is Fermi vs GCN) on OpenCL.

Please, let's not forget that many of these console developers are people who still haven't figured out how to use more than two cores in a PC game.
 

blckgrffn

Diamond Member
May 1, 2003
9,198
3,185
136
www.teamjuchems.com
Against a modern JIT? No chance, given programmers of equivalent skills. A factor of 2 is believable, maybe 3, but more is pretty much impossible. Of course many unskilled programmers will tend to use the "easy" languages, giving the impression of slow language performance, but what you observe in this case isn't the difference of language quality but programmer skill.

Haha, fair point. It is about the individual developer.

Taking Java (or something like it) coded to the point where it "gets the job done" and - let's face it - a lot of code is only written well enough to pass QA - vs the hand-tuned assembler for a single platform like this... Yeah. I don't know how many supremely skilled coders the world has - and how many make games - but there aren't enough of them.

The thing is, no matter what the language, we are going to get some really intense optimization for this platform that will last for ~8 years.

Investment on the PC has a much shorter shelf life, making it a poorer investment.

Particularly when the hardware is so powerful it provides great CYA vs some inefficient coding.
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
@Enigmoid

It depends on driver optimisations.
How else do you explain something like this:



Where the Quadro 6000 is LESS than 1/5th the performance of a 7970 1 GHz ed.
Is this because the Quadro isn't optimised for this benchmark? Maybe.
Is the other link you provided the same issue? Probably.

But this has nothing to do with HSA (title/topic of thread).
So please don't derail the thread into an AMD vs Nvidia thingy.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
That is OpenCL. And if you believe AMD is not optimizing towards the three OpenCL benchmarks out there, then yeah...
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
It's important to keep in mind what GCN is as an architecture (and its role in HSA). Its main advantage is compute. IMO, this explains the slower x86-based processor. They are betting on leveraging parallelism, which is basically what HSA is about, so yeah, it pretty much is what AMD envisioned, more or less. On Tom's there's a recent review of the Titan in professional applications. In OpenCL the 7970 GHz was twice as fast as the Titan. Imagine what kind of compute power is possible with low-level access; it beats anything an x86 processor could provide.

@Enigmoid

It depends on driver optimisations.
How else do you explain something like this:



Where the Quadro 6000 is LESS than 1/5th the performance of a 7970 1 GHz ed.
Is this because the Quadro isn't optimised for this benchmark? Maybe.
Is the other link you provided the same issue? Probably.

But this has nothing to do with HSA (title/topic of thread).
So please don't derail the thread into an AMD vs Nvidia thingy.

I'm assuming that's because the AMD card has a lot more DP than the Quadro (basically twice).

It's all drivers. Nvidia could improve OpenCL performance on their cards but they do not.



Depends really on the situation and test.

I'm posting this to show that OpenCL (and compute) performance depends completely on what is being done with it. Kepler sucks at double-precision compute but is decent at single. Anything requiring single precision is a driver problem (though Nvidia may never release drivers fixing it), not a hardware problem.
 