[Eurogamer] Deep Dive on the PS4 PRO GPU (Polaris + Vega features!)

Det0x

Golden Member
Sep 11, 2014
1,061
3,105
136
Interesting bits for me are:

"Finally, there's better support of variables such as half-floats. To date, with the AMD architectures, a half-float would take the same internal space as a full 32-bit float. There hasn't been much advantage to using them. With Polaris though, it's possible to place two half-floats side by side in a register, which means if you're willing to mark which variables in a shader program are fine with 16-bits of storage, you can use twice as many. Annotate your shader program, say which variables are 16-bit, then you'll use fewer vector registers."
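For anyone wondering what "two half-floats side by side in a register" looks like in practice, here is a minimal CPU-side sketch in Python. It only illustrates the storage idea and is not how the GPU or any shader compiler does it. The helper names `pack_half2` and `unpack_half2` are made up for this example.

```python
import struct

def pack_half2(lo, hi):
    """Round two floats to IEEE 754 half precision and pack them into one 32-bit word."""
    # 'e' is the struct format code for a 16-bit half-precision float;
    # 'H' reinterprets the same two bytes as an unsigned 16-bit integer.
    lo_bits, = struct.unpack('<H', struct.pack('<e', lo))
    hi_bits, = struct.unpack('<H', struct.pack('<e', hi))
    return (hi_bits << 16) | lo_bits

def unpack_half2(word):
    """Split a 32-bit word back into its two half-precision values."""
    lo, = struct.unpack('<e', struct.pack('<H', word & 0xFFFF))
    hi, = struct.unpack('<e', struct.pack('<H', (word >> 16) & 0xFFFF))
    return lo, hi

# Two values now occupy the space that one 32-bit float would take.
word = pack_half2(1.5, -0.25)
print(hex(word))
print(unpack_half2(word))
```

The trade-off is precision: a half-float keeps only about three decimal digits, which is why the shader author has to mark which variables can tolerate it.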

The enhancements in PS4 Pro are also geared to extracting more utilisation from the base AMD compute units.

"Multiple wavefronts running on a CU are a great thing because as one wavefront is going out to load texture or other memory, the other wavefronts can happily do computation. It means your utilisation of vector ALU goes up," Cerny shares.

"Anything you can do to put more wavefronts on a CU is good, to get more running on a CU. There are a limited number of vector registers so if you use fewer vector registers, you can have more wavefronts and then your performance increases, so that's what native 16-bit support targets. It allows more wavefronts to run at the same time."
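The register/wavefront trade-off Cerny describes can be put into rough numbers. The sketch below assumes the commonly cited GCN limits of 256 vector registers per lane per SIMD and at most 10 wavefronts per SIMD. Neither figure is stated in the article, and the model ignores register allocation granularity and other limits (LDS, scalar registers), so treat it as a back-of-the-envelope estimate only.

```python
# Assumed GCN figures, not taken from the article:
MAX_WAVES_PER_SIMD = 10   # hardware cap on resident wavefronts per SIMD
VGPRS_PER_LANE = 256      # vector register budget per lane on one SIMD

def waves_per_simd(vgprs_per_thread):
    """Estimate how many wavefronts fit on one SIMD for a given VGPR usage."""
    if vgprs_per_thread <= 0:
        raise ValueError("a shader must use at least one vector register")
    return min(MAX_WAVES_PER_SIMD, VGPRS_PER_LANE // vgprs_per_thread)

# A hypothetical shader that drops from 84 to 42 registers by using
# packed 16-bit variables doubles the wavefronts that can stay resident.
for vgprs in (84, 64, 42, 32):
    print(vgprs, "VGPRs ->", waves_per_simd(vgprs), "waves per SIMD")
```

More resident wavefronts gives the scheduler more work to switch to while one wavefront waits on memory, which is the utilisation gain being described.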

and:

"We can have custom features and they can eventually end up on the [AMD] roadmap," Cerny says proudly. "So the ACEs... I was very passionate about asynchronous compute, so we did a lot of work there for the original PlayStation 4 and that ended up getting incorporated into subsequent AMD GPUs, which is nice because the PC development community gets very familiar with those techniques. It can help us when the parts of GPUs that we are passionate about are used in the PC space."

In actual fact, two new AMD roadmap features debut in the Pro, ahead of their release in upcoming Radeon PC products - presumably the Vega GPUs due either late this year or early next year.

"One of the features appearing for the first time is the handling of 16-bit variables - it's possible to perform two 16-bit operations at a time instead of one 32-bit operation," he says, confirming what we learned during our visit to VooFoo Studios to check out Mantis Burn Racing. "In other words, at full floats, we have 4.2 teraflops. With half-floats, it's now double that, which is to say, 8.4 teraflops in 16-bit computation. This has the potential to radically increase performance."
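The 4.2 and 8.4 teraflop figures can be sanity-checked with simple arithmetic. The sketch below assumes the widely reported PS4 Pro configuration of 36 compute units at 911 MHz, with 64 lanes per compute unit and a fused multiply-add counted as two operations. Those inputs are not given in the quoted text, so they are assumptions here.

```python
def peak_tflops(compute_units, lanes_per_cu, clock_mhz, ops_per_lane_per_clock):
    """Theoretical peak throughput in teraflops."""
    ops_per_second = compute_units * lanes_per_cu * ops_per_lane_per_clock * clock_mhz * 1e6
    return ops_per_second / 1e12

# Assumed PS4 Pro figures: 36 CUs, 64 lanes each, 911 MHz.
fp32 = peak_tflops(36, 64, 911, 2)  # one FMA per lane per clock = 2 flops
fp16 = peak_tflops(36, 64, 911, 4)  # two packed FMAs per lane per clock = 4 flops

print(round(fp32, 1), "TFLOPS at 32-bit")
print(round(fp16, 1), "TFLOPS at 16-bit")
```

These are peak numbers. Real shaders only approach the doubled rate where the work can actually be expressed as packed 16-bit operations.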

A work distributor is also added to the GPU design, designed to improve efficiency through more intelligent distribution of work.

"Once a GPU gets to a certain size, it's important for the GPU to have a centralised brain that intelligently distributes and load-balances the geometry rendered. So it's something that's very focused on, say, geometry shading and tessellation, though there is some basic vertex work as well that it will distribute," Mark Cerny shares, before explaining how it improves on AMD's existing architecture.

"The work distributor in PS4 Pro is very advanced. Not only does it have the fairly dramatic tessellation improvements from Polaris, it also has some post-Polaris functionality that accelerates rendering in scenes with many small objects... So the improvement is that a single patch is intelligently distributed between a number of compute units, and that's trickier than it sounds because the process of sub-dividing and rendering a patch is quite complex."

And:

Beyond that, we're moving into the juicy stuff - the custom hardware that Sony has introduced, elements of the 'secret sauce' that allow the Pro graphics core to punch so far above its weight. In creating 4K framebuffers, a lot of the technological underpinnings are actually based on advanced anti-aliasing work with the creation of new buffers that can be exploited in a number of ways.

Right now, post-process anti-aliasing techniques like FXAA or SMAA have their limits. Edge detection accuracy varies dramatically. Searches based on high contrast differentials, depth or normal maps - or a combination - all have limitations. Sony had fashioned its own, highly innovative solution.

"We'd really like to know where the object and triangle boundaries are when performing spatial anti-aliasing, but contrast, Z [depth] and normal are all imperfect solutions," Cerny says. "We'd also like to track the information from frame to frame because we're performing temporal anti-aliasing. It would be great to know the relationship between the previous frame and the current frame better. Our solution to this long-standing problem in computer graphics is the ID buffer. It's like a super-stencil. It's a separate buffer written by custom hardware that contains the object ID."

It's all hardware based, written at the same time as the Z buffer, with no pixel shader invocation required and it operates at the same resolution as the Z buffer. For the first time, objects and their coordinates in world-space can be tracked, even individual triangles can be identified. Modern GPUs don't have this access to the triangle count without a huge impact on performance.

"As a result of the ID buffer, you can now know where the edges of objects and triangles are and track them from frame to frame, because you can use the same ID from frame to frame," Cerny explains. "So it's a new tool to the developer toolbox that's pretty transformative in terms of the techniques it enables. And I'm going to explain two different techniques that use the buffer - one simpler that's geometry rendering and one more complex, the checkerboard."
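To make the edge-finding idea concrete, here is a toy CPU-side sketch in Python. It is not Sony's hardware or API. It treats the ID buffer as a 2D grid of object IDs and marks a pixel as an edge when its right or lower neighbour carries a different ID. The function name `id_edges` and the sample frame are invented for illustration.

```python
def id_edges(id_buffer):
    """Mark pixels whose right or lower neighbour belongs to a different object."""
    height = len(id_buffer)
    width = len(id_buffer[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            here = id_buffer[y][x]
            # Compare against the right-hand neighbour, if there is one.
            if x + 1 < width and id_buffer[y][x + 1] != here:
                edges[y][x] = True
            # Compare against the neighbour below, if there is one.
            if y + 1 < height and id_buffer[y + 1][x] != here:
                edges[y][x] = True
    return edges

# Hypothetical 4x4 frame: object 7 drawn over background object 0.
frame = [
    [0, 0, 0, 0],
    [0, 7, 7, 0],
    [0, 7, 7, 0],
    [0, 0, 0, 0],
]

for row in id_edges(frame):
    print(''.join('#' if is_edge else '.' for is_edge in row))
```

Unlike contrast, depth or normal heuristics, the comparison is exact: two pixels either belong to the same object or they do not. Because IDs persist from frame to frame, the same lookup could in principle be used to match a pixel against the previous frame.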



 

HurleyBird

Platinum Member
Apr 22, 2003
2,726
1,342
136
The ID buffer seems like it could be a game changer. A pretty obvious innovation in hindsight.

The extra 1GB of DDR3 was a bit of a surprise as well, but does a good job of explaining why the Pro has more usable memory for games.

This GPU is going to be able to hit way above its weight class. Certainly above a 480 despite a modest deficit in raw numbers. Console devs will make use of FP16 everywhere they can get away with it given time. If Vega has the ID buffer as well, I could see it besting the new Titan given proper software support. Of course, I imagine most rendering techniques would be incompatible with other GPUs, making any comparison necessarily apples-to-oranges.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Very, very impressive. The ID Buffer idea is awesome. I like the idea of another player in the market actively pushing new high performance GPU hardware techniques. In essence this allows Sony and the PlayStation team to enter into competition without having to enter via the same market strategy as the dGPU makers. It's another group out there pushing technology forward along with AMD and nVidia, so it can only benefit game design.

The idea of creating a 4k image in a new manner using these advanced techniques is extremely interesting. There's a sea change with using data from one frame to another and composing images fundamentally in some different ways.

Very cool.
 

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
Post-polaris functionality eh?

Wonder if they'll launch a 480 replacement with those.
Well, if that functionality is not exposed in DX12 then it would be a lost cause. They may also be saving the updates for Vega, on the thinking that the cost-benefit of implementing them in Polaris 10 right now does not work out.
 
May 11, 2008
20,055
1,290
126
If this is correct, it will be a serious game changer. This is also going to debut in the upcoming Vega.
In situations where there are a lot of these operations, performance should greatly increase.
2017 is going to be very interesting.
I hope Anandtech will then return with a nice detailed Vega review.

"Finally, there's better support of variables such as half-floats. To date, with the AMD architectures, a half-float would take the same internal space as a full 32-bit float. There hasn't been much advantage to using them. With Polaris though, it's possible to place two half-floats side by side in a register, which means if you're willing to mark which variables in a shader program are fine with 16-bits of storage, you can use twice as many. Annotate your shader program, say which variables are 16-bit, then you'll use fewer vector registers."

The enhancements in PS4 Pro are also geared to extracting more utilisation from the base AMD compute units.

"Multiple wavefronts running on a CU are a great thing because as one wavefront is going out to load texture or other memory, the other wavefronts can happily do computation. It means your utilisation of vector ALU goes up," Cerny shares.

"One of the features appearing for the first time is the handling of 16-bit variables - it's possible to perform two 16-bit operations at a time instead of one 32-bit operation," he says, confirming what we learned during our visit to VooFoo Studios to check out Mantis Burn Racing. "In other words, at full floats, we have 4.2 teraflops. With half-floats, it's now double that, which is to say, 8.4 teraflops in 16-bit computation. This has the potential to radically increase performance."

I do think this will be high end for some time to come.
 

escrow4

Diamond Member
Feb 4, 2013
3,339
122
106
And yet they are still using a poky tablet CPU, even if it's clocked at 2.1GHz. Keep compatibility but bottleneck their own brand spanking new GPU. The irony.
 
May 11, 2008
20,055
1,290
126
Yep, I guess their idea is to let developers build new games that do as much computation as possible on the GPU, offloading the CPU.
I guess it is not possible to develop an 8-core CPU that has both a 100% Jaguar compatibility mode and a more advanced, modern 8-core x86-64 mode. It makes sense. The only way to do that would be a sort of big.LITTLE design with 8 Jaguar cores, 8 modern cores and all the fabric that goes along with it: a monster chip that would consume more power and cost more.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
If this is correct, it will be a serious game changer. This is also going to debut in the upcoming Vega.
In situations where there are a lot of these operations, performance should greatly increase.
2017 is going to be very interesting.
I hope Anandtech will then return with a nice detailed Vega review.

I do think this will be high end for some time to come.
Let's not get sidetracked from the substance. All of this is designed for efficiency: keeping the GPU fed as much of the time as possible, with increased throughput as well. Efficiency here might not mean that it suddenly uses less power, but it might mean higher performance. "Might" is the key word here. I would not expect mature drivers for this architecture for a very long time, nor changes to games from previous generations.

From what I see, people discussing this on the Beyond3D forum say that all of those changes are also focused on next-generation scheduling. There is a pretty interesting discussion there: https://forum.beyond3d.com/threads/...ors-and-discussion.59649/page-15#post-1949326

I want to see how Vega behaves in games.
 
May 11, 2008
20,055
1,290
126
Let's not get sidetracked from the substance. All of this is designed for efficiency: keeping the GPU fed as much of the time as possible, with increased throughput as well.

From what I see, people discussing this on the Beyond3D forum say that all of those changes are also focused on next-generation scheduling. There is a pretty interesting discussion there: https://forum.beyond3d.com/threads/...ors-and-discussion.59649/page-15#post-1949326

I want to see how Vega behaves in games.

Well, that is my point. The RX 480 has many more computation units and theoretically more processing power than a GTX 1060. But in reality it does not show a correspondingly large jump in performance, although the driver updates do seem to bring performance increases with every new release. I always get the feeling that games cannot make optimal use of Polaris's computation units. I wonder whether the CUDA cores in the GTX 1060 are able to do that 16+16 computation. I get the impression that that is the case. Not forgetting the 1.5x higher clocks, which also play a big part.

edit:
Thanks for the link. I am going to enjoy reading that.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
Well, that is my point. The RX 480 has many more computation units and theoretically more processing power than a GTX 1060. But in reality it does not show a correspondingly large jump in performance, although the driver updates do seem to bring performance increases with every new release. I always get the feeling that games cannot make optimal use of Polaris's computation units. I wonder whether the CUDA cores in the GTX 1060 are able to do that 16+16 computation. I get the impression that that is the case. Not forgetting the 1.5x higher clocks, which also play a big part.
It is not a matter of the Polaris design, but of the GCN design as a whole. Polaris eased a lot of the utilization problems in situations with high CPU overhead and underutilized CUs, but there is still a lot of room for improvement.

Vega, and what has been discussed, at least gives hope that this will be the first true next-generation GPU from any of the vendors (Polaris is just a slightly tuned Tonga on a 14 nm process, and Pascal is a slightly tuned Maxwell on a 16 nm process). If I have to use an analogy, Vega will be to its predecessor what Maxwell was to Kepler, or what Pascal GP100 was to Maxwell. Tuned, but much different.
 
May 11, 2008
20,055
1,290
126
I have to wait a bit longer and then I can buy the graphics card I desire: the Gigabyte G1 RX 480.
I hope to play Doom at Christmas, when I have a holiday and it is cold and dark outside.
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
I have to wait a bit longer and then I can buy the graphics card I desire: the Gigabyte G1 RX 480.
I hope to play Doom at Christmas, when I have a holiday and it is cold and dark outside.

What's so special about Gigabyte RX 480? For 480 SKUs, XFX GTR, Asus Strix and MSI Gaming are by far superior to Gigabyte or Sapphire's offerings. Asus Strix 480 is on another level entirely. The heatsink is right off the $700 Asus Strix 1080. The custom PCB's 6 power phases are dedicated entirely for the GPU core (i.e., the GPU core gets 0 power from the PCIe slot on the Strix).

---

Awesome write-up by Digital Foundry and great level of detail by Cerny. Unlike MS, which wants to abandon console generations, Cerny outright hints he wants a PS5 as a clean-slate design, even if it means breaking BC with PS4/Pro. I am looking forward to that. I hope we see PS5 with 7-8Tflops by 2020 at the latest, preferably by Fall 2019. Given how great The Order 1886, Uncharted 4, Driveclub and InFamous First Light look on a 1.8Tflop PS4, I am pretty excited for PS5 by the end of this decade. Horizon Zero Dawn looks incredible considering the rather mediocre low-end specs of PS4. I am not particularly excited about PS4 Pro since Sony will not allow any exclusives. Seems like a nice mid-cycle $399 console for those who haven't purchased the original PS4 yet. Still, the prospect of a 2019-2020 PS5 with 6-8 core Zen and a Navi+ era GPU with 16GB of HBM2 would provide for a great generational leap. It will be interesting to see if Sony chooses AMD again, given how open AMD appears to be to catering exactly to Sony's needs.
 

dogen1

Senior member
Oct 14, 2014
739
40
91
And yet they are still using a poky tablet CPU, even if it's clocked at 2.1GHz. Keep compatibility but bottleneck their own brand spanking new GPU. The irony.

The PS4 Pro will be playing the same games as PS4 at higher resolutions and or higher quality. It won't be any more CPU bottlenecked than the PS4, especially with a 30% clock boost.
 
May 11, 2008
20,055
1,290
126
What's so special about Gigabyte RX 480? For 480 SKUs, XFX GTR, Asus Strix and MSI Gaming are by far superior to Gigabyte or Sapphire's offerings. Asus Strix 480 is on another level entirely. The heatsink is right off the $700 Asus Strix 1080. The custom PCB's 6 power phases are dedicated entirely for the GPU core (i.e., the GPU core gets 0 power from the PCIe slot on the Strix).
Normally I am not a fan of ASUS, because in the past they bundled and forced the use of their own (in my opinion, crappy) software.
Anyway, I will keep that Strix in mind; maybe it is different now. The Strix is 30 euro more expensive than the G1, which is not a big issue. The bigger issue is that the Strix is really big (no pun intended), and I do not know if it will fit in my case. The MSI seems to consume a lot more power than the generic RX 480. I have not read enough about the XFX. I have had good experiences with Gigabyte, which is why it is generally my preferred brand, and the review of the card seems to be positive.
Also, since I only want to play at HD, I want to undervolt and downclock to the point where the GPU is as efficient as possible while still maintaining an acceptable frame rate. So a beast like the binned Strix versions would be overkill for me. 4K monitors are out of reach for me at the moment. For general use, I would rather have another 1080p screen.


Awesome write-up by Digital Foundry and great level of detail by Cerny. Unlike MS, which wants to abandon console generations, Cerny outright hints he wants a PS5 as a clean-slate design, even if it means breaking BC with PS4/Pro. I am looking forward to that. I hope we see PS5 with 7-8Tflops by 2020 at the latest, preferably by Fall 2019. Given how great The Order 1886, Uncharted 4, Driveclub and InFamous First Light look on a 1.8Tflop PS4, I am pretty excited for PS5 by the end of this decade. Horizon Zero Dawn looks incredible considering the rather mediocre low-end specs of PS4. I am not particularly excited about PS4 Pro since Sony will not allow any exclusives. Seems like a nice mid-cycle $399 console for those who haven't purchased the original PS4 yet. Still, the prospect of a 2019-2020 PS5 with 6-8 core Zen and a Navi+ era GPU with 16GB of HBM2 would provide for a great generational leap. It will be interesting to see if Sony chooses AMD again, given how open AMD appears to be to catering exactly to Sony's needs.

That kept me wondering too. I think Sony and AMD share patents to keep costs down. Sony is very good at finding the bottlenecks in GPU processing, and AMD seems able to accommodate every wish Sony has.
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136
So AMD screwed us, the PC gamers, by selling us a product inferior to the one they sold to the consoles AND delaying products superior to the consoles'. Thank you again, and thank you to all who recommended buying an RX 480 because it's the same GPU used in the PS4 Pro. It's not.
 

jpiniero

Lifer
Oct 1, 2010
14,841
5,456
136
So AMD screwed us, the PC gamers, by selling us a product inferior to the one they sold to the consoles AND delaying products superior to the consoles'. Thank you again, and thank you to all who recommended buying an RX 480 because it's the same GPU used in the PS4 Pro. It's not.

Polaris 10 probably supports 2xFP16, it's just set at 1X for the consumer cards. Games don't use it although if the consoles start using it, maybe they will?

BTW, Cerny's comments make me think that Sony is backtracking a little bit on games requiring compatibility with the PS4.
 

Bryf50

Golden Member
Nov 11, 2006
1,429
51
91
So AMD screwed us, the PC gamers, by selling us a product inferior to the one they sold to the consoles AND delaying products superior to the consoles'. Thank you again, and thank you to all who recommended buying an RX 480 because it's the same GPU used in the PS4 Pro. It's not.
lol? Who said this?
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136
I'm talking about the ID buffer... I'm not going to get into FP16 because, as you said, we can't be sure. They also delayed Vega, and that one would be better than what the PS4 Pro has. It's clear to me that AMD's agenda and priorities are now consoles first.

It's also probable that desktop Polaris will not benefit as much from console optimizations under DX12 because of all this.
 

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
So AMD screwed us, the PC gamers, by selling us a product inferior to the one they sold to the consoles AND delaying products superior to the consoles'. Thank you again, and thank you to all who recommended buying an RX 480 because it's the same GPU used in the PS4 Pro. It's not.
AMD didn't screw over anyone; this is an absurd statement.
 

brandonmatic

Member
Jul 13, 2013
199
21
81
http://www.eurogamer.net/articles/d...tation-4-pro-how-sony-made-a-4k-games-machine

Super interesting article. It seems the PS4 Pro GPU has some features not available on current Polaris cards, and some features are mentioned as not reaching retail until Vega hits.

Great article, thanks for posting it. It's really exciting to see GPU development pushed forward through collaboration between companies like AMD and Sony. The article makes it sound like the ID buffer approach is based on Sony developed technology though. I wonder if that will make its way over to AMD GPUs.

Edit: Also, the PS4 Pro may be the first good reason to buy a 4K TV.
 