[videocardz] AMD Radeon R9 490X and R9 490 launch in June, Pro Duo launches in April

Feb 19, 2009
10,457
10
76
The source in the post above claims that the power gating tech from the patent is in Polaris and has been detected in the drivers. Given that, I would have to assume that the other tech mentioned in the same paragraph would be turbo boosting of SIMDs/ALUs.

Going with a base clock + boost is a much better idea for perf/w, as running a higher base clock all the time is wasteful when the ALUs are not 100% utilized.
 

Adored

Senior member
Mar 24, 2016
256
1
16
It seems reasonable to assume that AMD will target the perfect perf/Watt curve including bandwidth constraints of GDDR5, so 135W is it. Nvidia likely feels the need to win on raw performance at any power cost (dual 8-pins a reality? I doubt it), so even the 1070 might be slightly faster than Polaris but at 50W higher power. I can't see any other likely result unless one of them has really dropped the ball.
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
Hm... I am really hoping they incorporated the use of various width vector ALUs. Clock gating at the CU level is an improvement, but I think the granularity could be better... I hope AMD went down that route as well.
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,151
136
It seems reasonable to assume that AMD will target the perfect perf/Watt curve including bandwidth constraints of GDDR5, so 135W is it. Nvidia likely feels the need to win on raw performance at any power cost (dual 8-pins a reality? I doubt it), so even the 1070 might be slightly faster than Polaris but at 50W higher power. I can't see any other likely result unless one of them has really dropped the ball.

Well, that would be an interesting point of debate, considering perf/W was put on a higher pedestal last generation or two. *readies popcorn*
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
You want to cite some examples of this in recent amd history?

Every GCN product with a clock speed over 900 MHz is effectively overclocked. Going from 900->1000 MHz provides marginal increases in performance, but dramatically increases power consumption. Going from 1000->1050 MHz reduces efficiency even more, for even smaller benefits.
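To put rough numbers on that kind of argument, here is a toy sketch using the textbook approximation that dynamic power scales with C * V^2 * f. The voltage/frequency points are made up for illustration and are not measurements of any GCN card; `dynamic_power`, `relative_to_base` and `POINTS` are hypothetical names. It also assumes performance scales linearly with clock, which is the best case.

```python
# Toy model only: dynamic power ~ C * V^2 * f (leakage is ignored).
def dynamic_power(freq_mhz, volts, c_eff=1.0):
    """Relative dynamic power for a given clock and core voltage."""
    return c_eff * volts ** 2 * freq_mhz

# Hypothetical (clock MHz, core voltage) points, NOT measured GCN values.
POINTS = [(900, 1.00), (1000, 1.10), (1050, 1.175)]

def relative_to_base(points):
    """Return (clock, perf, power, perf/W) relative to the first point."""
    base_f, base_v = points[0]
    base_p = dynamic_power(base_f, base_v)
    rows = []
    for f, v in points:
        perf = f / base_f                      # best case: perf scales with clock
        power = dynamic_power(f, v) / base_p   # relative dynamic power
        rows.append((f, perf, power, perf / power))
    return rows

for f, perf, power, eff in relative_to_base(POINTS):
    print(f"{f} MHz: perf x{perf:.2f}, power x{power:.2f}, perf/W x{eff:.2f}")
```

In this model perf/W depends only on how much extra voltage each clock step needs, so whether a given card is past its sweet spot comes down to its real voltage/frequency curve, which this sketch does not know.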
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,362
5,028
136
Every GCN product with a clock speed over 900 MHz is effectively overclocked. Going from 900->1000 MHz provides marginal increases in performance, but dramatically increases power consumption. Going from 1000->1050 MHz reduces efficiency even more, for even smaller benefits.

The sweet spot will be different for FinFET GPUs. Should be able to clock higher for an optimized TDP.
 

IllogicalGlory

Senior member
Mar 8, 2013
934
346
136
Every GCN product with a clock speed over 900 MHz is effectively overclocked. Going from 900->1000 MHz provides marginal increases in performance, but dramatically increases power consumption. Going from 1000->1050 MHz reduces efficiency even more, for even smaller benefits.
I thought something like this, but what about the 7870? That was clocked at 1000MHz out of the box and its efficiency was amongst the best compared to other GCN cards, very close to the 7850, which was clocked at 860MHz.

 
Feb 19, 2009
10,457
10
76
I thought something like this, but what about the 7870? That was clocked at 1000MHz out of the box and its efficiency was amongst the best compared to other GCN cards, very close to the 7850, which was clocked at 860MHz.

Yes. It's more to do with transistor density.

Finfet doesn't suffer such a rapid increase in leakage as density increases, so it should reward a denser approach on perf/mm2 without the perf/w penalty.
 

Slaughterem

Member
Mar 21, 2016
77
23
51
Google translate:
One of our resourceful users, Bomby, has found some interesting information about AMD's upcoming Polaris graphics cards. On the mailing list for AMD's Linux graphics driver, an employee has submitted a patch that offers the first support for the Polaris family. It contains some interesting details.

First, it is confirmed that the next AMD graphics cards can display 16 bits per color channel, so the Polaris graphics cards may also be suitable for demanding applications. There is also information on the memory. For Ellesmere, 8 channels and 8 banks of 32 bits are specified; Baffin is listed with only 4 channels, but can also be fitted with 8 banks. The specifications show that AMD will initially continue to rely on GDDR5 memory. Four channels logically yield a 256-bit memory interface, eight channels 512 bits. In both cases the maximum memory configuration is 8 GiB of GDDR5. The maximum clock is given as a very high 6000 MHz in each case. This corresponds to the effective clock speed of the GDDR5 memory and is already seen in models like AMD's Radeon R9 380 or the GeForce GTX 960. Whether the currently specified maximum clock speed of 1154 MHz, which applies to both chips, will carry over to series production remains to be seen. Early engineering samples (prototypes) usually run at different clock rates than series production.
Furthermore, Baffin can probably drive up to 5 displays and Ellesmere up to 6. In another post, Alex Deucher writes about a new power-saving mechanism that will initially be available only on Baffin. It looks like AMD has power gating for individual CUs (Compute Units), which bundle multiple shaders, in the new GCN generation. This feature is likely to find friends especially in the mobile segment: adjusting the number of active compute clusters allows good control of energy consumption and extends battery life.
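For reference, the peak bandwidth implied by those figures can be worked out with simple arithmetic. This is a minimal sketch: the bus widths and the 6000 MHz effective clock are taken as quoted in the translated article above, and `peak_bandwidth_gb_s` is a hypothetical helper name.

```python
def peak_bandwidth_gb_s(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s.

    Assumes one bit per data pin per effective clock, which is how
    "effective" GDDR5 clocks are normally quoted.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = effective_mhz * 1e6
    return bytes_per_transfer * transfers_per_second / 1e9

# Bus widths as stated in the translated article, at 6000 MHz effective.
for width in (256, 512):
    print(f"{width}-bit @ 6000 MHz: {peak_bandwidth_gb_s(width, 6000):.0f} GB/s")
```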
If you would like to read up on the power patent, it refers to dynamic as compared to static control of SIMDs.
http://patents.justia.com/patent/9311102
Dynamic Medium Grain Clock Gating

As discussed above, in conventional approaches, clocking of all SIMD units in a shader complex is either enabled or disabled simultaneously. In many applications, not all SIMDs are assigned work. However, conventional approaches continue to actively provide clocking signals to such SIMDs. This approach increases power consumption of a graphics processing unit and is inefficient. Conventional approaches can include static clock gating for shader complex blocks in which, when a request is initiated by a SPI, clocks of shader complex blocks are turned-on, one by one, with a di/dt (i.e., rate of change of current) avoidance count delay. Once started, the clocks keep clocking for the entire shader complex even if there is no work for many blocks inside the shader complex. In other words, only a few SIMDs are active at any given time. Once work is completed by the shader complex, the clocks are shut-off automatically using the di/dt avoidance count delay. Thus, in conventional approaches, clock gating is static in nature, and treats the shader complex as a single system.

In contrast to conventional approaches, embodiments of the invention achieve dynamic grain (e.g., dynamic medium grain) clock gating of individual SIMDs in a shader complex. Switching power is reduced by shutting down clock trees to unused logic, and by providing a clock on demand mechanism (e.g., a true clock on demand mechanism). In this way, clock gating can be enhanced to save switching power for a duration of time when SIMDs are idle (or assigned no work).

Embodiments of the present invention also include dynamic control of clocks to each SIMD in a shader complex. Each SIMD is treated as shader complex sub-system that manages its own clocks. Dynamic control for each block/tile in an SIMD is also provided. Clocking can start before actual work arrives at SIMDs and can stay enabled until all the work has been completed by the SIMDs.

Dynamic medium grain clock gating, according to the embodiments, causes negligible performance impact to the graphics processing unit. Embodiments of the present invention can also be used to control power of SIMDs by power gating switches and thus save leakage power of SIMDs.
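The difference between the two schemes described in the patent text can be sketched with a toy count of how many SIMD-cycles receive a clock. This is only an illustration under simplifying assumptions: it ignores the di/dt ramp delays and the early clock start the patent mentions, and `clocked_simd_cycles` and the example trace are hypothetical.

```python
def clocked_simd_cycles(busy, num_simds, dynamic):
    """Count SIMD-cycles that receive a clock.

    busy: list with one set per cycle, holding the indices of SIMDs
          that have work in that cycle.
    dynamic=False: static gating, the whole shader complex is clocked
                   whenever any SIMD has work.
    dynamic=True:  per-SIMD gating, only SIMDs with work are clocked.
    """
    total = 0
    for active in busy:
        if dynamic:
            total += len(active)
        elif active:
            total += num_simds
    return total

# Hypothetical 4-cycle trace on a 4-SIMD complex.
trace = [{0}, {0, 1}, set(), {2}]
print("static :", clocked_simd_cycles(trace, 4, dynamic=False))
print("dynamic:", clocked_simd_cycles(trace, 4, dynamic=True))
```

The gap between the two counts is the switching activity that per-SIMD gating would avoid in this toy trace; real savings depend on how unevenly work is actually distributed.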
 
Feb 19, 2009
10,457
10
76
If you would like to read up on the power patent, it refers to dynamic as compared to static control of SIMDs.
http://patents.justia.com/patent/9311102

Yeah, nice find.

Each CU in GCN is made up of 4x SIMDs (each with 16x ALU/SP). With the ability to dynamically clock individual SIMDs (and the wording even mentions that control of individual blocks within a SIMD is possible, suggesting per-ALU/SP clock gating), there's some nice potential for power savings and, combined with boosting clocks, for a perf increase.
 

Slaughterem

Member
Mar 21, 2016
77
23
51
Yeah, nice find.

Each CU in GCN is made up of 4x SIMDs (each with 16x ALU/SP). With the ability to dynamically clock individual SIMDs (and the wording even mentions that control of individual blocks within a SIMD is possible, suggesting per-ALU/SP clock gating), there's some nice potential for power savings and, combined with boosting clocks, for a perf increase.
For laptops this will reduce power for video playback, web surfing, gaming or any other application. And you are correct, it states the following about boost:
A. Condition Based Control

Dynamic control of SIMDs can be condition dependent. Such exemplary conditions include, but are not limited to:

(1) Temperature Trip: When external sources indicate a higher processor temperature and there is a need for reduction in power consumption (or boost when applicable).

(2) Current Trip: When external sources indicate a higher processor current and there is a need for reduction in power consumption (or boost when applicable).

(3) CAC Management: When an on-chip CAC manager notices increased processing activity and makes a decision to increase performance by enabling more SIMDs or when the on-chip CAC manager notices decreased activity and makes a decision to reduce power by disabling a number of SIMDs without reduction in performance.
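The three conditions above could be sketched as a simple per-interval decision. This is a hedged illustration, not the patent's actual logic: the thresholds, the one-SIMD-at-a-time step, and the name `next_enabled_simds` are all assumptions, and the "boost when applicable" branch of the trip conditions is not modeled.

```python
# Hypothetical activity thresholds (fraction of enabled SIMDs that are busy).
HIGH_ACTIVITY = 0.9
LOW_ACTIVITY = 0.5

def next_enabled_simds(enabled, total, temp_trip=False,
                       current_trip=False, activity=0.0):
    """Return how many SIMDs to keep enabled for the next interval."""
    # (1) and (2): a temperature or current trip asks for less power.
    if temp_trip or current_trip:
        return max(1, enabled - 1)
    # (3) CAC management: enable more SIMDs when activity is high...
    if activity > HIGH_ACTIVITY:
        return min(total, enabled + 1)
    # ...and disable some when activity is low.
    if activity < LOW_ACTIVITY:
        return max(1, enabled - 1)
    return enabled

print(next_enabled_simds(4, 4, temp_trip=True))   # trip: shed one SIMD
print(next_enabled_simds(2, 4, activity=0.95))    # busy: enable one more
print(next_enabled_simds(3, 4, activity=0.2))     # idle: disable one
```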
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Every GCN product with a clock speed over 900 MHz is effectively overclocked. Going from 900->1000 MHz provides marginal increases in performance, but dramatically increases power consumption. Going from 1000->1050 MHz reduces efficiency even more, for even smaller benefits.

Stop making baseless claims, please? It completely sidetracks any useful discussion.
 

flopper

Senior member
Dec 16, 2005
739
19
76
Yeah, nice find.

Each CU in GCN is made up of 4x SIMDs (each with 16x ALU/SP). With the ability to dynamically clock individual SIMDs (and the wording even mentions that control of individual blocks within a SIMD is possible, suggesting per-ALU/SP clock gating), there's some nice potential for power savings and, combined with boosting clocks, for a perf increase.

Best of both worlds
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
Yea, the same way gtx970 3,5GB comes with 0,5GB additional VRAM - nonfunctional.
What are you supposed to do with it?

I personally see no use for it but folks over at overclock.net were all enthused about owning that nonfunctional GPU.
 

MrTeal

Diamond Member
Dec 7, 2003
3,586
1,746
136
Well, if you needed any further evidence that Fiji isn't selling well, there it is.
 

R0H1T

Platinum Member
Jan 12, 2013
2,582
162
106
I personally see no use for it but folks over at overclock.net were all enthused about owning that nonfunctional GPU.
That's pretty cool actually, would definitely buy the spare (defunct?) GPU for 10$ plus shipping

That stuff is "collector's edition" & who knows it might just as well sell for 1000$ two decades from now :biggrin:
 