Overclock your uncore. :)


Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: TC91
would ocing the fsb with a lower multiplier have a similar but much less significant effect on a core 2 system?

You can see the impact of increasing FSB and/or increasing the memory bandwidth on this benchmark for the core 2 quad here:

http://i272.photobucket.com/al.../BandwidthAnalysis.jpg

Look near the bottom, the row labeled "Metatrader 4 Backtest (s)". Basically zero improvement from increasing FSB or memory bandwidth or both.

(note these benches are run with a 4GHz Kentsfield...as fast as these cores are running and the FSB still doesn't hold them back, and it's Kentsfield so the L2$ is smaller than Yorkfield's, and still not FSB dependent)

Thanks to the massive L2$ on C2D and C2Q products combined with the aggressive prefetchers, very few real desktop applications actually experience a noteworthy benefit from increasing the FSB and memory bandwidth. On an L2$ deprived nehalem we'd expect the L3$ speed (bandwidth) to play a much larger role in system performance, likewise with the larger (but still smaller than C2D) L2$ phenom/phenom II.
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
And note that even with the i7 architecture in gaming applications you really don't see much advantage with a single-GPU setup at all. In the conclusion in that review I linked above they commented that basically any Core 2 Duo/Quad was adequate for any single video card setup for today's games. It wasn't until you got to the level of two GTX 260/4870 cards in SLI/CF mode that you saw the i7 system pull ahead significantly - but when it did, it was convincing - and it even happened like this in games that don't really show a need for quad-core processors.

EDIT: I wonder if multiple GPUs would run F@H better on i7 than on other processors?
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,001
126
Originally posted by: TC91
would ocing the fsb with a lower multiplier have a similar but much less significant effect on a core 2 system?

Along the lines of what IDC said, while testing for IDC I had my memory at different speeds as I tweaked and tested the uncore. I saw no increase in performance from 800MHz memory when I had the memory all the way up to 880MHz. Nor did I see any loss in performance from the 800MHz speed memory when I went down to 732MHz. So bandwidth appears not to be an issue.

I did not test timings though.
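
To put the size of that memory sweep in numbers, here is a minimal sketch that works out how far each tested speed sits from the 800MHz baseline. The three speeds are the ones quoted in the post; the variable and function names are just illustrative.

```python
# Memory speeds quoted in the post above, in MHz; 800 is the baseline.
BASE_MHZ = 800
tested_mhz = [732, 800, 880]


def pct_vs_base(mhz, base=BASE_MHZ):
    """Percent change of a memory clock relative to the baseline clock."""
    return (mhz - base) / base * 100.0


# Percent offset of each tested speed from the 800MHz baseline.
deltas = {mhz: pct_vs_base(mhz) for mhz in tested_mhz}

# Total width of the sweep, expressed as a percentage of the baseline.
span_pct = pct_vs_base(max(tested_mhz)) - pct_vs_base(min(tested_mhz))

for mhz, delta in deltas.items():
    print(f"{mhz} MHz: {delta:+.1f}% vs {BASE_MHZ} MHz")
print(f"sweep width: {span_pct:.1f}% of baseline")
```

So the sweep covered roughly 8.5% below to 10% above the baseline clock, and the benchmark result did not move in either direction. Timings were not varied, so this says nothing about latency sensitivity.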

To satisfy my own curiosity I'd like to see a Propus CPU run this program. It seems the PhenomII's greater L3 cache size didn't help it at all, and the Athlon x4 has the same 512KB.
 

Rhoxed

Golden Member
Jun 23, 2007
1,051
3
81
Where is this bench located? i would love to help out any with results, both phI and phII
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Rhoxed
Where is this bench located? i would love to help out any with results, both phI and phII

It's available if you'd like to try it, but be forewarned it is not a "push the button and sit back" type benchmark. It is quite a pain in the ass process, get in touch with slowspyder thru pm to find out how much work is involved, if you are still interested after that then pm me and I'll gladly get you setup.
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Speculative question: what holds back the clock speeds on Phenom II NBs? I rarely if ever hear of NB speeds higher than 2700 mhz. One wonders if Propus will be any different.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: DrMrLordX
Speculative question: what holds back the clock speeds on Phenom II NBs? I rarely if ever hear of NB speeds higher than 2700 mhz. One wonders if Propus will be any different.

It is interesting isn't it? The uncore clockspeed is so much lower than the core clockspeed, even when folks do their best to overclock it, that it really seems like the uncore was designed (as in layout and architecture) with intent to NOT operate at high clockspeed.

Meaning it's not that the uncore is underclocked at binning so as to enable more sellable chips, but rather it really does seem that the uncore clockspeed is truly targeted in silicon for the lowered clockspeed.

Until recently I was under the assumption that AMD (and Intel) would love to yield chips in which the uncore clocks the same as the core, but that they get so many chips which only function if the uncore is clocked as low as it is in the retail chips that we end up with this low uncore. But this assumption really does appear to be in error, because no one ever reports an uncore that clocks as well as the core, not even your random one-off excursion chip.
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Indeed. I'm wondering if it's something to do with the L3, but I could be wrong. Propus could answer some questions in that regard. As we all know, the memory controller on K8 operated in synch with the core(s), even at speeds well in excess of 3 ghz. So while K10 has had obvious and measurable increases in IPC over K8, all incarnations of K10 suffer from slower clock speeds on their memory controllers. If Propus can change that, it will be interesting to see how a Propus with a NB speed synched with its cores would stack up versus a Deneb with a 2.4-2.7 ghz NB speed with both CPUs running at core clock speeds of 3.5 ghz or higher. Or, to put a finer point on it, how fast would the NB/IMC have to be running to make up for the absence of L3 cache at core clock speeds in excess of current NB limitations on Phenom IIs?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: DrMrLordX
Indeed. I'm wondering if it's something to do with the L3, but I could be wrong. Propus could answer some questions in that regard. As we all know, the memory controller on K8 operated in synch with the core(s), even at speeds well in excess of 3 ghz. So while K10 has had obvious and measurable increases in IPC over K8, all incarnations of K10 suffer from slower clock speeds on their memory controllers. If Propus can change that, it will be interesting to see how a Propus with a NB speed synched with its cores would stack up versus a Deneb with a 2.4-2.7 ghz NB speed with both CPUs running at core clock speeds of 3.5 ghz or higher. Or, to put a finer point on it, how fast would the NB/IMC have to be running to make up for the absence of L3 cache at core clock speeds in excess of current NB limitations on Phenom IIs?

Spot-on analysis per your usual astute self. I had lost sight of the fact the K8-based X2's had IMC's that operated at the core clockspeed and the core clocks were 3GHz and higher with OC'ers.

I agree, Propus holds the key to expanding our understanding of the uncore clockspeed situation.

Does anyone recall if the uncore was overclocked above 3GHz territory in any of those WR suicide bench sessions with LN2 and LHe back before PhII was officially launched? If SlowSpyder is right about the uncore clockspeed seriously hampering the IPC of PhII as core clockspeed approaches 4GHz then those guys were leaving a LOT of performance on the table when shooting for WR's at 6GHz+ core clockspeeds.
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Originally posted by: Idontcare

Spot-on analysis per your usual astute self. I had lost sight of the fact the K8-based X2's had IMC's that operated at the core clockspeed and the core clocks were 3GHz and higher with OC'ers.

Thank you sirrah

I agree, Propus holds the key to expanding our understanding of the uncore clockspeed situation.

The one thing I fear about Propus is that the NB may represent some sort of ubercontroller or peer chip that is responsible for handling communications between the IMC, cores, and L3. If that's the beast that has problems with high clock speeds then AMD's entire design is hobbled.

Does anyone recall if the uncore was overclocked above 3GHz territory in any of those WR suicide bench sessions with LN2 and LHe back before PhII was officially launched? If SlowSpyder is right about the uncore clockspeed seriously hampering the IPC of PhII as core clockspeed approaches 4GHz then those guys were leaving a LOT of performance on the table when shooting for WR's at 6GHz+ core clockspeeds.

If we can find one of the articles reporting on AMD's public LN2 and LHe demos, we should get our answer. I recall NB speeds of around 3.2 ghz but I could be wrong.

And you're right, that is a lot of performance left on the table . . .
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Found the benchmarking I referenced previously.

Thread.

The min fps in FC2 is the most interesting to me: linky.

From the bottom (1 at bottom, 6 at top):
1,2,3: Overclocking GPU has much more impact than OCing CPU
2,3,5: Overclocking CPU + uncore has more impact than OCing GPU (for tripleCF)

What I find fascinating is the fact that the minimum fps for a single card is virtually flat no matter what you do - OC GPU and/or CPU to no real effect. However, overclocking the uncore gives a nice boost in performance when you add that third card to the mix (also you get almost no benefit from the third card if you run all stock or OC just the CPU without also bumping up the uncore).

This tells me the speed of the IMC has a major impact on managing data flow to multiple cards.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Denithor
This tells me the speed of the IMC has a major impact on managing data flow to multiple cards.

For AMD could it be more the HT aspects of the integrated NB (of which the IMC is a component) which communicates with the PCIe cards?

I ask about this distinction as this is one aspect that overclocking the uncore on an i7 would probably not boost as overclocking the uncore is not the same as overclocking QPI or the part of the bus that communicates with the PCIe slots (the parts of the NB that are still in the discrete x58 NB chipset).

I'm not trying to call shens on your assessment, I find myself woefully uneducated about which part of these chips does what exactly so I'm really asking questions here based on what little I think I know at the understood risk of betraying my sizable ignorance on just about all things unrelated to the CPU core

edit: changed my word discreet to discrete to avoid the obligatory wet-fish splappage
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Could be, I assumed the "uncore" portion of both architectures contained the same basic components? Or is that incorrect?

Either way it looks like something in the uncore helps multiGPU setups perform much more efficiently than they do on the older architectures.

Now, I wonder how these results compare to the PhI chips? As noted above, they have an uncore that runs at the same speed as the core (3GHz or higher possible). Even though those chips are hamstrung by insufficient L3$ could they perhaps show even better multiGPU scaling?
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Actually Phenom I chips have an asynchronous NB speed (default on Phenom Is were 1.8 ghz or 2 ghz depending on stepping/model I think). K8s were the last chips AMD produced with memory controllers synchronized with the core speed.
 

Flipped Gazelle

Diamond Member
Sep 5, 2004
6,666
3
81
I ran Cinebench r10's multi-CPU test, and it provides us a little more data to back up SlowSpyder's discovery:

PhII X4 @ 3.4ghz, HT/NB @ 2080 - 13588 cb-cpu
HT @ 2080, NB @ 2340 - 13796
HT/NB @ 2340 - 14086
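
For anyone who wants the scaling in percent, here is a rough sketch that turns the three Cinebench scores above into gains over the HT/NB @ 2080 baseline and compares them with the size of the NB clock bump. The scores and clocks are copied from the post; the helper name and data layout are only for illustration.

```python
# (label, HT MHz, NB MHz, Cinebench R10 multi-CPU score) as posted above,
# all with the PhII X4 cores at 3.4GHz.
results = [
    ("HT/NB @ 2080", 2080, 2080, 13588),
    ("HT @ 2080, NB @ 2340", 2080, 2340, 13796),
    ("HT/NB @ 2340", 2340, 2340, 14086),
]


def pct_gain(value, baseline):
    """Percent increase of value over baseline."""
    return (value / baseline - 1.0) * 100.0


baseline_score = results[0][3]

# Score gain of every configuration relative to the first (stock HT/NB) row.
score_gains = [(label, pct_gain(score, baseline_score))
               for label, _ht, _nb, score in results]

# Size of the NB clock increase itself, for comparison.
nb_clock_gain = pct_gain(2340, 2080)

for label, gain in score_gains:
    print(f"{label}: {gain:+.2f}% score vs baseline")
print(f"NB clock increase: {nb_clock_gain:+.1f}%")
```

On these numbers a 12.5% NB bump alone is worth about 1.5% in Cinebench, and raising HT along with it brings the total to roughly 3.7%. That is three single runs, so treat the split between HT and NB as indicative rather than precise.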
 

ultimahwhat

Member
Aug 13, 2008
166
0
71
n00b question: what's a good way to stress test for NB/HT/uncore OC on Phenom II? Will small FFTs work?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: ultimahwhat
n00b question: what's a good way to stress test for NB/HT/uncore OC on Phenom II? Will small FFTs work?

Large FFT.

Small FFT is really only for stressing the core and L1/L2 caches.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,001
126
Just figured I'd give this a quick bump since AT tested uncore overclocking in their 720 BE review.

Bottom of page. By overclocking the uncore on the 720 with a single 4870 they got minor gains, but in Crossfire where the CPU has more work to do the uncore gained 9% in minimum frame rates and 11% in average frame rates. Not earth shattering, but not garbage gains either.

I just figured this was one more scenario where it shows that as your core speed increases (they were running a healthy 3.8GHz on the 720) pushing the uncore a bit can certainly give you a bit more benefit than just core speed alone.
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Very interesting testing results there. Wish they had looked at similar scaling for the i7 920 though - crank up that QPI and see how much impact it has.

FC2 especially seems sensitive to the bandwidth afforded by i7 - check out how the stock-clocked i7 920 + CF put the smack-down on all the other processors heavily overclocked.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: heyheybooboo
Are we supposed to call the IMC/NB the 'uncore' now ?

It's not a "supposed to" or "have to" type situation.

It's just that if you refer to the uncore of either AMD or Intel chips very few people will have trouble understanding what you are referring to and why.

It also helps that for both Intel and AMD, the uncore region of their latest chips does operate in its own clock domain.

So referring to things like the "clockspeed of the uncore" means quite practically the same thing regardless of whether you are talking about Nehalem or Phenom/PhenomII.

It's just a convenience thing for practical communication. Expect your more diehard AMD fans to resist the notion of calling something AMD had released first by Intel's nomenclature.

Personally I am not interested in the debate on the semantics, talk uncore and we ALL know what you are attempting to communicate. If a person can't be bothered to engage the discussion for simple fact that the terminology of uncore has been applied to AMD chips then that poster has likely done the rest of the thread a favor.
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Out of curiosity, are there many production AM2+ or AM3 boards out there that support disabling L3 cache on Phenom IIs? I asked about this over a year ago when Phenom Is were struggling with the TLB bug and reasoned that the bug could be end-run by disabling the chip's L3, and found out that very few boards supported disabling L3 cache.
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,618
136
Originally posted by: Idontcare

Does anyone recall if the uncore was overclocked above 3GHz territory in any of those WR suicide bench sessions with LN2 and LHe back before PhII was officially launched? If SlowSpyder is right about the uncore clockspeed seriously hampering the IPC of PhII as core clockspeed approaches 4GHz then those guys were leaving a LOT of performance on the table when shooting for WR's at 6GHz+ core clockspeeds.

I actually found an article that answers your question (sure took me long enough):

http://www.tgdaily.com/content/view/41204/113/

According to this article, they finally settled on a NB speed of 3.920 ghz with an NB voltage of 1.425v.

The ratio of core speed to NB speed is pretty ridiculous and hints at poor scaling in memory/cache intensive apps.
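
A quick sketch of that core-to-NB ratio next to the everyday overclock posted earlier in the thread. The 3.920 ghz NB figure is from the article as quoted above and the 3.4 ghz / 2080 / 2340 figures are from the Cinebench post; the 6.0 ghz core clock is an ASSUMPTION taken from the "6GHz+" wording in this thread, not a number from the article, so substitute the real core clock if you have it.

```python
def core_to_nb_ratio(core_ghz, nb_ghz):
    """How many core clocks tick per NB (uncore) clock."""
    if nb_ghz <= 0:
        raise ValueError("NB clock must be positive")
    return core_ghz / nb_ghz


# name -> (core clock in GHz, NB clock in GHz)
configs = {
    # ASSUMED core clock: the thread only says "6GHz+".
    "LHe record run (core assumed 6.0)": (6.0, 3.920),
    # From the Cinebench post earlier in the thread.
    "PhII X4 3.4, NB 2080": (3.4, 2.080),
    "PhII X4 3.4, NB 2340": (3.4, 2.340),
}

ratios = {name: core_to_nb_ratio(core, nb)
          for name, (core, nb) in configs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} core clocks per NB clock")
```

The higher the ratio, the more core cycles go by for every uncore cycle, which is where the worry about memory/cache intensive apps comes from. Any core clock above the assumed 6.0 ghz pushes the first ratio up further.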

The memory speeds also left something to be desired (DDR2-1120 5-5-5-18? Come on guys! I did DDR2-1100 5-5-5-15 stable on an Abit nf-m2 nview using DDR2-800 modules).
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
:thumbsup: Awesome find!

<4GHz on Liquid helium, definitely says she's not designed or targeted to operate at core clockspeeds even when temp/thermal barriers are removed.
 