SemiAccurate has a pretty in-depth article about Haswell's graphics:
http://semiaccurate.com/2012/04/02/haswells-gpu-prowess-is-due-to-crystalwell/
"We are told the GT3 variants of Haswell will have 64MB of on-package memory connected through an ultra-wide bus."
Will that really be enough? Discrete GFX cards have like 1-3 GB of RAM!
Seems like the worst-case scenario will be really bad with only 64 MB. You'll sometimes have to swap out the contents of that 64 MB memory and repopulate it from normal system RAM, i.e. you fall back to the same memory bandwidth as today with Sandy/Ivy Bridge? So the worst case won't be any better than with those CPUs?
So you might get very high performance for a while when gaming, and then from time to time the performance drops radically when the 64 MB cache has to be swapped out? So the rendering will stutter?
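Just to put a number on that worst case, here is a rough sketch of how long a full refill of the 64 MB would take, assuming a dual-channel DDR3-1600 system at ~25.6 GB/s peak (my assumption, not a figure from the article):

# Rough estimate: time to refill the whole 64 MB from system RAM (assumed numbers).
cache_bytes = 64 * 1024**2      # 64 MB on-package memory
dram_bw = 25.6e9                # assumed peak system-RAM bandwidth, bytes/s
refill_ms = cache_bytes / dram_bw * 1e3
print(f"full refill: ~{refill_ms:.1f} ms")   # ~2.6 ms, vs. a 16.7 ms frame at 60 fps

In practice the cache would presumably be refilled incrementally rather than all at once, so whether that actually shows up as stutter is an open question.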
"In the end, the massive bandwidth, coupled with the 5x increase in shader performance, will mean Haswell is a real graphics monster."
Can that really be true? 5x performance increase compared to Ivy Bridge HD4000 IGP?
The VR-Zone article estimates a 2-3x increase instead (see: http://vr-zone.com/articles/mystery...up-the-graphics-ante-further-again/15272.html).
Are either realistic? :hmm:
"We are told the GT3 variants of Haswell will have 64MB of on-package memory connected through an ultra-wide bus."
Will that really be enough? Discrete GFX cards have like 1-3 GB of RAM!
Exactly: that's RAM, not cache. Cache is a lot faster.
If it's used like the cache on the Xbox 360 (20MB), it could really help.
The Xbox 360 eDRAM is 10MB, no? Have to remember that it was originally built on 90nm, and at that node it was a pretty fair-sized die.
Regardless, the smaller size just results in it being unable to store all game textures... but textures aren't the only source of bandwidth consumption. Unfortunately it's annoying to find current figures for the various sources of bandwidth consumption; texturing used to account for around 75%. Even at high resolutions, Z, color, and render target buffers should fit within a 64MB eDRAM. If those buffers still account for a fair amount of bandwidth, then removing them and having only textures in main memory could result in a marked difference.
So, if 10MB is good for ~640P, what resolution will 64MB be good for? 1080P? I am guessing it is not a linear scale?
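A rough back-of-envelope on how buffer size scales with resolution, using an assumed 32-bit color plus 32-bit depth format with no MSAA (actual games vary):

# Color + Z buffer footprint vs. resolution (assumed 4-byte color + 4-byte depth, no MSAA).
for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080), "2560x1600": (2560, 1600)}.items():
    size_mb = w * h * (4 + 4) / 1024**2
    print(f"{name}: {size_mb:.1f} MB")   # ~7 MB, ~16 MB, ~31 MB

So the footprint scales linearly with pixel count, and a plain color+Z setup would fit in 64MB well past 1080p; MSAA and extra render targets multiply these numbers.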
"But GPUs have L1 and L2 caches now too..."
It's about the workloads. The typical CPU is interested in low-latency accesses to a small subset of memory, and is thus well served by a good cache hierarchy. The SNB cache system has a total hit rate well in excess of 95%, which means you get some 20 times more realized bandwidth than what your memory provides.
The typical GPU workload consists of rapidly streaming through large data sets. This is essentially uncacheable, as accessing an item of memory makes it the least likely one to be accessed again in the near future. So what you want is just raw bandwidth.
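To make that "20 times" figure concrete, here is a small sketch of the arithmetic; the 95% hit rate is from the post above, while the DRAM bandwidth is just an assumed example:

# Realized bandwidth amplification from a cache hit rate (illustrative numbers only).
hit_rate = 0.95          # total hit rate of the cache hierarchy, as claimed above
dram_bw_gbs = 21.0       # assumed dual-channel DDR3-1333 peak, GB/s
# Only misses go to DRAM, so the core-side request stream can be ~1/(1 - hit_rate)
# larger than what DRAM itself delivers (ignoring latency and write traffic).
amplification = 1.0 / (1.0 - hit_rate)
print(f"~{amplification:.0f}x, i.e. ~{dram_bw_gbs * amplification:.0f} GB/s realized")   # 20x, ~420 GB/s

A streaming workload with a near-zero hit rate gets no such amplification, which is the point being made about GPU workloads.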
I was under the impression that the caches on GPUs today were more for HPC tasks than for graphics tasks. I could be wrong, however.
"Can that really be true? 5x performance increase compared to Ivy Bridge HD4000 IGP? The VR-Zone article estimates a 2-3x increase instead."
errr... in some older leak, SemiAccurate said it was 5x Sandy Bridge, which fits the VR-Zone performance numbers very well.
How much would something like this cost? 64MB on-package memory, L4 cache... I'm afraid to ask.
It wouldn't cost too much if they don't use traditional SRAM technology. Remember that POWER7, on 45nm, already gets 32MB of eDRAM on die.
"Regardless, the smaller size just results in it being unable to store all game textures... but textures aren't the only source of bandwidth consumption."
The XB360 eDRAM is never used to store any textures. It is strictly a render target. (The ROPs are actually on the daughter die, and they are tightly integrated into the memory pool.)
"Unfortunately it's annoying to find current figures for the various sources of bandwidth consumption; texturing used to account for around 75%."
Deferred rendering and deferred texturing have switched this around: they make the texture lookups from the big pools relatively rare, but massively increase the writing to (and reading from) the render targets. Texturing (from the big pool) would then typically be less than 30% of the total bandwidth, and most of the "texture lookups" would actually happen from the render targets of the previous phases.
"Even at high resolutions, Z, color, and render target buffers should fit within a 64MB eDRAM. So, if 10MB is good for ~640P, what resolution will 64MB be good for? 1080P?"
<snip>
This is only for traditional rendering schemes. Deferred rendering techniques write a lot more than just a few color values into the render targets. Think more like a few stages writing 16 bytes per stage (per pixel, per frame). 64MB would probably hurt BF3 at 1080p.
This is all somewhat related, in that one of the reasons for going with deferred rendering is that it gains relatively a lot from bigger caches on the GPU.
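A quick sketch of why those per-stage writes blow past 64MB at 1080p; the stage count here is my own illustrative assumption, not a BF3 figure:

# Deferred-rendering render-target footprint at 1080p (illustrative stage count).
width, height = 1920, 1080
stages = 3                            # assumed G-buffer stages at 16 bytes per pixel each
bytes_per_pixel = stages * 16 + 4     # plus a 32-bit depth buffer
size_mb = width * height * bytes_per_pixel / 1024**2
print(f"~{size_mb:.0f} MB")           # ~103 MB, well past 64 MB, so some targets spill to main RAM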
Thanks for the very informative post :thumbsup: