Intel "Haswell" Speculation thread


inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
OK, from what I read on the AT live blog, Haswell will be a beast of a chip. The thing can do 2x 256-bit loads and 1x 256-bit store per cycle per core. It can do 8 ops per cycle in the execution units and 2x 256-bit FMA ops per cycle per core. It can also do 1 read from L2 per cycle, which is double what IB can do. In every respect (power, performance and perf/watt) this chip will be a marvel. I'm impressed.
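To put the load/store figures above into bytes, here is a minimal back-of-the-envelope sketch. The helper name and the 3.0 GHz clock are assumptions made purely for illustration; nothing in the live blog coverage quoted here gives a clock speed.

```python
# Back-of-the-envelope L1 data bandwidth from the per-cycle figures quoted above.
# Hypothetical helper for illustration; not taken from any Intel material.

def l1_bytes_per_cycle(loads, stores, width_bits):
    """Return (load_bytes, store_bytes) moved per cycle per core."""
    width_bytes = width_bits // 8
    return loads * width_bytes, stores * width_bytes

# 2x 256-bit loads and 1x 256-bit store per cycle, as quoted from the live blog.
load_bytes, store_bytes = l1_bytes_per_cycle(loads=2, stores=1, width_bits=256)

ASSUMED_CLOCK_GHZ = 3.0  # assumption, purely illustrative

# bytes/cycle * (1e9 cycles/s per GHz) gives GB/s directly.
load_gb_per_s = load_bytes * ASSUMED_CLOCK_GHZ
store_gb_per_s = store_bytes * ASSUMED_CLOCK_GHZ

print(f"{load_bytes} B/cycle load, {store_bytes} B/cycle store per core")
print(f"~{load_gb_per_s:.0f} GB/s load, ~{store_gb_per_s:.0f} GB/s store "
      f"at an assumed {ASSUMED_CLOCK_GHZ} GHz")
```

These are peak numbers per core out of L1; real code will only approach them when the data stays in cache.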

As for AMD, I'm sad to say this, but no matter how good the SR core is and no matter how much it improves over the BD core (even the mythical 45% that vr-zone claims), it will probably be crushed by Haswell. Intel just widened the gap in SSE/AVX workloads to more than 2x, and I have no clue how an SR-based Opteron chip (even with 24-28 cores!) can even start to compete with a 10-14 core Haswell in HPC workloads. The gap will be massive, I'm afraid. Kudos to Intel.
 

jones377

Senior member
May 2, 2004
451
47
91
Yup, AMD is literally betting the farm on GPGPU through HSA and totally neglecting x86 SIMD advancements. One Haswell core will have twice the throughput of a Steamroller module.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
BenchPress, I'm afraid your language isn't EXTREME TO THE MAX enough. Merely calling this a "revolution" is under-selling it, isn't it? What's more EXTREME than a revolution? I'll let you know when I think up some words that are EXTREME enough to describe it.
Have you come up with such words yet? I could use a few.
 

cytg111

Lifer
Mar 17, 2008
23,561
13,121
136
That depends highly on what ISAs the newer consoles are. We already know that game devs are lazy and don't bother with recompiles or pay much attention to the PC arena, but if the new console chips feature AVX2 you may have a point. If they don't...

- There is also lots of code running in VMs these days; all it would take is an update to the virtual machine, and all running code would benefit from it.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
I do but my question is what development community?

Clearly it will be a much larger community than what AVX had.

Sure, you won't see MS Office doubling its performance, but you will see it on games once they incorporate the AVX2 set. You will also see it in applications like Photoshop, Premiere, most scientific applications, most financial applications (at least on the backend, which is what I develop), etc.

Being a developer I am very excited about AVX2. Much more than AVX1, which I always looked at as a stepping stone of sorts.
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,938
408
126
Can we close this thread now that Intel IDF has started and we don't have to speculate about Haswell any longer? :ninja:
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Can we close this thread now that Intel IDF has started and we don't have to speculate about Haswell any longer? :ninja:
We can still speculate about how they managed to add another arithmetic execution port. :awe:

I never expected them to improve IPC that way. I'm happy to be wrong about that. But I'm curious if the old ports (0, 1 and 5) can still take the same instructions or whether they partitioned things in a way to avoid a hugely complex forwarding network and schedulers.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
We can still speculate about how they managed to add another arithmetic execution port. :awe:

I never expected them to improve IPC that way. I'm happy to be wrong about that. But I'm curious if the old ports (0, 1 and 5) can still take the same instructions or whether they partitioned things in a way to avoid a hugely complex forwarding network and schedulers.

God, no. We had 2 years of hearing how great BD was going to be, with a lot of lies and hype from fanbois and the bad messenger.
Haswell deserves to be talked about in this thread, as so far I haven't seen anything written that's way out of line. You guys remember that demo Intel gave of the graphics that were actually recorded? I never said anything about it, but some words in that presentation stuck in my head: the part about how, by giving less voltage to the iGPU, you could go faster. Well, now we know what Intel meant. The iGPU3 on Haswell is the answer. Someone stated that at 17 watts both AMD and Intel games were unplayable. Well, with Haswell you're going to see 2x that performance, and it will be good enough. A lot has been said about the new PlayStation and MS console. Consoles will not move the gaming industry. Haswell at 10 watts, as a dual core in ultras and as the single chip in a tab, will move the gaming markets. Consoles are over with. An x86 console won't be allowed by Intel, not if they are sold at a loss. No court in any country will allow Intel to be undercut by Sony or MS. Sorry BenchPress, I meant to quote the guy above ya.
 

TuxDave

Lifer
Oct 8, 2002
10,572
3
71
We can still speculate about how they managed to add another arithmetic execution port. :awe:

I never expected them to improve IPC that way. I'm happy to be wrong about that. But I'm curious if the old ports (0, 1 and 5) can still take the same instructions or whether they partitioned things in a way to avoid a hugely complex forwarding network and schedulers.

Engineer 1: Add Port 6
Engineer 2: No
Engineer 1: Do it, it's really awesome if we do
Engineer 2: Ok



Edit: So on one hand, you make a complicated forwarding network to simplify the scheduler (doesn't have to care which ALU to which ALU)... or you have a simple network and a complicated scheduler. Just saying you have to pick the lesser of two evils.
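The trade-off above can be sketched with a toy count of forwarding (bypass) paths. This is only an illustrative model under stated assumptions: every op has two source operands, and any port's result can feed any operand input of any port in the same forwarding group. The function name and the grouping are hypothetical and say nothing about how Intel actually wired Haswell.

```python
# Toy model: count result-forwarding (bypass) paths between ALU ports.
# Assumption: each op has `operands` source inputs, and any port's result
# may feed any input of any port within the same forwarding group.

def bypass_paths(group_sizes, operands=2):
    """Total forwarding paths when ports are split into independent groups."""
    return sum(n * n * operands for n in group_sizes)

full_3 = bypass_paths([3])        # three fully connected ports (0, 1 and 5)
full_4 = bypass_paths([4])        # add port 6 and connect it to everything
split_3_1 = bypass_paths([3, 1])  # port 6 only forwards to itself

print(full_3, full_4, split_3_1)  # 18 32 20
```

In this model the fully connected fourth port costs 14 extra paths, while the partitioned version costs 2. The price of the cheaper network is that the scheduler now has to know which group produced each operand, which is the "complicated scheduler" side of the choice.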
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,938
408
126
So what's the deal with the GT3? Will it give us 2x the IGP performance as early speculation said, or a 10W APU as late speculation days before IDF said? Or both? Or either 10W or 2x?

And 2x the performance compared to what, by the way (HD2000, HD2500, HD3000, HD4000, or GTX690)?
 

Blandge

Member
Jul 10, 2012
172
0
0
So what's the deal with the GT3? Will it give us 2x the IGP performance as early speculation said, or a 10W APU as late speculation days before IDF said? Or both? Or either 10W or 2x?

They haven't given any hard info but it can be assumed there will be multiple versions.

Information given in the Anandtech Podcast episode 4:

2x GPU performance @ 17W

Unknown performance @ 10W

Eventually < 10W
 

bronxzv

Senior member
Jun 13, 2011
460
0
71
As you've probably noticed, they did add another AGU to ensure the load/store bandwidth is doubled and not compromised in any way.

sure, along with the doubled L2 cache bandwidth and improved L1D (fewer cache bank conflicts, better handling of unaligned moves), all in all a very exciting chip

legacy AVX will reveal its full potential, at last, not to mention AVX2 code which should fly

NB: cache line size is still 64B (ARCS001, slide 15)
 
Aug 11, 2008
10,451
642
126
Too soon to say, but so far I am disappointed with all the talk of the igp and lower power. I was hoping for a marked improvement in the desktop, although I know that is not where the emphasis is these days. I want more cores, higher IPC and higher clockspeeds!!!
 

happysmiles

Senior member
May 1, 2012
344
0
0
they did a Microsoft

"this is how amazing it is but there is A LOT we aren't going to tell you"
Although I've got a good feeling that Intel is going to floor us with an insane ULV.

I didn't expect things to happen so quickly; these past two years feel like the equivalent of 5 years before.
Ivy Bridge literally was released in May and here in September the successor is waiting behind the curtain.
 

Vinwiesel

Member
Jan 26, 2011
163
0
0
The low power info is great, but I wonder if the graphics will support dual-link DVI. Forgive me if I'm wrong, but Sandy/Ivy can't do this. Is this some sort of planned obsolescence for DVI to usher in DP?
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,422
1,759
136
We can still speculate about how they managed to add another arithmetic execution port. :awe:

I never expected them to improve IPC that way. I'm happy to be wrong about that. But I'm curious if the old ports (0, 1 and 5) can still take the same instructions or whether they partitioned things in a way to avoid a hugely complex forwarding network and schedulers.

I think it's possible that they just don't. When they were talking about the added execution port, they only mentioned that it frees 0 and 1 to FMA, not that it would increase IPC for present integer code. (Which, if it was fully connected, it absolutely would. Why not advertise?)

This leads me to believe that perhaps it doesn't forward to/from 0 and 1, and just exists so that loop counters, branches and such can be managed while 0 and 1 are dedicated to vector loads.
 

intangir

Member
Jun 13, 2005
113
0
76
indeed, note that there are more details (buffer sizes ++) in ARCS001 than in SPCS001 covered by Anand, available here: https://intel.activeevents.com/sf12/scheduler/catalog.do

There's some great information in there.

So, some highlights:

- Improved branch prediction
- Same pipeline length as Sandy Bridge -> branch misprediction penalty is the same
- Parallelized cache misses to reduce latency
- Deepened out-of-order buffers to provide a larger instruction window
- More execution units, more issue ports
--- a fourth integer ALU
--- a second branch execution unit
--- a separate store AGU, which allows 3 memory operations per cycle versus Sandy Bridge's 2
- Sustained 2x 256-bit loads and 1x 256-bit store per cycle (twice the width of SB).
- Doubled L2 cache bandwidth (twice-- bah, you know the drill)
- More L3 bandwidth per cache slice
- Improved DRAM write throughput
- Faster LOCK-prefixed instructions
- Improved (faster) virtual machine entry/exit
- New instructions!
--- With FMA, 16 double-precision FLOPs per cycle per core, or 32 single-precision (twice SNB). FMA has the same latency as the SNB multiply
--- Support for 256-bit AVX2 integer vectors with full-width permutes
--- Gather support (SIMD for loads!)
--- Transactional memory support for easier and faster multithreading
--- Much faster cryptography with new/faster instruction support

I agree, Haswell looks like a monster of a microarchitecture!
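The FLOPs-per-cycle figures in the list above can be reproduced with a short sketch. The helper name is hypothetical, and Sandy Bridge is modelled here as one 256-bit multiply unit plus one 256-bit add unit, which is an assumption of this sketch rather than something stated in the slides quoted in this thread.

```python
# Peak FLOPs per cycle per core = units * SIMD lanes * FLOPs per lane per op.
# Hypothetical helper for illustration only.

def peak_flops_per_cycle(vector_bits, element_bits, units, flops_per_lane):
    """Peak floating-point operations issued per cycle by one core."""
    lanes = vector_bits // element_bits
    return units * lanes * flops_per_lane

# Haswell: two 256-bit FMA units; an FMA counts as 2 FLOPs (multiply + add).
hsw_dp = peak_flops_per_cycle(256, 64, units=2, flops_per_lane=2)
hsw_sp = peak_flops_per_cycle(256, 32, units=2, flops_per_lane=2)

# Sandy Bridge (assumed model): one multiply unit + one add unit, 1 FLOP each.
snb_dp = peak_flops_per_cycle(256, 64, units=2, flops_per_lane=1)
snb_sp = peak_flops_per_cycle(256, 32, units=2, flops_per_lane=1)

print(f"Haswell: {hsw_dp} DP / {hsw_sp} SP FLOPs per cycle per core")
print(f"SNB:     {snb_dp} DP / {snb_sp} SP FLOPs per cycle per core")
```

Under these assumptions the 2x gain comes entirely from fusing the multiply and add into one op on each unit; the vector width itself is unchanged at 256 bits.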
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Anand has a game demo video up of Ivy Bridge vs. Haswell, and the Haswell was running at high res while the IB was at a much lower res. Frame rates were about equal, but the resolution told the story.
 

intangir

Member
Jun 13, 2005
113
0
76
I think it's possible that they just don't. When they were talking about the added execution port, they only mentioned that it frees 0 and 1 to FMA, not that it would increase IPC for present integer code. (Which, if it was fully connected, it absolutely would. Why not advertise?)

This leads me to believe that perhaps it doesn't forward to/from 0 and 1, and just exists so that loop counters, branches and such can be managed while 0 and 1 are dedicated to vector loads.

I dunno, but would they need to add a fourth physical ALU if that were the case? Why couldn't they just issue port 6 int ops to one of the original ALUs on port 0 or 1, if they weren't going to implement the full forwarding network?
 