IBM Power 7 Released

brybir

Senior member
Jun 18, 2009
241
0
0
http://www.informationweek.com/shar...4WWZXPQE1GHOSKH4ATMY32JVN?articleID=222700387


Anyone have any insight on these processors? We get focused on the consumer x86 world around here, but this announcement has many of the same buzzwords as Intel's latest announcements and looks interesting. Highlights include:

---Power7 chips will run between 3.0GHz and 4.14GHz and will come with four, six, or eight cores

---45nm

---Said to deliver twice the performance of older Power6 systems while being four times more energy efficient.

---Power7 chips can run 32 simultaneous tasks thanks to an 8-core architecture and four virtual cores, or threads, per core.

---Power7 features TurboCore mode...TurboCore shifts resources from non-active cores to active cores on-the-fly to increase memory, bandwidth and clock speed.

---Power7's "Intelligent Threads" technology also affords dynamic resource allocation depending on workloads.

---Memory Expansion uses compression technology to virtually double the amount of physical memory available to an application.



In any event, I know that the POWER series competes against Itanium, Sun, etc. for the *NIX server space and is quite different in many ways, but I think it's interesting to see how IBM implements something like Intel's TurboMode, but in a different (better?) way, as well as their interesting memory expansion technology (although I am not sure how that works out in real-world performance).
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0

Well, you took the time to read this. Now read about the new Itanium that was just released. If you're interested in this type of CPU, it's a good read too.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
The amazing thing about Power 7 is that it has absolutely monstrous multi-threading performance, yet its single-thread performance is likely to be very good as well. Even in single thread, the performance per clock should be on par with Deneb, while clocking substantially higher and having 8 cores.

It also integrates eDRAM, or embedded DRAM, on the die as its L3 cache, which as far as I know is a first for a mainstream server CPU. Those who don't know it can think of it as conventional RAM technology on the CPU. The benefit of DRAM on the CPU die is that it only takes 1 transistor + 1 capacitor for 1 bit, while SRAM, which is what's normally used in caches, uses 6 transistors. The access latency is higher on eDRAM than on SRAM, but at this size it might actually not be so bad, since the array will be smaller than an SRAM array of equivalent capacity.
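
To put the cell sizes in numbers, here is a rough Python sketch that counts storage-cell transistors for a cache built from 6T SRAM versus 1T1C eDRAM. The 32MB capacity is the Power 7 L3 size quoted later in this thread, and the function names are made up for illustration. The count ignores tags, ECC, decoders and sense amps, so treat it as an order-of-magnitude comparison only.

```python
BITS_PER_BYTE = 8


def cache_bits(capacity_mib: int) -> int:
    """Number of data bits in a cache of the given capacity (MiB)."""
    return capacity_mib * 1024 * 1024 * BITS_PER_BYTE


def sram_cell_transistors(capacity_mib: int, transistors_per_cell: int = 6) -> int:
    """Transistors in the storage cells of a conventional 6T SRAM array."""
    return cache_bits(capacity_mib) * transistors_per_cell


def edram_cell_transistors(capacity_mib: int) -> int:
    """Transistors in a 1T1C eDRAM array (one per bit; the capacitor is not a transistor)."""
    return cache_bits(capacity_mib)


L3_MIB = 32  # L3 size quoted elsewhere in the thread

sram = sram_cell_transistors(L3_MIB)
edram = edram_cell_transistors(L3_MIB)
print(f"6T SRAM cells:    {sram:,} transistors")
print(f"1T1C eDRAM cells: {edram:,} transistors")
print(f"ratio:            {sram / edram:.0f}x")
```

The area saving in practice is smaller than the 6x transistor ratio suggests, because the capacitor and the peripheral circuitry also take space.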

A 3.8GHz Power 7 is estimated to be more than 50% faster in multi-threaded apps than the 8-core Nehalem-EX running at 2.26GHz. If you realize how formidable Nehalem-EP already is, you can imagine how fast Power 7 is. That's all on a 45nm process.

At 3.8GHz, the TDP is 200W.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
The benefit of DRAM on the CPU die is that it only takes 1 transistor + 1 capacitor for 1 bit, while SRAM, which is what's normally used in caches, uses 6 transistors. The access latency is higher on eDRAM than on SRAM, but at this size it might actually not be so bad, since the array will be smaller than an SRAM array of equivalent capacity.
But there's a reason everyone uses SRAM for caches even though it's much more expensive. I don't know the current numbers, so maybe that has changed, but in the past DRAM had an order of magnitude higher latency.
 

jason166

Member
Dec 11, 2009
56
1
71
A 3.8GHz Power 7 is estimated to be more than 50% faster in multi-threaded apps than the 8-core Nehalem-EX running at 2.26GHz. If you realize how formidable Nehalem-EP already is, you can imagine how fast Power 7 is. That's all on a 45nm process.

Based on that comparison, Power 7 has a 3.8/2.26 = 1.68x clock advantage but only about a 1.5x performance advantage, so it would seem that the Nehalem arch offers about 1.68/1.5 = 1.12x more performance on a clock-for-clock basis.

Looking at the product page here:
http://www-03.ibm.com/systems/power/hardware/750/browse_aix.html

The low-end 8-core 3GHz system with 32GB of RAM is priced at $34,152 (with a warranty).


Let's see:
- - - - - - - - - - -
Low End Power 7 = 8*(3.0Ghz) = 24 (Power7CpuGhz)

Upcoming Gulftown System = 6*(3.6 Ghz) * 1.12 = 24.2 (Power7CpuGhz)

(This also assumes 8 cores scale as well as 6, which we know is not true even under the most parallel workloads)

So it would seem that with a very conservative overclock on a single-socket Gulftown system, entry-level Power 7 performance can be had on nearly 1/10th the budget.

- Jason
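
The arithmetic in this post can be checked with a short Python sketch. It takes the 3.8GHz vs 2.26GHz clocks and treats the "more than 50% faster" estimate as roughly 1.5x, then assumes perfect scaling with core count and clock, which is exactly the simplification flagged above. The function and variable names are only for illustration.

```python
def per_clock_advantage(clock_a_ghz: float, clock_b_ghz: float, perf_a_over_b: float) -> float:
    """How much more work per clock chip B does than chip A.

    Chip A has the higher clock; perf_a_over_b is A's measured speedup over B.
    """
    clock_ratio = clock_a_ghz / clock_b_ghz  # 3.8 / 2.26 is about 1.68
    return clock_ratio / perf_a_over_b


# Power 7 at 3.8GHz vs Nehalem-EX at 2.26GHz, Power 7 assumed ~1.5x faster.
nehalem_per_clock = per_clock_advantage(3.8, 2.26, 1.5)

# "Power7CpuGhz" units: cores * GHz, scaled by the per-clock advantage.
power7_low_end = 8 * 3.0
gulftown_overclocked = 6 * 3.6 * nehalem_per_clock

print(f"Nehalem per-clock advantage: {nehalem_per_clock:.2f}x")
print(f"Low-end Power 7:             {power7_low_end:.1f}")
print(f"Gulftown at 3.6GHz:          {gulftown_overclocked:.1f}")
```

Note that both chips in the original estimate have 8 cores, so the per-clock figure also bakes in SMT and memory bandwidth differences, not just core design.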
 

nyker96

Diamond Member
Apr 19, 2005
5,630
2
81

Although this is true, the Power 7 package probably includes technical support costs, which add up to quite a bit over the life span of a system, so you can't just compare it like hardware costs. I remember my old grad lab had a few DEC Alphas. They were damn fast machines for the time and we ran all our speech synthesis code on them, but I always ran into software problems and occasional hardware problems with them. Luckily we bought a tech support package when we got these machines, so I got to talk to the techies at DEC. Without access to those guys it would have been impossible to keep the machines working 100% of the time.
 

heyheybooboo

Diamond Member
Jun 29, 2007
6,278
0
0



An "entry-level" Power7 chassis will provide four cards or 'sockets' with 128 GB per processor card. This is most likely comparable to the upcoming "Beckton" arch from Intel. Power7 doesn't do 'Windows'.

"Mid-range" Power7 will provide 2 CPUs per card with access to 'terabytes' of memory.

My understanding is that Power7 has a 'TurboCore' mode under which half the cores will be 'overclocked' beyond 4GHz.

Just thought you might want to know .....






--
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136

Nah, the reason for the excellent multi-threading performance lies in its SMT capabilities and sheer bandwidth. I said more than 50% faster than Nehalem-EX at 2.26GHz. Some benchmarks put it at 60-80%.

-Each chip has TWO quad-channel DDR3 memory controllers for 100GB/s bandwidth
-4-way SMT is said to bring more than 40% benefit in lots of the benchmarks
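
As a back-of-the-envelope check on the bandwidth bullet, this Python sketch divides the quoted 100GB/s across channels, cores and threads. It assumes the bandwidth is spread evenly, which real workloads won't do, and it takes the 8-core, 4-thread figures from the first post. The names are illustrative only.

```python
CHIP_BANDWIDTH_GBS = 100.0   # quoted peak per chip
CONTROLLERS = 2              # two memory controllers per chip
CHANNELS_PER_CONTROLLER = 4  # quad-channel DDR3
CORES = 8
SMT_WAYS = 4                 # threads per core

channels = CONTROLLERS * CHANNELS_PER_CONTROLLER
threads = CORES * SMT_WAYS

per_channel_gbs = CHIP_BANDWIDTH_GBS / channels
per_core_gbs = CHIP_BANDWIDTH_GBS / CORES
per_thread_gbs = CHIP_BANDWIDTH_GBS / threads

print(f"{channels} channels -> {per_channel_gbs} GB/s per channel")
print(f"{CORES} cores    -> {per_core_gbs} GB/s per core")
print(f"{threads} threads -> {per_thread_gbs} GB/s per thread")
```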

I think the SMT is the killer here. Gulftown does not have the I/O bandwidth or interconnect necessary to enable good scaling in database applications, which are home turf for Power 7 and Nehalem-EX.

These guys are really amazing with their CPU interconnects. You are right that price/performance and power/performance are better on Gulftown, but if you put multiple Gulftown systems together in the places where Power 7 will go, they'll stand absolutely no chance against it.

Intel is kinda going there with efforts like Nehalem-EX, but it isn't there yet. Its performance will be amazing in its own right, as you'll see soon.
 

PlasmaBomb

Lifer
Nov 19, 2004
11,636
2
81
IBM are going to use Power7 chips in their Blue Waters project...

the computer will theoretically be capable of achieving 10 petaflops, about 10 times as fast as the fastest supercomputer today.

It may top out at 16 petaflops, with a sustained speed of 1 petaflop...

Link

At 567mm^2 and 32MB of on-die L3 cache, the new CPU is something of a beast (die shrink plz?).

Power7 Performance data
http://www-03.ibm.com/systems/power/hardware/780/perfdata.html

more
http://www-03.ibm.com/systems/power/hardware/750/perfdata.html


On Nehalem EX-

There's no questioning the formidability of an octal-core Nehalem, but the CPU packs a whopping 2.3 billion transistors, 24 MB of cache, and die size that could be as large as 700mm^2.

 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
eDRAM may be bad in PCs or low-processor-count servers, but in servers the latency difference might be overcome by the capacity and die size advantage. I heard the eDRAM advantage starts somewhere in the low tens of MB, and at 32MB it's definitely over that. Regular DRAM might be 10x slower than SRAM, but for versions integrated into CPUs the gap won't be as big. Of course the CPU designers will make it fast enough and low enough in latency to be acceptable.

I really believe the "Haswell" timeframe is where we'll see mass eDRAM or other space-saving cache implementations with capacities equivalent to those of today's low-end dedicated cards. Personally I think Intel themselves won't do eDRAM for performance CPUs, but will opt for alternatives like 2T SRAM instead.

eDRAM is probably the primary reason why Power 7 is 560mm2 while Nehalem-EX is 600mm2+.

There's no questioning the formidability of an octal-core Nehalem, but the CPU packs a whopping 2.3 billion transistors, 24 MB of cache, and die size that could be as large as 700mm^2.

The thing I'm most impressed by is the Ring Bus, which will have a bandwidth of 1.2TB/s.
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,721
145
106
Don't forget the new Niagara 3 also.
It can apparently execute 128 threads.
Things just got more interesting in the server space.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
The only place Niagara has been good is in very niche apps. It does have good price/performance and power/performance in those sectors though, I guess. Its single-threaded, or per-thread, performance sucks.

Power 7 cache latency
-L1 cache latency went from 4 cycles in Power 6 to 2 cycles at 32KB
-L2 cache latency is 8 cycles with 256KB
-eDRAM L3 cache "Slice" latency is 25 cycles running at half the clock frequency

BTW, "TurboCore" only clocks up by 10%, so it's nothing amazing.
 

PlasmaBomb

Lifer
Nov 19, 2004
11,636
2
81

eDRAM L3 - does that mean the latency as seen from the CPU is 50 cycles?
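
One way to look at the question is to convert cycles into nanoseconds under both readings of the "25 cycles" figure. The sketch below assumes a 3.8GHz core clock (the speed discussed earlier in the thread) and shows both interpretations, since the post doesn't say which clock the 25 cycles are counted in. The function name is hypothetical.

```python
CORE_CLOCK_GHZ = 3.8  # assumed core clock, taken from earlier in the thread


def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a cycle count at the given clock into nanoseconds."""
    return cycles / clock_ghz


# L1 and L2 figures are quoted in core cycles.
for name, cycles in {"L1": 2, "L2": 8}.items():
    print(f"{name}: {cycles} cycles = {cycles_to_ns(cycles, CORE_CLOCK_GHZ):.2f} ns")

# Reading 1: the 25 cycles are already core cycles.
l3_if_core_cycles_ns = cycles_to_ns(25, CORE_CLOCK_GHZ)

# Reading 2: the 25 cycles are counted on the half-rate L3 clock,
# so each one lasts as long as two core cycles.
l3_half_rate_in_core_cycles = 25 * 2
l3_if_half_rate_ns = cycles_to_ns(25, CORE_CLOCK_GHZ / 2)

print(f"L3 (25 core cycles):      {l3_if_core_cycles_ns:.2f} ns")
print(f"L3 (25 half-rate cycles): {l3_half_rate_in_core_cycles} core cycles "
      f"= {l3_if_half_rate_ns:.2f} ns")
```

Under the second reading the answer to the question is yes, 50 core cycles. Under the first it stays at 25.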
 