Knights Landing announced

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Interesting. Looks like Nvidia will have a tough time with its Tesla business.

http://www.businesswire.com/news/ho...-World’s-Fastest-Supercomputer-Reveals-Future

(...)

“Knights Landing” – A Choice of Coprocessor or CPU

Intel revealed details of its second generation Intel Xeon Phi products aimed to further increase their supercomputing capabilities. Codenamed “Knights Landing,” the next generation of Intel MIC Architecture-based products will be available as a coprocessor or a host processor (CPU) and manufactured using Intel’s 14nm process technology featuring second generation 3-D tri-gate transistors.

As a PCIe card-based coprocessor, “Knights Landing” will handle offload workloads from the system’s Intel Xeon processors and provide an upgrade path for users of current generation of coprocessors, much like it does today. However, as a host processor directly installed in the motherboard socket, it will function as a CPU and enable the next leap in compute density and performance per watt, handling all the duties of the primary processor and the specialized coprocessor at the same time. When used as a CPU, “Knights Landing” will also remove programming complexities of data transfer over PCIe, common in accelerators today.

To further boost the performance for HPC workloads, Intel will significantly increase the memory bandwidth for all “Knights Landing” products by introducing integrated on-package memory. This will allow customers to take full advantage of available compute capacity without encountering memory bandwidth bottlenecks experienced today.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Btw, it's official: Intel holds the supercomputer crown with Phi:

http://www.csmonitor.com/Science/2013/0617/China-supercomputer-clocks-in-as-world-s-most-powerful

(...)

The TOP500 list is compiled by computer scientists at the University of Mannheim, Germany; Lawrence Berkeley National Laboratory; and the University of Tennessee, Knoxville. Researchers measure computing power in "flop/s," that is, floating point operations per second. The bog-standard 2.5 GHz processor found in an office laptop is theoretically capable of 10 billion flop/s, or 10 gigaflop/s.

TOP500 measured the Tianhe-2 at 33.86 quadrillion flops/s, or 33.86 petaflop/s. To put things in perspective, that would be the equivalent of stringing together about 3.4 million standard processors.
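The "3.4 million" figure is easy to check from the two numbers the article gives. A minimal sketch, assuming the article's 10 gigaflop/s "standard processor" as the unit; the function and constant names are just for illustration:

```python
# Figures quoted in the article above.
TIANHE2_RMAX_FLOPS = 33.86e15  # 33.86 petaflop/s measured by TOP500
LAPTOP_PEAK_FLOPS = 10e9       # 10 gigaflop/s, the article's "standard processor"


def equivalent_processors(system_flops: float, processor_flops: float) -> float:
    """Return how many processors of the given speed match the system."""
    return system_flops / processor_flops


count = equivalent_processors(TIANHE2_RMAX_FLOPS, LAPTOP_PEAK_FLOPS)
print(f"{count / 1e6:.2f} million processors")
```

Note that this compares a measured Linpack result against a theoretical laptop peak, so it is only a rough sense of scale.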
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
That machine is powered by unreleased processors. :hmm:

I guess that's one way to do your validation testing.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
That machine is powered by unreleased processors. :hmm:

I guess that's one way to do your validation testing.

22nm is fairly mature by now, and Intel server SKUs are mostly delayed because of extensive validation. I expect Intel to, once again, fill every relevant price point with the Xeon release, this time wiping out AMD for good in the server market. I only wonder whether Phi will do the same to Nvidia's PSB line.
 

USER8000

Golden Member
Jun 23, 2012
1,542
780
136

Khato

Golden Member
Jul 15, 2001
1,248
321
136
Nvidia cards and AMD CPUs made on less advanced process technology still seem to have greater efficiency.

Correct, they're currently around 13% more efficient than Intel's first product targeting the market. They're also slightly less efficient in terms of Rmax vs Rpeak. If that trend continues on actual workloads then there's no question that Kepler is the better product for this market... but unfortunately all we get to compare these systems by is a very generic 'synthetic' type of metric.

It'll be quite interesting to see what kind of improvements Knights Landing brings. Even if it just continues to trade blows with NVIDIA's next generation... yeah, that isn't good for NVIDIA. I'd imagine it's a fair bit more difficult for them to sell K20 at over $3k each when a comparable Xeon Phi is available for under $2k. (The Tianhe-2 is listed as using Xeon Phi 31S1P, which I'd guess to be a custom variation of the 'inexpensive' 3100 series.)
 

USER8000

Golden Member
Jun 23, 2012
1,542
780
136
Correct, they're currently around 13% more efficient than Intel's first product targeting the market. They're also slightly less efficient in terms of Rmax vs Rpeak. If that trend continues on actual workloads then there's no question that Kepler is the better product for this market... but unfortunately all we get to compare these systems by is a very generic 'synthetic' type of metric.

It'll be quite interesting to see what kind of improvements Knights Landing brings. Even if it just continues to trade blows with NVIDIA's next generation... yeah, that isn't good for NVIDIA. I'd imagine it's a fair bit more difficult for them to sell K20 at over $3k each when a comparable Xeon Phi is available for under $2k. (The Tianhe-2 is listed as using Xeon Phi 31S1P, which I'd guess to be a custom variation of the 'inexpensive' 3100 series.)

Titan was an upgrade of Jaguar, which is an older system (which means older infrastructure was also carried over, I suspect), and the CPUs used are some of the first 32nm Bulldozer CPUs AMD produced (IIRC Titan was the first system to have them, in the middle of 2011). Considering that the Piledriver-based Opterons, with slight tweaks, are more efficient and the Intel equivalents are even more efficient, in reality it does seem Kepler is probably a decent amount more efficient than KC, and this is masked by the CPUs and older infrastructure used in Titan. Moreover, the 28nm TSMC process, IIRC, is more optimised for density and cost than power consumption, whereas Intel is using a process optimised for lower power consumption. On top of this, IBM Sequoia using 45nm PowerPC A2 processors is also relatively efficient, which really surprised me TBH!

Knights Landing might be interesting, but IMHO the competition won't stand still either.
 

LogOver

Member
May 29, 2011
198
0
0
Efficiency of KC does not seem that brilliant though:

http://www.anandtech.com/show/7075/june-2013-top500-list-published-xeon-phi-takes-top-spot

Titan, using Nvidia cards and AMD Bulldozer CPUs made on less advanced process technology, still seems to have greater efficiency, and it was an upgrade of an earlier system (not built from scratch). The same goes for the IBM-powered systems using chips made on even older process technology.

This is not correct. These systems have different configurations, and KC is not the only component that consumes power. In fact, the most power-efficient system right now according to the Green500 is Xeon Phi based (at 2,499.44 Mflops/watt). As for R(max)/R(peak) efficiency, you can find much more efficient Xeon Phi based systems than Tianhe-2 by navigating through top500.org
 

USER8000

Golden Member
Jun 23, 2012
1,542
780
136
This is not correct. These systems have different configurations, and KC is not the only component that consumes power. In fact, the most power-efficient system right now according to the Green500 is Xeon Phi based (at 2,499.44 Mflops/watt). As for R(max)/R(peak) efficiency, you can find much more efficient Xeon Phi based systems than Tianhe-2 by navigating through top500.org

The Green500 list has no K20 or K20X based systems with Intel CPUs (since there are none around ATM), and it still does not change the fact that, for a large-scale system, Tianhe-2 is still less efficient than an older upgraded system. Tianhe-2 is a brand new installation.

Moreover, the top system is only a relatively small-scale installation (number 397 in the Top500 list), with an AMD S10000 based system coming second (52 in the Top500), and the top of the list is mostly dominated by IBM systems with CPUs made on a 45nm node, with the next Xeon Phi based systems being dozens of places down. Maybe the updated list will change that, so it will be interesting to see it.

Also, the doom and gloom predictions from hardware enthusiasts on tech forums about Nvidia and AMD being finished are starting to get to silly levels now.

Intel has given them more competition, no doubt, but it's not like their competitors will all of a sudden give up. Anyway, 2014 and 2015 are going to be interesting, and I will leave it at that.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Correct, they're currently around 13% more efficient than Intel's first product targeting the market. They're also slightly less efficient in terms of Rmax vs Rpeak. If that trend continues on actual workloads then there's no question that Kepler is the better product for this market... but unfortunately all we get to compare these systems by is a very generic 'synthetic' type of metric.

It'll be quite interesting to see what kind of improvements Knights Landing brings. Even if it just continues to trade blows with NVIDIA's next generation... yeah, that isn't good for NVIDIA. I'd imagine it's a fair bit more difficult for them to sell K20 at over $3k each when a comparable Xeon Phi is available for under $2k. (The Tianhe-2 is listed as using Xeon Phi 31S1P, which I'd guess to be a custom variation of the 'inexpensive' 3100 series.)

While certainly synthetic, I think it's safe to say a lot more care is put into making sure Spec represents performance in tasks common to installations on the supercomputer list than is the case with a desktop synthetic benchmark for desktop tasks.
 

Khato

Golden Member
Jul 15, 2001
1,248
321
136
The Green500 list has no K20 or K20X based systems with Intel CPUs (since there are none around ATM), and it still does not change the fact that, for a large-scale system, Tianhe-2 is still less efficient than an older upgraded system. Tianhe-2 is a brand new installation.

Luckily Green500 is simply a derivation of the top500 and hence we can easily calculate TFLOP/s per MW. Titan is 2142 with K20x and Opteron 6274. HPCC (#53 on top500) is 2243 with K20m and Xeon E5-2665. Todi (#108 on top500), another Cray XK7 install like Titan, is actually also 2243.4 with K20 and Opteron 6272. This is compared to Xeon Phi based systems at 1901 for Tianhe-2, 1146 for Stampede, 1849 for Conte, 1932 for Discover, 1264 for Endeavor, 1685 for MVS-10P, 1613 for Maia, 904 for an unnamed one at #158, 1000 for #249, 2455 for Beacon at #397...

Basically a long way of saying that the K20 based systems are far more consistent for whatever reason while the Xeon Phi ones vary a surprising amount. There's no evidence of Titan being held back by its 'old' infrastructure or its choice of CPU - the Xeon based K20 system is exactly the same efficiency as the smaller XK7 system.
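For anyone who wants to reproduce these numbers, here is a rough sketch of the calculation. The Rmax and power figures below are not from this thread; they are what I recall from the June 2013 TOP500 entries, so treat them as assumptions and check top500.org before relying on them. TFLOP/s per MW and MFLOP/s per W are the same number, which is why the results line up with the Green500 units.

```python
# (Rmax in TFLOP/s, power in kW). Assumed values recalled from the
# June 2013 TOP500 list, not taken from the thread; verify before use.
SYSTEMS = {
    "Titan (K20x + Opteron 6274)": (17590.0, 8209.0),
    "Tianhe-2 (Xeon Phi 31S1P)": (33862.7, 17808.0),
}


def mflops_per_watt(rmax_tflops: float, power_kw: float) -> float:
    """Convert Rmax and power draw into MFLOP/s per watt."""
    rmax_mflops = rmax_tflops * 1e6  # TFLOP/s -> MFLOP/s
    power_w = power_kw * 1e3         # kW -> W
    return rmax_mflops / power_w


for name, (rmax, power) in SYSTEMS.items():
    print(f"{name}: {mflops_per_watt(rmax, power):.1f} MFLOP/s per W")
```

With those inputs the results come out near the 2142 and 1901 quoted above. This is whole-system efficiency under Linpack, so it says nothing directly about how much of the power budget goes to the accelerators.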

While certainly synthetic, I think it's safe to say a lot more care is put into making sure Spec represents performance in tasks common to installations on the supercomputer list than is the case with a desktop synthetic benchmark for desktop tasks.

Well, you do know that the top500 list is simply a run of optimized linpack, right?
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Luckily Green500 is simply a derivation of the top500 and hence we can easily calculate TFLOP/s per MW. Titan is 2142 with K20x and Opteron 6274. HPCC (#53 on top500) is 2243 with K20m and Xeon E5-2665. Todi (#108 on top500), another Cray XK7 install like Titan, is actually also 2243.4 with K20 and Opteron 6272. This is compared to Xeon Phi based systems at 1901 for Tianhe-2, 1146 for Stampede, 1849 for Conte, 1932 for Discover, 1264 for Endeavor, 1685 for MVS-10P, 1613 for Maia, 904 for an unnamed one at #158, 1000 for #249, 2455 for Beacon at #397...

Basically a long way of saying that the K20 based systems are far more consistent for whatever reason while the Xeon Phi ones vary a surprising amount. There's no evidence of Titan being held back by its 'old' infrastructure or its choice of CPU - the Xeon based K20 system is exactly the same efficiency as the smaller XK7 system.


Well, you do know that the top500 list is simply a run of optimized linpack, right?

No, thought they'd use SpecFP. Given it's just Linpack isn't it kind of silly to even be discussing the Top 500 list in terms of component performance?
 

LogOver

Member
May 29, 2011
198
0
0
The Green500 list has no K20 or K20X based systems with Intel CPUs (since there are none around ATM), and it still does not change the fact that, for a large-scale system, Tianhe-2 is still less efficient than an older upgraded system. Tianhe-2 is a brand new installation.

As I stated before, you cannot draw conclusions about KC efficiency from system efficiency. Tianhe-2 is not the most efficient system, but that says nothing about KC efficiency.

Moreover, the top system is only a relatively small-scale installation (number 397 in the Top500 list), with an AMD S10000 based system coming second (52 in the Top500), and the top of the list is mostly dominated by IBM systems with CPUs made on a 45nm node, with the next Xeon Phi based systems being dozens of places down. Maybe the updated list will change that, so it will be interesting to see it.

OK, let's look at the top500 data.
http://www.top500.org/statistics/efficiency-power-cores/
The AMD S10000 system has the best power efficiency... but its R(max)/R(peak) efficiency is really pathetic (48.5%). The best Nvidia system has 77% efficiency, and the best Xeon Phi system has 75.5%.

Also, the doom and gloom predictions from hardware enthusiasts on tech forums about Nvidia and AMD being finished are starting to get to silly levels now.

Intel has given them more competition, no doubt, but it's not like their competitors will all of a sudden give up. Anyway, 2014 and 2015 are going to be interesting, and I will leave it at that.

Let's see some statistics:
http://www.top500.org/statistics/list/
November 2012:
Nvidia - 50 systems
ATI - 3 systems
Intel - 7 systems

June 2013:
Nvidia - 39 systems
ATI - 3 systems
Intel - 11 systems

The trend can clearly be seen.
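To put rough numbers on that trend, here is a small sketch that computes the change between the two lists from the counts above. The dictionary layout and function name are just illustrative:

```python
# Accelerator system counts from the two lists quoted above:
# vendor -> (November 2012, June 2013)
counts = {
    "Nvidia": (50, 39),
    "ATI": (3, 3),
    "Intel": (7, 11),
}


def percent_change(before: int, after: int) -> float:
    """Relative change between two counts, in percent."""
    return (after - before) / before * 100.0


for vendor, (nov_2012, jun_2013) in counts.items():
    change = percent_change(nov_2012, jun_2013)
    print(f"{vendor}: {nov_2012} -> {jun_2013} systems ({change:+.0f}%)")
```

Keep in mind these are two data points with small absolute counts, so the percentages are only a rough indication.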
 

FwFred

Member
Sep 8, 2011
149
7
81
It seems to me that running in a CPU config rather than a coprocessor config could increase the efficiency... assuming Phi is more efficient than the host dual-socket Xeon. Removing the PCIe data transfer should certainly help. It will be interesting to see what the topologies look like with Knights Landing.
 

LogOver

Member
May 29, 2011
198
0
0
No, thought they'd use SpecFP. Given it's just Linpack isn't it kind of silly to even be discussing the Top 500 list in terms of component performance?

It would put all systems with accelerators (except maybe Xeon Phi based ones) out of the competition. Porting SpecFP to OpenCL/CUDA is not a trivial task (if possible at all).
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
It would put all systems with accelerators (except maybe Xeon Phi based ones) out of the competition. Porting SpecFP to OpenCL/CUDA is not a trivial task (if possible at all).

OK but, given the constraints, using the Top 500 list for any sort of argument about general capability seems pretty pointless.
 

Khato

Golden Member
Jul 15, 2001
1,248
321
136
OK but, given the constraints, using the Top 500 list for any sort of argument about general capability seems pretty pointless.

Eh, it gives a better comparison point between supercomputers than their theoretical peak performance, no? Throughput in linpack is anywhere from 50% to upwards of 95% of the theoretical peak, and actual programs likely won't see higher throughput than linpack. So it should be at least somewhat indicative of performance running actual workloads... but it's pretty much a given that each architecture will have its strengths and weaknesses.
 

meloz

Senior member
Jul 8, 2008
320
0
76
Looks like Nvidia will have a tough time with its Tesla business.

Depends on how Intel prices KL. If they demand a premium, most people will just continue to use Teslas. If performance/$ is competitive, Intel will surely take a chunk out of Nvidia's market share.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Didn't take long for Intel to take the HPC crown. And Phi will be on 14nm before Tesla is on 20nm, I guess.
 
Mar 10, 2006
11,715
2,012
126
Didn't take long for Intel to take the HPC crown. And Phi will be on 14nm before Tesla is on 20nm, I guess.

I expect Knights Landing to be a late 2014 part, probably November 2014, which would suggest just about the time frame of the "Maxwell" release on TSMC 20nm.

This will be fun to see play out.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Efficiency of KC does not seem that brilliant though:

http://www.anandtech.com/show/7075/june-2013-top500-list-published-xeon-phi-takes-top-spot

Titan, using Nvidia cards and AMD Bulldozer CPUs made on less advanced process technology, still seems to have greater efficiency, and it was an upgrade of an earlier system (not built from scratch). The same goes for the IBM-powered systems using chips made on even older process technology.

KC is way easier to program for.
 