(snip)
These are the instructions for a simple printf program. I'm sure you don't need this lesson, as you seem well informed. Now extrapolate this to a benchmark application.
Analysis of an execution flow:
https://en.wikipedia.org/wiki/Cycles_per_instruction
I like what you write.
I have a good concept of how things move through the pipeline and often count certain instructions in my code to get an idea of how efficient the code is.
The above examples, while great as abstract examples for learning, do not represent real-world code. No one would ever measure a program like hello world; it basically does some setup and then makes a system call.
The other example is loaded with dependencies (some designed to confuse), is in order, and is simplified. I do not believe any modern processor could resolve a dependency in 1 or 2 cycles. I would think it would be more on the order of 7-14.
So here is what I do. In my current project I'm looking at better ways of transposing a matrix of bits for faster output to the GPIO (aka bit banging). It will be open sourced, so I don't mind posting some snippets here.
This is a major routine. It takes a matrix and interleaves the bytes from the middle over and over; the result is a transposed matrix. There are more direct ways of doing this, but this method is very friendly to a pipelined architecture.
Code:
void InterleaveBytes(__m128i *in, __m128i *out, __m128i *scratch,
                     long count, unsigned long passes, unsigned long offset)
{
    __m128i *to, *nextTo;
    __m128i *from = in;

    // Pick starting buffers so the final pass lands in 'out',
    // ping-ponging between 'out' and 'scratch'.
    if (passes-- & 1) { // odd number of passes
        to = scratch;
        nextTo = out;
    } else {
        to = out;
        nextTo = scratch;
    }

    // First pass: interleave bytes from the two halves of the matrix.
    __m128i *end = &to[count];
    do {
        *to++ = _mm_unpacklo_epi8(from[0], from[offset]);
        *to++ = _mm_unpackhi_epi8(from[0], from[offset]);
        from++;
    } while (to < end);
    from = &to[-count];
    to = nextTo;

    // Remaining passes: keep interleaving, swapping source and
    // destination buffers each time.
    do {
        end = &to[count];
        do {
            *to++ = _mm_unpacklo_epi8(from[0], from[offset]);
            *to++ = _mm_unpackhi_epi8(from[0], from[offset]);
            from++;
        } while (to < end);
        end = &from[-(count >> 1)];
        from = &to[-count];
        to = end;
    } while (--passes > 0);
}
So typically this routine would loop 4-5 times on a matrix size of 128 bytes. For loops like this, most of the instructions are superfluous; as long as you feed the major instructions at a decent rate, they will be the determining factor.
On a Haswell at 4 GHz:
Code:
/* 40 + 128 = 168 non-loop operations
   10M passes comes in at 0.48 sec, so 10M / 0.48 = ~21M passes/sec
   4G / 168 = ~24M passes/sec (theoretical ceiling)
   => ~87% efficiency
*/
This first routine accounts for the 40. There is a more complex routine that shifts out the bits; that accounts for the 128.
I now have a metric of how certain instructions are moving through the system. It is ordered, repeatable, and logical.
The same can be said for almost any routine in the abstract. Yes, there could be better terms for what is going on, but IPC is a good term that people understand.
In the pure sense, you are correct.
However, I believe there is room for many contexts for the term IPC.
Edit: The routine for AVX is a bit more complicated because of the 128-bit lanes. Here are the results for it, though. It almost doubles the throughput at a lower efficiency; it has lower IPC. In this case your argument against IPC shows validity.
Code:
/* 24 + 64 = 88 non-loop operations (sans byte bit swap)
   10M passes comes in at 0.3 sec, so 10M / 0.3 = ~33M passes/sec
   4G / 88 = ~45M passes/sec (theoretical ceiling)
   => ~73% efficiency
*/