how to rewrite celeron 2 l2 latency setting...

potatoesann · Nov 4, 2000

I have tried to use H.Oda's WCPUL2 program to rewrite the L2 latency setting for my Celeron 2 566, but when I test any setting other than the default of 2, it tells me it failed. I don't know what the problem is. Does anyone know? And does anyone know of another program that can do this?

potatoesann
celeron566 @ 953mhz 1.9v

Eug · Nov 4, 2000

You can't change it, period. The reported latency of 2 is wrong anyway. (According to the experts, it is essentially impossible to have a latency of 0, but WCPUID reports 0 for the PIII Coppermines.)

DaddyG · Nov 4, 2000

Yep, Eug has it correct. The actual L2 latency of Coppermines and C2s is 7. I believe this is made up of 3 cycles for the L1 miss plus 4 cycles for the L2.

Goi · Nov 5, 2000

Daddy, I don't think you should add up the cycles. The L2 latency is independent of the L1 latency. If you add them up, you're essentially calculating the total latency for a memory access assuming there's an L1 miss and an L2 hit. This isn't the L2 latency, since the L2 latency is simply the number of cycles needed to access the first critical piece of data from the L2 cache.

As for H.Oda's software for finding out L2 latency, it just doesn't work - period.

potatoesann · Nov 5, 2000

so there is really no way to change it???

potatoesann

BurntKooshie · Nov 5, 2000

potatoesann: Correct. You're out of luck.

Goi: << The L2 latency is independent of the L1 latency. If you add them up, you're essentially calculating the total latency for a memory access assuming there's an L1 miss and an L2 hit >>

EXACTLY! Which is why DaddyG is right....and why you are right too.

In most processors that people are familiar with (the Itanium has some exceptions, and some Alpha chips have some exceptions too), the L2 doesn't get accessed until an L1 miss...so the latency seen by the processor, assuming an L2 hit, is the L1 + L2. That's the problem: do you report the latency as seen by the processor, or the time it takes exclusively for the L2 to send information? I've heard of some people who have a great dilemma when trying to report the latency. The real solution is to explain it both ways, I guess.
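To make the two reporting conventions concrete, here is a minimal Python sketch. The cycle counts are the ones DaddyG quoted above for Coppermine and Celeron II (3 for the L1 miss, 4 for the L2 itself); they are taken from this thread rather than from a datasheet, and the function name `l2_latency_report` is just made up for illustration.

```python
# Cycle counts quoted earlier in this thread for Coppermine / Celeron II.
# They come from a forum post, not a datasheet, so treat them as assumptions.
L1_MISS_CYCLES = 3   # cycles before the core knows the L1 does not hold the data
L2_ONLY_CYCLES = 4   # additional cycles for the L2 itself to deliver the data


def l2_latency_report(l1_miss_cycles, l2_only_cycles):
    """Return both ways of quoting an L2 latency, in cycles."""
    return {
        # Latency of the L2 on its own, ignoring the L1 lookup.
        "l2_only": l2_only_cycles,
        # Latency as seen by the processor on an L1 miss followed by an L2 hit.
        "load_to_use": l1_miss_cycles + l2_only_cycles,
    }


report = l2_latency_report(L1_MISS_CYCLES, L2_ONLY_CYCLES)
print(report)  # {'l2_only': 4, 'load_to_use': 7}
```

Quoting both numbers side by side avoids the ambiguity: "4" and "7" can then describe the same cache without contradicting each other.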

DaddyG · Nov 5, 2000

Goi,

I don't see any way to calculate L2 latency other than adding the L1 miss. If the L1 hits, there is no L2 access! After all, latency is the total time required before data is present.

BurntKooshie · Nov 5, 2000

DaddyG - actually, that's not always the case, such as with the Itanium. Floating point data completely bypasses the L1 and goes directly to the L2 instead. So for the Itanium, the latency is ONLY the latency of the L2, WITHOUT the addition of the L1 "miss", because the "miss" doesn't exist....what's even weirder is the way the Itanium works.

I quote from page 5 of my article

<< One important thing to note about the cache structure of the Itanium is that the FPU completely bypasses the L1 Dcache, and instead goes straight to the L2 cache. The L2 cache is unified and on-die, 92k, 6-way associative, and carries with it a latency of 6 cycles (that's including the 2 cycle L1 miss). That means it's merely 4 cycles of latency for the L2 cache. To boot, the width of the pipe from the L2 to the L1 is 256 bits wide, just like the "advanced transfer cache" on the Coppermine, but without the stipulation that it can only send data every other cycle. But wait, there's more to it than that! The L2 cache for FPU data is 9 cycles, not 6, even though it has bypassed the L1 cache! >>

That's one of the places where you can't just say the L2 is X many cycles. In different architectures it can mean different things, which is why it's often best to specify both the time it takes for the L2 to send data and the time it takes for an L1 miss, assuming one happens.

Renob · Nov 5, 2000

DaddyG was not talking about the Itanium; he was talking about the Coppermines.

Goi · Nov 10, 2000

The L2 latency I'm talking about is a strict specification of the L2 cache itself, independent of the L1 cache. It assumes that only the L2 cache is accessed. The point I'm trying to make is that if you add up the memory latencies, then essentially your L2 cache latency is equal to your total cache latency. Personally I think it's more convenient to separate them, but that's just me. Of course these are technicalities, and I understand what you guys (DaddyG and BurntKooshie) are trying to say, and I acknowledge their validity.

DaddyG · Nov 10, 2000

Goi,

I agree that both arguments have valid points. What I don't know for sure, though (not speaking about Itanium here), is whether the L1 miss time is actually required. By this I mean that in many architectures the request is sent to BOTH caches at the same time. If the L1 hits, the L2 request is ignored. The time taken for the L1 miss is not wasted; it's part of the ACTUAL L2 latency. Of course that's just my opinion, I could be wrong.

pm · Nov 10, 2000

Well, I know the Itanium architecture much better than the Pentium III Coppermine architecture, but most architectures that I've seen access both the L1 and the L2 simultaneously when requesting data. If the L1 has it, then the request is ignored by the L2.

DaddyG · Nov 10, 2000

Thanks, pm. I may be old but not senile.

Goi · Nov 11, 2000

Hmm, pm, interesting. I wasn't aware of that. Now that I think of it, that makes more sense, since you're not wasting time waiting for an L1 miss in order to access the L2, but rather just assume a worst case scenario and access the L2 too. Anyway, in that case, wouldn't it be wrong to add up the individual L1 and L2 cache "latencies" in order to derive an L2 cache latency? It wouldn't be the sum of the two, since the L2 is accessed in parallel with the L1 cache.
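A rough way to see the difference is to model the two lookup schemes side by side. This is only a sketch: the 3, 4 and 7 cycle figures are the ones quoted in this thread, the function names are invented for the example, and real pipelines are more complicated than a sum or a max.

```python
def l2_hit_latency_serial(l1_lookup, l2_access):
    """L2 is started only after the L1 has reported a miss, so the times add."""
    return l1_lookup + l2_access


def l2_hit_latency_parallel(l1_lookup, l2_access):
    """L1 and L2 are probed together, so the L1 lookup overlaps the L2 access."""
    return max(l1_lookup, l2_access)


L1_LOOKUP = 3          # cycles for the L1 to answer (thread figure, an assumption)
L2_AFTER_L1_MISS = 4   # extra cycles for the L2 once the L1 has missed
L2_FROM_REQUEST = 7    # cycles for the L2 counted from the original request

# Serial view: 3 cycles to miss in the L1, then 4 more for the L2.
serial_total = l2_hit_latency_serial(L1_LOOKUP, L2_AFTER_L1_MISS)

# Parallel view: the L2 takes 7 cycles from the request; the L1's 3 are hidden inside.
parallel_total = l2_hit_latency_parallel(L1_LOOKUP, L2_FROM_REQUEST)

# Adding the L1 time on top of a from-request L2 figure would double count it.
naive_sum = L1_LOOKUP + L2_FROM_REQUEST

print(serial_total, parallel_total, naive_sum)  # 7 7 10
```

Both views give the same 7 cycles as seen by the processor. What changes is how the number is split up, and adding the L1 figure only makes sense when the L2 figure excludes it.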

potatoesann · Nov 11, 2000

Intel has stated that the Coppermine and Celeron II both have the same latency setting for the L2 cache, so I conclude that the crippled performance of the Celeron 2 is not because of the latency setting, but because of the cache size and the 4-way mapping...

potatoesann

DaddyG · Nov 12, 2000

Goi, what you say is correct. The 3 cycles that are required for L1 are not wasted in the L2 latency. It just takes 4 cycles longer to get to L2. Sorry that I started the confusion.
