HUGE 3dfx interview w/Gary Tarolli

Mikewarrior2 · Nov 1, 2000

It's obvious that DaveB3D's posts have changed in both content and tact from a few months ago to now.

I find it difficult to believe anything from any company person is unbiased. This merely reinforces that belief. DaveB3D was once a great poster, but now you can clearly see bias (and a lack of tact in several of his posts).

Saying that the 3dfx white paper is great because you helped write it is one of the most arrogant things I have ever read on any BBS.

Mike

DaveB3D · Nov 1, 2000

Doom. Do you really think I'm going to sit here and just make this stuff up? I mean come on. As for Carmack. I dunno. That is his business. But when you put NVIDIA's T&L engine on the scale of things, it does basically suck. Its functionality is very limited and it has horrible performance, especially with lighting.

DaveB3D · Nov 1, 2000

As I said, that post came out wrong. What I was trying to say is that it would have the information he needed, and that because I wrote it, I knew it had the information he needed.

IBMer · Nov 1, 2000

Actually if you have read the Whitepapers you would see what he is talking about. It describes both types of FSAA used.

The reason you would use FSAA at 1600x1200 is to get rid of pixel popping, stair casing and texture swimming that still occurs at that resolution.

I feel that a lot of people reject FSAA because they know that 3DFX does it better and won't admit it. So instead of saying Nvidia's FSAA has lower quality they just say it doesn't matter because I can play at 1600x1200....

Take a look at these two pictures.

One is with FSAA, one without.

FSAA

No FSAA

Now again, which one looks better?

DaveB3D · Nov 1, 2000

One of the other things I was getting at, and I'll agree I lacked tact on it, was free AA. Recall, before we bought out Gigapixel, one of their big things was free anti-aliasing.

Soccerman · Nov 1, 2000

wow, a lot of posts since I last responded (2 pages back!).

anyway..

Daveb3d, I have to say that some of your posts (regarding the variations in framerate) are kind of confusing..

I understand though why someone who is a hardcore player would want a V5 (god forbid) instead of a GTS, simply because its framerate doesn't change as much.

the question is, WHY does the GTS framerate change more? does it change the same % wise as the V5? if so, it's not a GTS design problem is it? (I have to think about this for a minute).

that's only addressing the Fillrate part, I'll get into T&L later on.

btw, I did read the FSAA whitepaper, and I have to admit, it was well done. I personally think 3dfx's solution is superior.

DaveB3D · Nov 1, 2000

Ok, let me try and explain it a bit clearer.

Say we peak in frame-rate at 90 fps.

They peak at 150 fps.

We have more fill-rate sitting idle because we are CPU limited. However, their T&L engine is capping out their fill-rate. So we have fill-rate waiting and they don't.

Now along come a few rocket shots, a couple of people die. Suddenly your depth complexity has doubled and so your fill-rate demand has doubled (this is a very real situation mind you). Now they have capped out their fill-rate at 150 fps. So they drop down to 75 fps. A 50% drop. However, we have fill-rate just waiting there, so when this same situation comes we drop down to say 70 fps (giving them the benefit here of being slightly more effective in their use). So they are taking this HUGE frame-rate hit, while in comparison we are taking a small one.

Does that make sense?
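
The numbers in this example can be reproduced with a very small bottleneck model. The sketch below is only an illustration: it assumes the frame rate is simply the lower of a CPU-side cap and a fill-rate-side cap, and that doubling depth complexity halves the fill-limited rate. The function name, the board labels, and the 140 fps fill limit for the board without hardware T&L are assumptions picked to match the 70 fps figure above, not measured values.

```python
import math


def frame_rate(cpu_fps_cap, fill_fps_cap, depth_complexity=1.0):
    """Return the fps of a board limited by the slower of CPU and fill rate.

    fill_fps_cap is the fill-limited frame rate at depth_complexity == 1.0;
    doubling the depth complexity halves it.
    """
    return min(cpu_fps_cap, fill_fps_cap / depth_complexity)


# Hypothetical boards: (CPU-side cap, fill-side cap) in fps.
# The 140 fps fill cap is an assumption chosen to reproduce the post's 70 fps.
boards = {
    "no hardware T&L (CPU capped)": (90.0, 140.0),
    "hardware T&L (fill capped)": (math.inf, 150.0),
}

for name, (cpu_cap, fill_cap) in boards.items():
    light = frame_rate(cpu_cap, fill_cap, 1.0)   # quiet scene
    heavy = frame_rate(cpu_cap, fill_cap, 2.0)   # depth complexity doubled
    drop = 100.0 * (light - heavy) / light
    print(f"{name}: {light:.0f} -> {heavy:.0f} fps ({drop:.0f}% drop)")
```

Under these assumptions the CPU-capped board loses about a fifth of its frame rate when the scene gets heavy, while the fill-capped board loses half, which is the gap the post is describing.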

Soccerman · Nov 1, 2000

"Recall, before we bought out Gigapixel one of their big things was free anti-aliasing."

hint hint..

to those who think FSAA is going away.. it's NOT.

the technology 3dfx acquired from Gigapixel is far in advance of the NV20's simple HSR..

DON'T believe the hype about 4x the fillrate because of HSR. it simply isn't true (just look at the Radeon, that's one technique it does to reduce bandwidth).

DaveB3D · Nov 1, 2000

That is SO true. I've actually been doing a lot of research on this subject and will be doing so for some time to come. ATI has claimed full HSR, so you can count on NVIDIA doing it as well. HSR isn't the only advantage of a tile based architecture though. There are plenty.

Soccerman · Nov 1, 2000

ahh I see... so in other words, because their T&L uses some of the bandwidth, when a scene becomes more complex, you get two things that require higher bandwidth, and can't get it, thus giving you that problem..

AlabamaMan · Nov 1, 2000

what hype are we supposed not to believe in? nvidia hasn't announced anything yet, but everyone "knows" that all they got is a simple hsr which is not as effective as they claim... but wait! they don't claim anything. they haven't announced anything, for christssake. hype? gimme a f**king break.

AlabamaMan · Nov 1, 2000

how is the t&l engine capping their fillrate?

ICyourNipple · Nov 1, 2000

Dave, so 3dfx is not intimidated at all by NV20 and the influence of microsoft behind it, etc.? you say that NV20 is going to be very expensive, but if it is a single chip solution and rampage may be using 2 chips plus 1-2 sage chips (totaling 4 chips) how can this possibly be cheaper? i know you can't go into too much detail but it just doesn't seem very efficient to me.

Soccerman · Nov 1, 2000

T&L caps your fillrate by using the bandwidth that your rasterizer would normally use when it needs it.

ie, if you disabled T&L when you are running a game extremely dependent on the fillrate, you would get an INCREASE in performance, because the T&L data isn't using the memory bandwidth as much (from what I understand anyway). the rasterizer then has the ability to use MORE of its fillrate potential, because it can use more bandwidth.

EDIT: I wouldn't say 3dfx isn't intimidated, it's just that the performance we actually see will not be what nVidia PR claims.

say the NV20 runs with these specs:

-300mhz
-4 pipelines, each with 2 (or 3) texture units
-HSR

those specs would give us about 1200 Million pixels/second fillrate, with either 2400, or 3600 million texels/second.
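
The arithmetic behind those figures is just clock times pipelines times texture units. A quick sketch, using the speculative specs listed above (the function name is made up for illustration, and none of these numbers are announced NV20 specs):

```python
def fill_rates(clock_mhz, pipelines, texture_units_per_pipe):
    """Return theoretical (Mpixels/s, Mtexels/s) for a simple pipeline model.

    Assumes every pipeline writes one pixel per clock and every texture
    unit samples one texel per clock, which is a peak figure only.
    """
    mpixels = clock_mhz * pipelines
    mtexels = mpixels * texture_units_per_pipe
    return mpixels, mtexels


# Speculative specs from the post: 300 MHz, 4 pipelines, 2 or 3 texture units.
for tmus in (2, 3):
    mpixels, mtexels = fill_rates(300, 4, tmus)
    print(f"{tmus} texture units: {mpixels} Mpixels/s, {mtexels} Mtexels/s")
```

These are peak numbers; memory bandwidth usually keeps real throughput below them.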

what HSR does, is check the transformed data that was sent to it (from the CPU or T&L unit, it doesn't matter) to see if any polygons have zbuffer values greater than others with overlapping (x,y) coordinates.

when polygons overlap, it removes the ones that are covered from being filled in with textures etc.

theoretically, if you had an average overdraw of 4x (overdraw is the amount of polygons drawn, but not shown) you would see an increase in speed, because the card only has to fill 1/4th of the polygons.

however, you do not get 4x the speed you would normally get. don't ask me why though, because I really don't know.. daveb3d told me this a while back, and I believe him now more than ever, because I learned from Anand's review that even though Radeons have HSR, they don't run 4x the speed they normally would..
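
One possible reason (an assumption offered here, not something stated in the thread) is that filling pixels is only part of the time spent on a frame. If only the fill portion shrinks when overdraw is removed, the overall gain is capped in the usual Amdahl's-law way. A minimal sketch, with a made-up function name and a made-up 60% fill share:

```python
def hsr_speedup(fill_share, overdraw):
    """Overall speedup if perfect HSR divides only the fill work by `overdraw`.

    fill_share is the fraction of frame time spent filling pixels before HSR
    (an assumed input, between 0.0 and 1.0); the rest of the frame time
    (CPU, geometry, setup) is treated as unaffected.
    """
    remaining_time = (1.0 - fill_share) + fill_share / overdraw
    return 1.0 / remaining_time


# 4x overdraw removed perfectly, for a few assumed fill shares.
for share in (1.0, 0.8, 0.6):
    print(f"fill share {share:.0%}: {hsr_speedup(share, 4.0):.2f}x faster")
```

Only the case where the frame is 100% fill-bound reaches the full 4x; at a 60% fill share the same perfect HSR gives well under 2x. Real hardware also does not remove all overdraw, so this is an upper bound under these assumptions.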

DaveB3D · Nov 1, 2000

No, no... well sorta no. let me try and make it clearer.

Because we are CPU limited, we get on average 90 fps say. Now because they don't have this limit, they get 150. So they are *using* all their fill-rate, and we are not *using* all of ours. So that gives us extra fill-rate to use while waiting.

Now the only case where it really is going to use more local bandwidth is when you are storing vertex data in local memory. They don't do this very often though, however they do allow it. So that isn't likely to happen.

As for the hype. Note that it is in Abrash's article on X-box. It already exists today. And Abrash isn't one you'd expect it from, but there is some there. The point though is that when NVIDIA does release NV20, they are going to hype HSR and claim probably full overdraw removal. And you know what, I'm betting they'll have a tech demo to show it and I bet I've got it mostly figured out how they'll do it. And if they do it, it will be NOTHING like a game situation.

My point about NV20 being so expensive is that they don't have a mid or low cost DX8 product. I'm really not in a position to talk about Rampage (whaz that? ) or what 3dfx as a company thinks of NV20. I can talk about NV20 itself and what I think it will be and stuff like that, but that is all.

DaveB3D · Nov 1, 2000

I don't think I was still entirely clear. Basically, when more fill-rate intense stuff comes into the picture, they'll slow down before we will. so they might lose 20 fps but we'll still be at 90 fps because we have extra fill-rate that isn't being used.

BFG10K · Nov 1, 2000

DaveB3D:

"This is wrong. T&L has nothing to do with a fill-rate limited situation. Actually, in dealing with dynamic geometry, if NVIDIA stores vertex data in local memory, that eats bandwidth and so your effective fill-rate actually will go down."

I disagree. Look at these benchmarks at Tom's Hardware.

In the Quake 3 Arena HQ benches the V5 is scoring about the same as the original GF1. In MDK2 (HQ, 32 bit) however, the GF1 is outperforming the V5 by a significant margin because MDK2 uses more T&L code.

Since the benchmarks themselves are testing fillrate limited situations (ie high resolution and high quality) we can see that T&L is giving the GF1 the edge in these situations as well.

"Again, this goes back to having a more constant frame-rate. In Q3 this is especially important. You are better off having a board that stays closer to 90 fps all the time than one that peaks at 150 fps, but drops down to much lower numbers."

I fail to see how having T&L will cause minimum framerates to drop below that of a non-T&L card (ie a Voodoo). It sounds almost as if you are saying that T&L is useless and that video cards are better off without it.

I'm sorry but Carmack would completely disagree with you on that issue, as do I.

"As for future games, I think I've already commented on games like Sacrifice. But for real future games, the GTS T&L engine doesn't supply the features or performance needed for the future."

Once again Carmack would disagree with you. The fact he has chosen the original GF1 as his development board speaks loudly and clearly for me. It seems he thinks nVidia's previous generation board has enough grunt to run Doom 3 because of T&L and register combiners.

"This is completely untrue. Carmack has never said such a thing. He actually has said the V5 has enough fill-rate so it will be able to play it."

Don't the GF1/GF2 MX have less fillrate than a V5? Yet Carmack never mentioned the GF1 or GF2 MX would have any trouble running Doom 3. In fact as I said before, the GF1 is his development board.

Not only that but it's amusing to see 3dfx's newest boards grouped together with other boards that are two generations old. The same old "you don't need this" from 3dfx has returned again.

"But frankly, no board out right now is going to be able to offer the experience in the new Doom that upcoming boards are going to. Current boards don't have the performance or the features."

True, but the GeForces and the Radeon are much closer to being able to do so than the Voodoos. And T&L plays a major part in achieving this.

"As for the new Unreal engine, I don't know what you are basing it on. However, I'm very confident in saying you are wrong."

You don't know what I am talking about yet you assume I am wrong? If you read the release paper detailing the engine you will see a list of video cards which are required to run the game properly. The Voodoos are nowhere to be seen on that list.

If you think a software T&L rendering engine is going to be as fast as a hardware one I really question your neutrality among 3D cards. The GF1's T&L unit can outperform a 1.2 GHz Thunderbird in T&L calculations.

"No offense by this at all, but this statement here proves that you really don't understand the technology at all."

Really? So explain to me why Carmack puts the GeForces at the top while the Voodoos come in third? The V5 has a higher fillrate than a GF1 doesn't it? Yet Carmack isn't even speculating whether the GF1 will be able to play Doom 3.

"And going with the idea of v-sync. If you do that, your T&L engine is a waste because you aren't using the additional performance it brings. You might as well use the CPU and get the exact same thing. Get my point?"

I disagree with your point. I wasn't aware that all T&L can do is raise maximum framerates. According to Anandtech's benchmarks the minimum framerates are raised as well. I'll do some benchmarks with VSYNC turned on when I have time.

"That said, T&L is doing you absolutely NOTHING."

I don't believe that T&L can only affect peak framerates. That is just plain false. You constantly speak of CPU limited situations, yet it is in these very situations that T&L helps out the most by offloading calculations from the CPU.

"I was just making a point. However, if you are running with v-sync on, you don't need a frame-rate higher than your refresh rate. So, again, T&L is pointless in its current form."

I thoroughly disagree with this, as do the majority of other reputable technical websites, John Carmack and Tim Sweeney. Pardon me but I'll take their word over yours.

Soccerman · Nov 1, 2000

bfg, read in your first quote this part again, until you understand it..

"if NVIDIA stores vertex data in local memory"

that means, storing T&L data in the local memory, requiring it to be using the local bandwidth a lot.. thereby reducing bandwidth for other applications.

this doesn't have to do ONLY with hardware T&L though..

IBMer · Nov 1, 2000

Dave I might be able to explain what you mean just a little bit better.

What he is trying to say is this.

In a CPU limited instance you have a card with T&L and one without.

The T&L card is going to achieve a higher framerate because the video card isn't waiting for the transform data from the CPU. In this situation it can use all of its fillrate to render the frames.

On the other hand a card that doesn't have a T&L engine would have to wait for the CPU to complete the Transforms, therefore not allowing it to use its fillrate to its full potential.

In a Fillrate limited instance, this doesn't matter because neither card is waiting for the CPU, so neither needs the help from the T&L engine. A card can't render the frames any faster because there is no fillrate left to fill them.

So if one card got 150fps in a CPU limited situation because of its T&L engine then in a Fillrate limited situation it wouldn't score as high because the T&L engine can't help out.

ICyourNipple · Nov 1, 2000

geez, back and forth i think i'm getting dizzy!! it's so hard to know what to believe. does anyone know when NV20 is going to be announced so we can end the speculation? it can't be that much farther off, can it?

Soccerman · Nov 1, 2000

as for the situation where the 3dfx card doesn't fluctuate as much, I don't know.. if IBMer is right. good. but if not, then it might be something I'll never know. I don't mind..

"does anyone know when NV20 is going to be announced so we can end the speculation?"

well the fillrate speculation will end. this argument won't end, because it has to do with the design of their chips..

as for the games requiring a better video card, well we will see.. I don't know who to believe there.

DaveB3D · Nov 1, 2000

"I disagree. Look at these benchmarks at Tom's Hardware."

"In the Quake 3 Arena HQ benches the V5 is scoring about the same as the original GF1. In MDK2 (HQ, 32 bit) however, the GF1 is outperforming the V5 by a significant margin because MDK2 uses more T&L code."

"Since the benchmarks themselves are testing fillrate limited situations (ie high resolution and high quality) we can see that T&L is giving the GF1 the edge in these situations as well."

Look. Disagree all you want. But you aren't disagreeing with me, you are disagreeing with laws and rules. Simple facts you are arguing with. This isn't an opinion thing. I'm sure there is an explanation for the benchmarks, but I haven't looked at them. So if you want to argue this, you are also arguing with every graphics engineer in the world.

"I fail to see how having T&L will cause minimum framerates to drop below that of a non-T&L card (ie a Voodoo). It sounds almost as if you are saying that T&L is useless and that video cards are better off without it."

"I'm sorry but Carmack would completely disagree with you on that issue, as do I."

Try reading a few of my real recent posts. That should clear it up.

And I'm not saying T&L is useless at all. It is the future, without question. I'm just saying that in comparison to the future, what we have now pretty well sucks and isn't all that useful.

"Once again Carmack would disagree with you. The fact he has chosen the original GF1 as his development board speaks loudly and clearly for me. It seems he thinks nVidia's previous generation board has enough grunt to run Doom 3 because of T&L and register combiners."

Nothing against Carmack, as I like the guy, but you hold him up a bit too much as a god.

And note, he said he had to pull tricks to get it to work right. That said, you aren't going to get the full experience with the current products. If you read ALL his text, you'll see the V5 will work as well.

sigh.. you continually quote Carmack. I've already addressed that stuff too, so I'll move on.

"True, but the GeForces and the Radeon are much closer to being able to do so than the Voodoos. And T&L plays a major part in achieving this."

No, T&L does not. Rather, his main thing is that he is pulling some tricks with the register combiners.

"Don't the GF1/GF2 MX have less fillrate than a V5? Yet Carmack never mentioned the GF1 or GF2 MX would have any trouble running Doom 3. In fact as I said before, the GF1 is his development board."

"Not only that but it's amusing to see 3dfx's newest boards grouped together with other boards that are two generations old. The same old "you don't need this" from 3dfx has returned again."

So what is your point if it has less fill-rate? He has to have a baseline and say "yes it will run on this." But the thing is, just because it runs doesn't mean you want to play it like that. Are you going to want to play it at low resolution and low detail, with just an acceptable frame-rate? I dunno about you, but not me. If you want to do that, a GF1 or MX will do you dandy.

"You don't know what I am talking about yet you assume I am wrong? If you read the release paper detailing the engine you will see a list of video cards which are required to run the game properly. The Voodoos are nowhere to be seen on that list."

"If you think a software T&L rendering engine is going to be as fast as a hardware one I really question your neutrality among 3D cards. The GF1's T&L unit can outperform a 1.2 GHz Thunderbird in T&L calculations."

Don't put words in my mouth. I never said I don't know what you are talking about. First, to address the part on the Thunderbird and the GF1 T&L unit. This is probably true in CAD benchmarks. As I've said before, it has what is effectively a CAD T&L engine. I expect this. However, in games this isn't the story.

As for the Voodoo being on the list, I'm not sure about the situation. However, I'm fairly confident in saying that it will work. But even if it doesn't, the key point is that even the GF/GTS boards won't be giving that great experience that the upcoming boards will.

"I don't believe that T&L can only affect peak framerates. That is just plain false. You constantly speak of CPU limited situations, yet it is in these very situations that T&L helps out the most by offloading calculations from the CPU."

In future games and future hardware, it will do more. However in current hardware that is all it is really doing with games.

"I thoroughly disagree with this, as do the majority of other reputable technical websites, John Carmack and Tim Sweeney. Pardon me but I'll take their word over yours."

Disagree with me all you want. But provide evidence of it. That is what I'm doing. I'm 1) providing real world situations and showing what happens and 2) explaining the issues at hand.

As for Carmack and Sweeney. Well I don't really know Tim, but I'm sure he is a nice guy and from the emails I've exchanged with John I'm sure he is too. But they are software people, not hardware people. They aren't gods either, as you make them out to be. And as for tech websites (which I generally call review sites because they aren't very technical), quote them all you want. But I will put anything I say and have said against what they say. I ran a tech site for 2 years and now I work in the industry. I think I know the ropes just a little. Again, read the two articles on the GF T&L that I pointed to earlier.

Dufusyte · Nov 1, 2000

I think this is what Dave means (correct me if I'm wrong):

In a heavy scene that is fillrate intensive, the V5 and GTS will be about similar, maybe with a slight edge to the nVidia card, say 70 vs 75 fps. This is in cases that are fillrate intensive, and the fillrate is the bottleneck (restraining factor) on both cards.

In a less heavy scene, which is not fillrate intensive, then the V5 can go as fast as the cpu provides the transform data, which becomes the limiting factor for the V5, and the V5 will have fillrate to spare. In these cases, the cpu might pump out enough transform data for 90fps, and the V5 will draw the 90 fps, with fillrate still to spare.

Meanwhile, in such a "less heavy scene", the GTS will not be limited by the cpu, because it has the spiffy T&L engine which will do the transform operations, and this will allow it to achieve, say, 150fps. In fact, the limiting factor might be the fillrate, since the T&L engine can pump out more frames than the fillrate is able to fill. So the T&L engine allows the nVidia card to max out its fillrate at 150fps in a simple scene, and it then drops to 75fps in a heavier (more fillrate intensive) scene.

The end result is that the V5 will fluctuate between 70-90fps (depending on the scene) and the GTS will fluctuate between 75-150fps (depending on the scene). Supposedly, the more consistent framerate on the V5 will be a boon to gamers.

Folks, I'm a V3 owner and a 3dfx fan in general, next upgrade for me will be Fear since the V3 is running everything fine for me for the next year or so. Nevertheless, as a gamer I do not mind if my framerate spikes upwards during simple scenes. A spike upwards is never a problem for me (but a spike downward certainly is). Then again, I play mainly UT, and not Q3A. Now, it is known that Carmack's engine and Carmack's physics *vary* depending on your framerate, for example, the height/distance that you can jump in Q3A (or Q2, or Q1, for that matter) varies depending on your framerate. That being the case, if you play a Carmack engine game, then yes, I suppose having a consistent framerate would be a boon, since it would give you consistent (dependable) gameplay. In the UT engine, however, physics is not dependent on framerate, so I do not mind upward spikes.
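
To show how physics can end up depending on framerate at all, here is a generic sketch. It is not id Software's code, and the reasons jump height varies in the Quake engines may differ in detail. It assumes a jump integrated once per rendered frame with a plain explicit Euler step, and the jump velocity and gravity values are illustrative defaults chosen for the example.

```python
def apex_height(fps, jump_velocity=270.0, gravity=800.0):
    """Peak height of a jump when physics is stepped once per rendered frame.

    Uses explicit Euler: position is advanced with the current velocity,
    then velocity is reduced by gravity. The step size is 1/fps, so the
    result changes with the frame rate.
    """
    dt = 1.0 / fps
    height = 0.0
    velocity = jump_velocity
    # While velocity is positive every step moves the player upward,
    # so the height when the loop ends is the apex.
    while velocity > 0.0:
        height += velocity * dt
        velocity -= gravity * dt
    return height


exact = 270.0 ** 2 / (2.0 * 800.0)  # closed-form apex for the same values
for fps in (30, 60, 125):
    print(f"{fps:3d} fps: apex {apex_height(fps):.2f} (exact {exact:.2f})")
```

With this integrator the apex drifts further from the closed-form value as the frame time grows, so two players at different framerates would not jump to quite the same height. An engine that steps physics on a fixed timestep independent of rendering would not show this effect.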

Of course, if someone pops in a 2ghz cpu, then the V5 is no longer limited by the cpu; the V5 can produce, say 145 fps in a simple scene, maxing out its fillrate, and you would thus get wild variations in framerate on the V5. The moral of the story is to not buy a 2ghz cpu if you have a V5 and play Q3A.

pg22 · Nov 1, 2000

Genesis does what NintenDon't.

Oh sh*t.......where'd my elementary school go?

Dulanic · Nov 1, 2000

I guess you don't understand the wording "In my opinion".
