nVidia NV20 info...my gawd!! ;)


BFG10K

Lifer
Aug 14, 2000
22,709
2,995
126
There is another thread with the NV20 specs here.

It doesn't mention anything about tiling, but it talks about a 256K on-chip cache which will speed things up. Also, the price is between $350 and $400, which is excellent considering the GF2 Ultra is $500.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,358
8,447
126
i'm guessing that the gf2's 30 million transistors already include 256k of cache. no way that is all logic.
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
thanks for responding jpprod.

ok, I'm not trying to tell you that you are wrong, DaveB3D, I'm just trying to figure out where you got the info on nVidia's HSR.

now, about tile-based rendering: you know what we're talking about, we're talking about PowerVR's technology. how different is it from nVidia's HSR? please put some more specifics into it, so I can learn more and don't have to ask too many questions.. k?

sorry for insulting you, I had to be sure to get your attention..
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Soccerman-

Dave is speculating just as everyone else is, though he does know people; nVidia is extremely tight-lipped. PowerVR's design is a pure tiler, you can read about them in several different reviews/articles.

HSR, as of now, is up in the air (there are many different ways to do it, in theory at least). It is possible that nVidia is doing an early Z-buffer sort to avoid a larger amount of overdraw than is currently possible. Think of it this way (an oversimplification)-

Before you draw anything, you pull the vertex data into a Z-buffer check unit. Calculate primitives and eliminate any that fall at the same X/Y position as something else that lacks transparency, but at a deeper Z-depth. Then have a buffer between the Z-buffer check unit and the shading unit, keeping the shading unit busy at all times while overlapping Z data is eliminated before it is used.

That is one, oversimplified, possibility. I don't know how they are doing it exactly, except that they are not using a tiler (which sorts the scene data in a completely different way).
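
Roughly, in code, here's the kind of thing I mean (purely a hypothetical sketch - the one-coarse-cell-per-triangle setup, the names and the conservative test are my own simplifications, not anything nVidia has said):

    // Oversimplified model of an early Z-check unit sitting in front of the
    // shading unit. Everything here is an assumption for illustration only.
    #include <cfloat>
    #include <queue>
    #include <vector>

    struct Triangle {
        float minZ, maxZ;   // depth range across the primitive
        bool  opaque;       // transparent primitives never occlude anything
        bool  coversCell;   // does it fully cover its coarse screen cell?
        int   cellX, cellY; // the single coarse cell it falls in (oversimplified)
    };

    class EarlyZUnit {
    public:
        EarlyZUnit(int w, int h) : width(w), occluderZ(w * h, FLT_MAX) {}

        // Returns true if the triangle survives and is buffered for shading.
        bool submit(const Triangle& t) {
            float& occl = occluderZ[t.cellY * width + t.cellX];
            if (t.minZ > occl)                 // entirely behind an opaque cover: drop it
                return false;
            if (t.opaque && t.coversCell && t.maxZ < occl)
                occl = t.maxZ;                 // this triangle is the new nearest full cover
            shadeQueue.push(t);                // buffer keeps the shading unit busy
            return true;
        }

        std::queue<Triangle> shadeQueue;       // fed to the shading unit
    private:
        int width;
        std::vector<float> occluderZ;          // farthest depth of an opaque full cover, per cell
    };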

These numbers look a lot more like X-Box (NV25) specs, particularly considering that "final" numbers have been posted now. They may hit those with the NV20, though I doubt it. The ship date is definitely up in the air; rumors are circulating now that the part taped out a couple of days ago, which may mean that they could ship by the end of this year, though I doubt it will be for the consumer market.

If they do ship this year, it will more than likely be for X-Box dev kits, and the part won't be readily available at a consumer price point until the first few months of next year. On the process size, it does appear that the chip will be .15u; they have the technological capability, and with the transistor count numbers circulating around it seems they would be foolish not to use .15 when it is available, but again that is just trying to apply logic and isn't based on anything concrete.

Also, the multi-sampling issue: I would be interested to hear what Dave uses as a definition of "true" multi-sampling. SGI's old definition? That is what many of his past comments make it seem like.
 

_Silk_

Junior Member
Mar 2, 2000
22
0
0
By the way... the X-Box seems to be based off the NV20 and not the NV25 as originally thought.


An interview here with J Allard of Microsoft says:
GS: So what are some of the key technical challenges you see to hitting the specs you've released? It's been said that the graphics are based on the Nvidia NV25 chip that's two generations away. With the NV20 said to be slipping a bit, will it be hard to make your schedule?

JA: I can't really go into a ton of detail about the Nvidia relationship and our approach to the engineering of it. But I can say the NV20 is the core of the Xbox graphics. The guys who are working on the NV2A - we call it the NV2A inside of Xbox - are working in parallel with the NV20 folks, so they sort of marry back up after the NV20




The speculation is that the X-Box just has an NV20 with built-in Northbridge and memory controller features.

Ohh... And it is possible that the NV20 will actually be manufactured on a .13 micron process. nVidia uses TSMC for their manufacturing. TSMC reported earlier that they are ready for .13 micron, and said at least seven customers' products will be taped out this month. TSMC is expecting first silicon by the end of the year. This might just work out if they plan volume shipment in Jan with cards in Feb.
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
hehe, well I won't hold it against you. No need to do it though. If you ever have a question, I'm always willing to answer.

So how does deferred rendering work (as seen on PowerVR and Gigapixel)?

Well, like with any graphics card, the first step is your basic T&L. I won't go into that as I assume you know how that works. Now you do a coarse rasterization of the incoming triangles, sorting them into the appropriate bins, and updating the graphics state (texture state, etc.) in each bin. When the frame is over, you have a binned display list which has stored the scene from the previous frame. Now you write your triangles to the tile buffer, which is a small cache that holds a tile. You iterate across the bins, processing each one separately. From here, you proceed to do what is known as ray casting. This is a massively parallel design which determines what pixels are visible in the tile. After determining pixel visibility, you know what you see and so all other data is dropped.


Now while your data is still in the tile buffer, you can do any additional texture layers you need, as well as anti-aliasing. This reduces memory bandwidth usage.

To wrap things up, you work through each tile in turn, and once they're all done you have your final image, which is displayed.
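
If it helps, here's a rough sketch of that flow in code. The tile size, the flat per-triangle depth/colour and the bounding-box coverage test are stand-ins of mine so it stays short; a real deferred renderer is obviously far more involved.

    #include <algorithm>
    #include <cfloat>
    #include <cstdint>
    #include <vector>

    constexpr int TILE = 32;

    struct Tri {
        int minX, minY, maxX, maxY;     // screen-space bounding box
        float z;                        // flat depth (real hardware interpolates)
        uint32_t colour;                // stand-in for the full texture state
    };

    // Pass 1: coarse rasterization - sort each triangle into every tile it touches.
    std::vector<std::vector<Tri>> binTriangles(const std::vector<Tri>& scene,
                                               int tilesX, int tilesY)
    {
        std::vector<std::vector<Tri>> bins(tilesX * tilesY);
        for (const Tri& t : scene)
            for (int ty = std::max(0, t.minY / TILE); ty <= t.maxY / TILE && ty < tilesY; ++ty)
                for (int tx = std::max(0, t.minX / TILE); tx <= t.maxX / TILE && tx < tilesX; ++tx)
                    bins[ty * tilesX + tx].push_back(t);
        return bins;
    }

    // Pass 2: resolve one bin inside a small on-chip tile buffer. Visibility is
    // settled first, so only surviving pixels would ever fetch textures, and the
    // finished tile is written to the framebuffer exactly once.
    void resolveTile(const std::vector<Tri>& bin, int tileX, int tileY,
                     std::vector<uint32_t>& framebuffer, int fbWidth)
    {
        float depth[TILE][TILE];
        uint32_t colour[TILE][TILE] = {};
        for (auto& row : depth) for (float& d : row) d = FLT_MAX;

        for (const Tri& t : bin)                       // find the nearest hit per pixel
            for (int y = 0; y < TILE; ++y)
                for (int x = 0; x < TILE; ++x) {
                    int px = tileX * TILE + x, py = tileY * TILE + y;
                    bool inside = px >= t.minX && px <= t.maxX &&
                                  py >= t.minY && py <= t.maxY;   // crude coverage test
                    if (inside && t.z < depth[y][x]) {
                        depth[y][x] = t.z;
                        colour[y][x] = t.colour;       // "shade" only the visible surface
                    }
                }

        for (int y = 0; y < TILE; ++y)                 // single burst out to memory
            for (int x = 0; x < TILE; ++x)
                framebuffer[(tileY * TILE + y) * fbWidth + (tileX * TILE + x)] = colour[y][x];
    }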

As for NVIDIA, they work on a traditional architecture. Basically, they do their T&L. From there, they take their Z values and write them to a Z-buffer in the order they receive them. This determines visibility. From there, they texture and then output to the back-buffer. Basically, the same thing that everybody else does.

Now I'm not completely sure how occlusion detection works on NV20. With early Z-checks, if they do that, you basically do a Z check before you read a texture value. This saves you the time of doing texture reads that aren't needed. This isn't going to make a huge difference, but everything helps.
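
In other words, per fragment the ordering would look something like this (a toy model with made-up names and a stub texture, not NV20's actual pipeline):

    #include <cstdint>
    #include <vector>

    struct Fragment { int x, y; float z; float u, v; };

    struct Pipeline {
        int width, height;
        std::vector<float>    depth;      // Z-buffer, cleared to the far plane
        std::vector<uint32_t> colour;     // back buffer

        Pipeline(int w, int h) : width(w), height(h),
            depth(w * h, 1.0f), colour(w * h, 0) {}

        uint32_t sampleTexture(float u, float v) const {   // stand-in texture read
            return ((u < 0.5f) ^ (v < 0.5f)) ? 0xFFFFFFFF : 0xFF000000;
        }

        void shadeFragment(const Fragment& f) {
            float& z = depth[f.y * width + f.x];
            if (f.z >= z) return;                          // early Z: reject before texturing
            z = f.z;                                       // survivor: update depth,
            colour[f.y * width + f.x] = sampleTexture(f.u, f.v);   // then pay for the read
        }
    };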

Occlusion detection is a bit more difficult. There are a variety of ways you can do it, both in hardware and software.

You could do a low-resolution Z-buffer check. You have to use the CPU for T&L with this, because hardware T&L lacks the flexibility. You do your transformations in software and write your Z values to system memory. From there, you check the Z value and see if the triangle is occluded. If it is, you drop it. This works entirely on a per-triangle basis.
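
Something like this, say (a made-up sketch; the 64x64 resolution and the "farthest depth of full covers" rule are illustrative assumptions, not how any shipping part does it):

    #include <cfloat>
    #include <vector>

    // CPU-side low-resolution Z-buffer: transform in software, record conservative
    // occluder depths in system memory, and drop whole triangles that are provably hidden.
    class CoarseZBuffer {
    public:
        CoarseZBuffer(int w = 64, int h = 64) : w(w), h(h), z(w * h, FLT_MAX) {}

        // Returns false if every cell the triangle's bounding box touches is already
        // covered by nearer opaque geometry - the triangle can be dropped whole.
        bool mightBeVisible(int x0, int y0, int x1, int y1, float nearestZ) const {
            for (int y = y0; y <= y1; ++y)
                for (int x = x0; x <= x1; ++x)
                    if (nearestZ < z[y * w + x]) return true;
            return false;
        }

        // Record an opaque triangle known to fully cover these cells, using its
        // farthest depth so the visibility test above stays conservative.
        void addOccluder(int x0, int y0, int x1, int y1, float farthestZ) {
            for (int y = y0; y <= y1; ++y)
                for (int x = x0; x <= x1; ++x)
                    if (farthestZ < z[y * w + x]) z[y * w + x] = farthestZ;
        }

    private:
        int w, h;
        std::vector<float> z;
    };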

Another option requires direct hardware support. Here you break up your screen into bounding boxes. You'd likely do this with each primitive, as you receive them. From there you'd check for visibility and if you don't see anything, nothing within the box is rendered (so basically you drop all the data).


Ben,

Where do you get the idea that I'm speculating? If I don't say I'm speculating, I'm not speculating.

As for multi-sampling, it is the definition that most other engineers use these days, which involves the reuse of a single texel value for each sample.


 

nippyjun

Diamond Member
Oct 10, 1999
8,447
0
0
The NV20 won't be out until the spring. The specs you show are probably for the NV25 x-box product that will be available late NEXT year.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Dave-

"The chip is not done and 4800 mpps got pulled out of somebody's rectum. That is the truth guys. NV20 will be out within the first few months of the next year, as long as no problems come up (and everything is looking ok so far).


As I said, 4800 MPPS got pulled out of somebody's rectum. That is not the fill-rate. Even if it were, fill-rate numbers will mean nothing. A large factor is going to be pipeline efficiency with DX8."


I doubt it is the number either, but can you say with absolute certainty that it isn't? You seem a lot more sure in this thread than you have in the past-

"Well I wouldn't say that number is BS as it is a possible number, just they'll never reach it (like with the GTS). However, what is BS is that it is coming out this year. It ain't done. People just don't seem to be able to get that."

Link.

If the chip has just taped out, and that is a big if, how can anyone say for sure? I think the odds are very slim that they will hit that number, though they may in a heavily optimized benchmark using an "effective" rating.

"As for multi-sampling, it is the def. that most all other engineers use these days, which involves the reuse of a single texel value for each sample."

Do you consider that progress? In terms of performance it has its benefits, but the quality is extremely poor compared to what is now defined as SS in every implementation that I have ever seen. Is there anything you can say, that you are not under NDA on, about how they will implement this with acceptable quality? Blurring has been a serious issue with AA based on reusing the same texel data; is there anything you can say on how they are going to avoid it?
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Ben,

Yes, I'm sure about the spec not being 4800 MPPS. Even if you look at Abrash's article, you need something like 10x overdraw to get the 4x improvement they are talking about, if this were a real number... and I think that article is probably overly optimistic.

There shouldn't be any issue with blurring unless it is a poor implementation (or somebody screws with the mip-map bias). Multi-sampling does not have any inherent blur factors. So just by doing that, there shouldn't be an issue. Now there is the overall issue of quality. Super-sampling, on the whole, delivers better results, as multi-sampling doesn't help texture AA (so with multi-sampling your textures look identical to the way they do w/o AA). However, advanced filtering types (high-tap anisotropic) combined with multi-sampling will actually give improved results over super-sampling with bilinear filtering.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Dave,

How exactly is this type of multi-sampling supposed to work? I am curious; multi-sampling in the past has been used for creating a motion-blur type of effect in moving images (by reusing texel data) and did work on everything on the screen, just blurring it all (though very slightly compared to the type of blur that 3dfx uses in the T-Buffer).

What you are describing sounds like it would have no advantage over L&E AA that I can think of, so why waste the die space? If you are going to rely on high-tap (16? 32? more?) anisotropic filtering and mips to handle texture AA, why not utilize L&E AA instead of MS? I'm asking as I am curious how they are going to deal with the issues (and why I asked if you are NDA'ed on this topic).
 

Ferocious

Diamond Member
Feb 16, 2000
4,584
2
71
hehe, I'm still waiting for Nvidia to make a gaming card as good as a 3dfx product.

I think their next one might finally be it...since their developers have been using their cards as the standard for over a year now.

But I'm not holding my breath.
 

_Silk_

Junior Member
Mar 2, 2000
22
0
0
Actually, by looking at several rumors floating around from various sources, and then at an OFFICIAL X-Box spec and an interview with J Allard of Microsoft, I'd say the 4800Mpix/sec is the genuine spec for the NV20.

J Allard revealed in his interview that the X-Box is not based on the NV25 as originally expected but in fact the NV20. They call the chip for the X-Box the NV2A - I'm speculating that it's just an NV20 with an extra 10 million transistors for NorthBridge and memory controller features.

Then by looking at the X-Box OFFICIAL spec from Microsoft, you see that 4800Mpix/sec number again.

This is how it's obtained according to several rumor sources (take it for what you will):
300MHz chip with 4 pixel pipelines

Now that only gives you 1200Mpix/sec - but the chip has some sort of HSR (hidden surface removal). They assume an average overdraw factor of 4, thus arriving at the 4800Mpix/sec number.

Still no word on how many texture units per pipeline. Assuming they stick to two, we have a 2400Mtex/sec raw texel fill rate.
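
Just to lay the arithmetic out (the clock, pipe count, overdraw factor and texture-unit count here are the rumoured/assumed figures from above, not confirmed specs):

    #include <cstdio>

    int main() {
        const double clockMHz        = 300;   // rumoured core clock
        const int    pixelPipes      = 4;
        const int    texUnitsPerPipe = 2;     // assumed, as in the post
        const double overdrawFactor  = 4;     // average overdraw HSR is assumed to remove

        double rawPixelFill = clockMHz * pixelPipes;            // 1200 Mpix/s
        double effPixelFill = rawPixelFill * overdrawFactor;    // 4800 Mpix/s "effective"
        double rawTexelFill = rawPixelFill * texUnitsPerPipe;   // 2400 Mtex/s

        std::printf("raw fill: %.0f Mpix/s, effective: %.0f Mpix/s, texel: %.0f Mtex/s\n",
                    rawPixelFill, effPixelFill, rawTexelFill);
        return 0;
    }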

To me this all seems very plausible - I could be wrong, but I don't think so.

My main concern is memory bandwidth... I know in real-world apps HSR will dramatically reduce Z-buffer accesses, thus reducing bandwidth used. However, looking at 3dMark 2000 fill rate benchmarks, the GeForce 2 only gets 1200Mtex/sec (out of a maximum of 1600Mtex/sec) at 16bit color and considerably less at 32bit color. In this test HSR would not help at all, as the test consists of just one big poly filling the screen. It is a raw fill rate test - and the GeForce II did not have enough bandwidth to reach its potential (especially at 32bit color).

By using the fastest DDRAM available, they will probably get close to reaching the max fill rate at 16bit color (like the GeForce II), but 32bit will be a different story. That is... unless they have something else up their sleeve to help bandwidth.

Have to wait and see...

Ohh... As for the schedule, it looks like early next year. A couple of days ago TSMC (who manufacture for nVidia) announced their .13 micron process is ready and that seven customer products were taping out this month. They also said they expect first .13 micron silicon before the end of the year.

Going by the rumors that the NV20 just taped out, along with the TSMC announcement... if first silicon comes in Dec, then volume shipment by Jan and cards in Feb.

Later
 

NFS4

No Lifer
Oct 9, 1999
72,636
46
91


<< Hey there's nothing wrong with playing games. On the other hand shelling out that kind of money to play them seems outrageous, especially since it will only be cutting edge for a short time. >>


Might I ask, what kind of video card do YOU have? :Q
 

jpprod

Platinum Member
Nov 18, 1999
2,373
0
0
Dave, about multisampling vs. supersampling... You said that in multisampling only one texel is used per subsample. Do you mean texel by its original definition (one texture element) or its 3dfx definition (textured output pixel w/ bilinear)? Four texture fetches need to be done for each bilinear pixel; are these used with different weights in multisampling, or does it really create all subsamples using just one texture element?
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Ben,

What does L&E stand for? I'm about clueless on that one and I haven't been able to find anyone yet who knows.

I can see the confusion with the blurring issue. I think that what you're thinking of is more of a multi-sampling effect (A/T-buffer) type deal. Texel reuse would allow this. However, you can also do the same thing and get the jitter that 3dfx currently uses. So by doing a multi-sampled jitter you'll get your AA without the blurring. If you do a complete offset without any averaging, you'll get a motion-blur type effect.

jpprod,

I'm talking about texture elements. It uses a single texel value for your 4 samples (assuming you do 4x AA). So instead of having to do 4 texel lookups, you do a single one. This does nothing for texture aliasing though.


 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Dave-

Line and edge, though I suppose line support wouldn't be very useful in a gaming situation.

Yes, I am thinking of an A/T-buffer MS type effect. I don't think I'm grasping what type of implementation you are talking about, but I'll give it a shot. Are you talking about using MS to eliminate the need for multiple texture reads? If that is the case, and I'm not missing something, all you would be saving is bandwidth; though that is of course a major concern, the fill hit would still be the same, or am I missing something?

I have heard mention of some type of parallel data processing; is this the current assumption for the NV20/Rampage boards (I know you're NDA'd on the second one, but is that answerable)? If so, how many texturing units are we talking about? It would seem that would require eight or sixteen to be practical unless an enormous clock rate could be reached. Is this the reason for the rumors of 50 million transistors on the NV2X?

Edit- Thinking about it more, it seems that they would need to process all sub-samples at once to take advantage of the bandwidth savings, as long as I'm not missing something. If that is the case, what can we expect - four pixel pipes with three or four texturing units each?
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
ahh it appears I've provoked an intelligent conversation..

notice how there are two discussions going on? Red Dawn and others talking about the insane price, and BenSkywalker, DaveB3D, and jpprod talking about ethereal things such as multi-sampling etc..

now, ok, I sort of understand that PowerVR's (and also the former Gigapixel's) tile rendering is more than nVidia's HSR (hidden surface removal), because with tile-based rendering you are able to store each tile in cache, so while your processor is crunching away on the Z-values, you can also have it do other things to the data (at this point I'm kind of lost).

I already knew the basics of removing hidden surfaces (one of the major parts of tile rendering, I know; I'd never been told of the other parts till now): it basically frees up your rasterizer to do other stuff that you can actually see.

this also means (does it not?) that because it doesn't have to fill in the triangles where you can't see them, you free up bandwidth?

also, when you're talking occlusion detection (one of the techniques for not drawing polygons you don't see), there are methods to do this with the T&L unit itself (whether the CPU or a dedicated chip which supports it) or with a processor that takes that data and eliminates triangles that are blocked by others (right?).
 

jpprod

Platinum Member
Nov 18, 1999
2,373
0
0
I'm talking about texture elements. It uses a single texel value for your 4 samples (assuming you do 4x AA). So instead of having to do 4 texel lookups, you do a single one. This does nothing for texture aliasing though.

You do mean one texel fetch per texture present within the four subsamples?

Let's say the edge of a polygon goes through the output pixel (consisting of four subsamples) in a manner such that two subsamples hit the polygon and two hit whatever's visible behind it. If you are restricted to only one texel fetch per pixel, you can either fetch the texture element needed for the foreground polygon or for the background: the resulting output pixel color would be something not quite right, since there isn't enough information present to find the correct color to blend between the two polygons. So with just one texel (texture element) fetch per pixel, and assuming all surfaces are textured, MS does nothing for polygonal aliasing either.
 

MfA

Junior Member
Apr 26, 2000
3
0
0
With "standard" (SGI) multisampling the framebuffer is the same as if you were supersampling, only instead of truly supersampling, only the Z values are calculated for each subpixel. The color of the subpixels is all the same, with a mask indicating which subpixels are covered by the polygon.
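
In sketch form (the 4x sample count, the packed-colour layout and the names are mine, purely for illustration):

    #include <array>
    #include <cstdint>

    constexpr int SAMPLES = 4;

    struct PixelSamples {
        std::array<float, SAMPLES>    z;       // one depth per subsample
        std::array<uint32_t, SAMPLES> colour;  // one colour per subsample
    };

    // Write one polygon's contribution to a pixel: 'colour' is shared by every
    // covered subsample; only the per-subsample Z and the coverage mask differ.
    void multisampleWrite(PixelSamples& p, uint32_t colour,
                          const std::array<float, SAMPLES>& fragZ, unsigned coverage)
    {
        for (int s = 0; s < SAMPLES; ++s)
            if (((coverage >> s) & 1u) && fragZ[s] < p.z[s]) {
                p.z[s] = fragZ[s];
                p.colour[s] = colour;          // same texel value reused for each sample
            }
    }

    // Resolve: averaging the subsamples is what smooths the polygon edge, e.g. a
    // 2-of-4 covered edge pixel ends up as a 50/50 blend of the two surfaces.
    uint32_t resolve(const PixelSamples& p)
    {
        unsigned r = 0, g = 0, b = 0;
        for (uint32_t c : p.colour) {
            r += (c >> 16) & 0xFF; g += (c >> 8) & 0xFF; b += c & 0xFF;
        }
        return ((r / SAMPLES) << 16) | ((g / SAMPLES) << 8) | (b / SAMPLES);
    }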

But 3dfx/m$'s definition of multisampling is slightly different.
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Hey,

Sorry for taking so long to respond, but I'm out of town. I'll be real brief:

Yes, there are things you can do within the T&L engine to reduce overdraw. What is the best way of doing it, and how do you go about actually doing it in hardware? I really don't know. But I know there are concepts out there based around the idea. There may be more, and if anyone knows of any, please let me know as I'd like to read about it.

Multi-texturing's big thing is a bandwidth improvement, by reducing frame-buffer reads. You can also improve fill-rate if you have data coming in fast enough; if data isn't supplied fast enough, your pipeline will stall. So if you have pipes with 2 texturing units each, and one texel isn't supplied to your block, the whole pipe has to stall and wait for that texel. As long as this doesn't happen, you'll be more effective with fill-rate.
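
A toy way to picture the stall cost (the miss pattern and the penalty are made-up numbers, just to show the effect):

    #include <cstdio>
    #include <utility>
    #include <vector>

    // A pipe with two texture units advances one pixel per cycle only when both
    // texels arrive in time; a miss on either one stalls the whole pipe.
    int main() {
        const int missPenalty = 4;                       // assumed cycles lost per miss
        std::vector<std::pair<bool, bool>> pixels = {    // (texel 0 ready, texel 1 ready)
            {true, true}, {true, false}, {true, true}, {false, true}, {true, true}};

        int cycles = 0;
        for (auto [t0, t1] : pixels) {
            cycles += 1;                                 // the cycle that issues the pixel
            if (!t0 || !t1) cycles += missPenalty;       // one late texel stalls the pipe
        }
        std::printf("5 pixels took %d cycles (ideal: 5)\n", cycles);
        return 0;
    }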

Ok.. I think that is it. As I said, I'm rushing on this one. If I missed something, I'll catch it when I get back.

 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Missed something.

How does it help other aliasing artifacts? Good question. Well, you need to keep in mind that even though you are using a single texel lookup (so basically, one color value) for all four samples, you still have 4 separate Z compares at the sub-sample level. So you're comparing Z values, and you'll in turn get, say, a 50/50 blend based on those results, which removes edge aliasing artifacts. Now just pop in some 32-tap anisotropic filtering (the higher the better, of course) and you're good to go.
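
For example, the 50/50 case works out like this (made-up colour values, just the arithmetic):

    #include <cstdio>

    // An edge pixel where the foreground polygon's single texel colour wins 2 of
    // the 4 per-subsample Z compares and the background keeps the other 2.
    int main() {
        const float foreground = 1.0f;   // single texel value reused for its samples
        const float background = 0.0f;
        const int   covered = 2, samples = 4;

        float resolved = (covered * foreground + (samples - covered) * background) / samples;
        std::printf("resolved edge pixel = %.2f (a 50/50 blend)\n", resolved);
        return 0;
    }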

 