Which is more advanced, GPUs or CPUs? (Game-wise)

mizzou

Diamond Member
Jan 2, 2008
9,734
54
91
Sort of a silly question, but why do I always get the feeling that neither is quite in tune with the other?

We are finally getting life-like images of people and of our environment. So what explains the lag in our ability to capture the behavior of that life-like environment? (Earth, wind, fire, water, tensile strength, weight, etc.)

I would consider most of those *unseen* processes, the kind that would typically be handled by the CPU rather than the GPU.


The GPU, however, is much younger than the CPU, so my assumption that the GPU is more advanced is probably laughable. But I can't help but feel it is true.

 

firewolfsm

Golden Member
Oct 16, 2005
1,848
29
91
Well, both are made on the same process technology (65-45 nm), so neither is really more advanced in that sense. GPUs are typically larger chips with more transistors and push more data (hence the fast, dedicated RAM). GPUs only work with highly parallelized tasks, and they are much better at them than CPUs. Still, it's not feasible to program everything to run that way.

I would say the GPU is more advanced in that it CAN do more than a CPU when given the right problems, much more. But it is also the bottleneck in most games, because what we've been asking for is higher resolutions and more lighting effects.
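A toy illustration of that last point (my own example, not from any game): the first loop has completely independent iterations and would map naturally onto thousands of GPU threads, while the second carries a dependency from one iteration to the next, so simply throwing more shader units at it doesn't help.

#include <cstdio>

int main() {
    const int n = 8;
    float a[n] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[n], c[n];

    // Embarrassingly parallel: each iteration is independent, so every
    // element could be handled by its own GPU thread.
    for (int i = 0; i < n; ++i)
        b[i] = a[i] * a[i];

    // Inherently serial: each iteration needs the previous result, so the
    // work cannot simply be split across thousands of threads.
    c[0] = a[0];
    for (int i = 1; i < n; ++i)
        c[i] = c[i - 1] + a[i];

    printf("b[7] = %g, c[7] = %g\n", b[7], c[7]);   // prints 64 and 36
    return 0;
}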
 

Lord Banshee

Golden Member
Sep 8, 2004
1,495
0
0
I would say a CPU is more advanced than a GPU, just because most of a GPU is the same components over and over (or so I think), whereas a CPU has many more different components to be designed and linked together.

But I would personally like to see this kind of technology and process:
Computer
- x86 CPU
- GPU Card
- FPGA Card
- Sound Card
- In/Out Chipset
- RAM
- Storage

The big one there is the FPGA card. What if a user could buy a reconfigurable card for the price of a GPU (in time, if the process gets more accepted and more companies design these units)? What would you do with such a card? My answer is ANYTHING.

Suppose there were a standard for this type of card and its OS drivers, so any software could see what kind you have, how much area, and what bandwidth. Companies that write software and games could then just have a switch in their code: if an FPGA is found, program a custom circuit for "Fire Physics", for example. Now when the game starts, you would have your x86 running the typical non-parallel code, the GPU taking care of all the graphics, and then this custom circuit, designed by the game company, doing whatever task they see fit that would be too slow to do on a CPU. It could be anything, really; just give the software companies the option. There are so many options here: you could even distribute the area so that multiple programs each get a section of the FPGA (new FPGAs can be reprogrammed dynamically). I know my wording is probably not the best, but I think you get the idea: instant application-specific speedup for any kind of code that shows parallelism.

Then I guess you would ask, who would write these custom hardware circuits? Well, if high-level synthesis isn't mature enough, then why not hire computer engineers and (digital) electrical engineers? Their pay grade is pretty much the same as a software engineer's, so it seems logical to me.
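A rough sketch of what that "switch in their code" could look like. To be clear, fpga_query, fpga_load_bitstream and fpga_run are made-up names standing in for the standardized driver/library described above (no such API exists), and the stub bodies just pretend no card is installed so the example compiles and falls back to the CPU path:

#include <cstdio>

// Hypothetical driver API -- these names are invented for illustration only.
struct FpgaInfo { int logic_cells; int bandwidth_mb_s; };
bool fpga_query(FpgaInfo*)              { return false; }  // stub: no card found
bool fpga_load_bitstream(const char*)   { return false; }  // stub
void fpga_run(const char*, float*, int) { }                // stub

// Plain software fallback for machines without the card.
void fire_physics_cpu(float* particles, int n) {
    for (int i = 0; i < n; ++i)
        particles[i] *= 0.99f;           // stand-in for a slow, serial update
}

void simulate_fire(float* particles, int n) {
    FpgaInfo info;
    if (fpga_query(&info) && info.logic_cells > 50000 &&
        fpga_load_bitstream("fire_physics.bit")) {
        // Custom circuit shipped by the game company does the update.
        fpga_run("fire_physics", particles, n);
    } else {
        // No card (or too small a part): fall back to the CPU path.
        fire_physics_cpu(particles, n);
    }
}

int main() {
    float particles[1024] = { 1.0f };
    simulate_fire(particles, 1024);
    printf("done, particles[0] = %f\n", particles[0]);
    return 0;
}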

I know there is research being done so that this kind of thing happens completely behind the scenes: you just have some software, and the custom hardware adapts to the normal binary code to create speedup where needed and updates the binary going to the CPU. Such devices are called "Warp Processors", http://www.cs.ucr.edu/~vahid/warp/. This research is great and would be damn cool one day, but I think there needs to be a slower approach to easing these types of devices, and the possible speedups, into the computer industry as a whole, not just the scientific and HPC crowd.

Just my thoughts, sorry if this is so off topic, but I think such an idea and device would help give users the power to run these kinds of loads instead of just waiting... Also, I am taking a Reconfigurable Computing class right now, so all these ideas for speeding up the typical things I do every day are in my head, and I had to ramble to someone about one of them, lol.
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
The CPU.
Mainly because it can do many different processes and isn't locked into any one field.
It's more general, and it takes a lot of work to make something that is good all around rather than good in just one area.

Programming FPGAs on the fly is already done with large applications, think 4,000+ CPUs.
There really is no reason, though, that it couldn't be done on the desktop now.
The hardware to do that is really cheap now.

You can program an FPGA in a PC now for as little as $50.00.
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
I would say the CPU (although it's silly to compare, really; they serve two totally different functions), just because they are usually designed at the transistor level, as opposed to a GPU, which is done using higher-level synthesis. Also, a GPU usually has tons of copies of the same thing, whereas on a CPU you have maybe 2 or 4 cores, not dozens of identical units.
 

Lord Banshee

Golden Member
Sep 8, 2004
1,495
0
0
Originally posted by: Modelworks
The CPU.
Mainly because it can do many different processes and isn't locked into any one field.
It's more general, and it takes a lot of work to make something that is good all around rather than good in just one area.

Programming FPGAs on the fly is already done with large applications, think 4,000+ CPUs.
There really is no reason, though, that it couldn't be done on the desktop now.
The hardware to do that is really cheap now.

You can program an FPGA in a PC now for as little as $50.00.

Not sure what you meant by this statement: "Programming FPGAs on the fly is already done with large applications, think 4,000+ CPUs." If you are saying that 4,000+ CPUs are being used to program an FPGA on the fly, then that kind of defeats the purpose, as it will not be anytime soon that desktop solutions have that kind of power. If you are saying that an FPGA has 4,000+ softcore CPUs programmed onto it, running 4,000+ different programs that change, that is not really programming the hardware in real time, since the hardware is still static; that is just reading and writing memory. So I'm not sure what you meant by that statement.

$50 is just for an FPGA board that is programmed via a cable from the computer, not something that is part of the PC, though. And if you consider middle/high-end FPGAs, they can cost hundreds of dollars just for the chip.

The boards we are using for our class are Virtex-4 PCIe boards with C libraries, and they are a couple thousand dollars, made by Nallatech. Also, making a bit file for this kind of architecture takes at least 20 minutes, so it isn't very useful if you plan on only running your application for 10 minutes.

But I can see that if they started with the low-end FPGAs first (Cyclone and Spartan), a PCIe board could cost around $200-$300 SRP. Not bad for something that can have endless applications. I think the hard part is getting an efficient driver/library system and getting the industry on board. There are many more issues, but I still don't see why there is not more talk about this kind of combination of technology in the desktop application space.

And that's not to mention that there are already HyperTransport-socket FPGAs, and soon there will be QuickPath versions.
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Both are equal in technological advancement. I tend to side with the CPU in design advancement just because (like many have said) it is able to do a lot of jobs fairly decently, compared to the GPU's ability to do a couple of jobs really well.
 

jonmullen

Platinum Member
Jun 17, 2002
2,517
0
0
I don't think there is really any question; as far as technical ability goes, I think GPUs take the cake, but you have to understand that this is entirely based on where the GPU designers are coming from as opposed to where the CPU designers are coming from. It is one thing to build a new custom computer on a board (aka a graphics card), with brand-new memory bus systems and new RAM technology, what seems like every 6 months. Not to mention that this PC you are building is only asked to perform one task, and your only compatibility concern falls on your driver and its ability to squeeze every last bit of performance out of the card given the current DirectX/OpenGL requirements. The hardware is certainly newer and, in that sense, more advanced. As for their shortcomings in functionality, that is the result of no one asking them to do anything but graphics, and if Nvidia CUDA is any indication, that is all changing. The raw computational power hidden in these high-end cards is largely unharnessed.

CPU manufacturers, for the most part, have been held back by longer development cycles and x86 compatibility. The Cell, however, is a really interesting beast and is probably an indication of where the industry is going: one powerful general-purpose processing unit with a backwards-compatible instruction set, backed by 8 specialized execution units much akin to the shader units in a GPU. The real question in my mind is, as the hardware moves toward not just dual or quad cores but massively parallel units much like GPUs, can all the barely competent programmers we have now really make effective use of that hardware with the current tools? Threading has clearly shown its limitations when it comes to parallel processing, or let me put it more bluntly: THREADING IS FOR THE ILLUSION OF MULTIPLE PROCESSORS, NOT THE PROGRAMMING OF THEM.
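To make the CUDA point concrete, here is a minimal, generic sketch (nothing from this thread, just the standard vector-add boilerplate): the "threads" in the launch below are lightweight hardware threads that describe how the data is split up, not OS threads being context-switched.

#include <cstdio>
#include <cuda_runtime.h>

// CPU version: one core walks the whole array serially.
void add_cpu(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

// GPU version: the same work is spread over thousands of lightweight
// hardware threads, one array element per thread.
__global__ void add_gpu(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                      // 1M elements (arbitrary size)
    const size_t bytes = n * sizeof(float);
    float *ha = new float[n], *hb = new float[n], *hc = new float[n];
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    add_cpu(ha, hb, hc, n);                     // serial reference result

    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // 256 threads per block, enough blocks to cover all n elements.
    add_gpu<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hc[0]);               // prints 3.000000
    cudaFree(da); cudaFree(db); cudaFree(dc);
    delete[] ha; delete[] hb; delete[] hc;
    return 0;
}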
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
jonmullen, I think you are vastly overestimating the GPU here; the reason they can be developed so quickly is exactly BECAUSE they are so simple. Graphics rendering is a very highly parallel task, so all you have to do is make a ton of simple, identical shader units and just copy and paste. Plus, it's not designed from the transistor up; they are designed at a higher level of abstraction, which makes it a lot simpler. There is a reason that a new CPU takes 5 years to develop, and that is BECAUSE they are made in such a complex way. A GPU has to be good at a tiny number of things; a CPU has to be pretty good at everything. Also, talking about GPUs like they are so powerful is silly. They have so many functional units because they already know exactly what they have to do with them. For a CPU, only about 5% is actual computation units; the rest is hardware trying to extract as much parallelism from the code as possible (a problem GPUs don't have). As for your comments on threading, you do realise that GPUs use hundreds of threads and yet you mock a CPU for trying to use 4?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
This is something of a pie-in-the-sky debate, because both are pretty darn advanced at this point. The rub lies in the fact that GPUs scale a lot better than CPUs. GPUs do one thing -- numeric computation -- over and over and over and over lots of data. LOTS of data. That means it's straightforward to marshal data into nice, parallel chunks and do lots of calculation in parallel. It may seem like computers are all numbers (which is true), but in general-purpose computing (i.e. every part of a game that isn't graphics), that kind of numeric parallelism is rare.

CPUs are basically general-purpose tools, like Swiss Army knives, for any kind of computation. Yes, you can do graphics on a CPU -- but it'll be slow. It would be like shredding garlic with a Swiss Army knife. Wouldn't you rather use a garlic press (GPU)? Of course. Now, can you use the garlic press to peel carrots? No, not with any real success.

So getting back to the original question, I'd say that CPUs are more advanced, since GPUs are essentially special-purpose vector processors, which have been around for decades.
 

jonmullen

Platinum Member
Jun 17, 2002
2,517
0
0
Originally posted by: BrownTown
jonmullen, I think you are vastly overestimating the GPU here; the reason they can be developed so quickly is exactly BECAUSE they are so simple. Graphics rendering is a very highly parallel task, so all you have to do is make a ton of simple, identical shader units and just copy and paste. Plus, it's not designed from the transistor up; they are designed at a higher level of abstraction, which makes it a lot simpler. There is a reason that a new CPU takes 5 years to develop, and that is BECAUSE they are made in such a complex way. A GPU has to be good at a tiny number of things; a CPU has to be pretty good at everything. Also, talking about GPUs like they are so powerful is silly. They have so many functional units because they already know exactly what they have to do with them. For a CPU, only about 5% is actual computation units; the rest is hardware trying to extract as much parallelism from the code as possible (a problem GPUs don't have). As for your comments on threading, you do realise that GPUs use hundreds of threads and yet you mock a CPU for trying to use 4?

OK, I will start with your threading comment, since this is one of my biggest pet peeves. You do realize that threads have nothing to do with GPU or CPU hardware. Threads are an operating system concept used to implement a multi-programming environment. Notice I said multi-programming, as in appearing to let you run multiple programs, not multi-processing, which allows you to actually execute multiple programs at any given time. Threads have been hijacked from the realm for which they were designed and wedged in as a mechanism to feed multiple processors.

Second, I have no idea where you pulled the 5% number from. The main difference between GPUs and CPUs is not that CPUs are very complex beasts while GPUs are just the result of multiple copy and pastes. That is like me saying all integrated circuits are just a lot of copy-and-pasted NAND gates, which they are, but it is still a pretty pointless comment. The main difference, as I see it, is the instruction philosophy: CPUs revolve around Multiple Instruction, Multiple Data (MIMD), whereas GPUs have moved toward a Single Instruction, Multiple Data (SIMD) instruction set. Yes, CPUs handle a wider variety of tasks with a lot of context switching very well, but that is a design decision, not the result of CPUs being so much more advanced than GPUs; rather, it is a decision based upon the desired result of a GPU.

Lastly, since you say CPUs are designed from the transistor up, why is the Core based upon the P3 architecture? So who exactly here is doing all this copying and pasting of processor components you are talking about?
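To illustrate the SIMD vs. MIMD point above in code (a toy example of my own, not anything from the posts): the scalar loop below issues one add per element, while the SSE version issues one ADDPS instruction that adds four packed floats at once; a GPU just takes that same idea much, much wider.

#include <cstdio>
#include <xmmintrin.h>   // SSE intrinsics

int main() {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    float c[8];

    // Scalar: one add instruction per element.
    for (int i = 0; i < 8; ++i)
        c[i] = a[i] + b[i];

    // SIMD (SSE): one ADDPS instruction adds four packed floats at once.
    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb));
    }

    for (int i = 0; i < 8; ++i)
        printf("%g ", c[i]);   // prints 11 22 33 44 55 66 77 88
    printf("\n");
    return 0;
}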
 

Lord Banshee

Golden Member
Sep 8, 2004
1,495
0
0
Originally posted by: jonmullen
Originally posted by: BrownTown
jonmullen, I think you are vastly overestimating the GPU here; the reason they can be developed so quickly is exactly BECAUSE they are so simple. Graphics rendering is a very highly parallel task, so all you have to do is make a ton of simple, identical shader units and just copy and paste. Plus, it's not designed from the transistor up; they are designed at a higher level of abstraction, which makes it a lot simpler. There is a reason that a new CPU takes 5 years to develop, and that is BECAUSE they are made in such a complex way. A GPU has to be good at a tiny number of things; a CPU has to be pretty good at everything. Also, talking about GPUs like they are so powerful is silly. They have so many functional units because they already know exactly what they have to do with them. For a CPU, only about 5% is actual computation units; the rest is hardware trying to extract as much parallelism from the code as possible (a problem GPUs don't have). As for your comments on threading, you do realise that GPUs use hundreds of threads and yet you mock a CPU for trying to use 4?

OK, I will start with your threading comment, since this is one of my biggest pet peeves. You do realize that threads have nothing to do with GPU or CPU hardware. Threads are an operating system concept used to implement a multi-programming environment. Notice I said multi-programming, as in appearing to let you run multiple programs, not multi-processing, which allows you to actually execute multiple programs at any given time. Threads have been hijacked from the realm for which they were designed and wedged in as a mechanism to feed multiple processors.

Second, I have no idea where you pulled the 5% number from. The main difference between GPUs and CPUs is not that CPUs are very complex beasts while GPUs are just the result of multiple copy and pastes. That is like me saying all integrated circuits are just a lot of copy-and-pasted NAND gates, which they are, but it is still a pretty pointless comment. The main difference, as I see it, is the instruction philosophy: CPUs revolve around Multiple Instruction, Multiple Data (MIMD), whereas GPUs have moved toward a Single Instruction, Multiple Data (SIMD) instruction set. Yes, CPUs handle a wider variety of tasks with a lot of context switching very well, but that is a design decision, not the result of CPUs being so much more advanced than GPUs; rather, it is a decision based upon the desired result of a GPU.

Lastly, since you say CPUs are designed from the transistor up, why is the Core based upon the P3 architecture? So who exactly here is doing all this copying and pasting of processor components you are talking about?

I could be mistaken, but when BrownTown was talking about CPUs being designed at the transistor level, I think he meant this:

When Intel, for example, makes a new processor, they start the layout from the transistor layer, i.e. they have engineers working on the layout one transistor at a time, source, drain, contacts, all that good stuff, for the mask. Whereas Nvidia, for example, might do some of that, but most of it is done using automated tools from designs at a higher RTL level. Graphics cards could never come out every 6 months if they worked from the Si up like many CPU designs do. I believe that is all he was saying, not the copy-and-paste analogy you used.

I remember even reading an article from either Nvidia or ATI about their process, and that's pretty much how they described it; I think it was something like 50% automated, 50% hand-engineered.
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Originally posted by: Lord Banshee
I could be mistaken, but when BrownTown was talking about CPUs being designed at the transistor level, I think he meant this:

When Intel, for example, makes a new processor, they start the layout from the transistor layer, i.e. they have engineers working on the layout one transistor at a time, source, drain, contacts, all that good stuff, for the mask. Whereas Nvidia, for example, might do some of that, but most of it is done using automated tools from designs at a higher RTL level. Graphics cards could never come out every 6 months if they worked from the Si up like many CPU designs do. I believe that is all he was saying, not the copy-and-paste analogy you used.

I remember even reading an article from either Nvidia or ATI about their process, and that's pretty much how they described it; I think it was something like 50% automated, 50% hand-engineered.

Yes, that's what I meant. "Copy and paste" means that in a GPU with, say, 16 pixel shaders, all 16 are exactly the same, just like in a dual core the two cores are exactly the same.

As for jonmullen's comments on threading, I would note that while you may want to think of threads in a software context, they very much influence the design of the hardware. As mentioned concerning the GPU, if you only had one thread running, then why have 32 shader units? It is exactly because they can run huge numbers of threads that they can utilize all these units. The parallelism of the task is what allows a GPU to do so many calculations. I mean, it's not just that a GPU is a SIMD unit; it has dozens (now hundreds) of SIMD units running hundreds of threads all at the same time.
 

jonmullen

Platinum Member
Jun 17, 2002
2,517
0
0
Originally posted by: Lord Banshee
I could be mistaken, but when BrownTown was talking about CPUs being designed at the transistor level, I think he meant this:

When Intel, for example, makes a new processor, they start the layout from the transistor layer, i.e. they have engineers working on the layout one transistor at a time, source, drain, contacts, all that good stuff, for the mask. Whereas Nvidia, for example, might do some of that, but most of it is done using automated tools from designs at a higher RTL level. Graphics cards could never come out every 6 months if they worked from the Si up like many CPU designs do. I believe that is all he was saying, not the copy-and-paste analogy you used.

I remember even reading an article from either Nvidia or ATI about their process, and that's pretty much how they described it; I think it was something like 50% automated, 50% hand-engineered.

I understand that the design process between the two is different in some regards, and you're absolutely right that they could not keep up their release schedule without automated design tools, but I also think it is worthwhile to note that the component that usually gets the upgrade, or at least the more frequent upgrade, on a GPU is the memory subsystem. The GPU release cycle seems to follow a template kind of like this:

(New core) --> (Bump memory and core speed due to refined manufacturing) --> (Add new memory system, think DDR27 coming to a GPU near you) --> (repeat)

Granted, this is overly simplified, but you get the idea. I think it would be a fallacy to say that unified shader units don't undergo a design process similar to a CPU's. The automation aspect comes into much clearer perspective when we remember that GPUs are fast becoming (or already are) independent computers on a PCI/AGP card. For the most part, GPU release cycles read like an AnandTech guide to building a high-end PC, with the latest in memory, chipsets, optical drives, hard drives and processors. When you have that many different elements, it makes sense to use some automation tool to handle their interconnections, but at some point someone designed the memory controller and the shader units of a GPU with the same thought and process that goes into CPU design.

But to try and tie this whole tangent back to the OP's question: my argument is that GPUs are more advanced in that they not only always have the latest in memory subsystems, but are also leading the path of parallelism, which is our only salvation for more speed when that dreaded day of Moore's Law's demise finally comes.
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Originally posted by: Lord Banshee


Not sure what you meant by this statement: "Programming FPGAs on the fly is already done with large applications, think 4,000+ CPUs."

What I meant was that it is done on the fly to change how large supercomputers interact with each node. At Cray we used several FPGAs in between each set of nodes to change the way the data flowed, sending data to specific nodes based on criteria handled by the FPGA.

$50 is just for an FPGA board that is programmed via a cable from the computer, not something that is part of the PC, though. And if you consider middle/high-end FPGAs, they can cost hundreds of dollars just for the chip.

Actually it's for an FPGA fully in place on a PCI card.
Cost drops dramatically when you do large scale, 100K+ units.
Granted, it's not going to have a top-of-the-line FPGA at that price.

The boards we are using for our class are Virtex-4 PCIe boards with C libraries, and they are a couple thousand dollars, made by Nallatech. Also, making a bit file for this kind of architecture takes at least 20 minutes, so it isn't very useful if you plan on only running your application for 10 minutes.

The reason for the high price of the boards you use isn't that the hardware is that expensive. It's because of the sector it's targeting. Development kits are always more expensive than the actual components would be.
Xilinx is a great company, but the prices they charge for kits have scared off quite a few developers.



 

jonmullen

Platinum Member
Jun 17, 2002
2,517
0
0
Originally posted by: BrownTown

As for jonmullen's comments on threading, I would note that while you may want to think of threads in a software context, they very much influence the design of the hardware. As mentioned concerning the GPU, if you only had one thread running, then why have 32 shader units? It is exactly because they can run huge numbers of threads that they can utilize all these units. The parallelism of the task is what allows a GPU to do so many calculations. I mean, it's not just that a GPU is a SIMD unit; it has dozens (now hundreds) of SIMD units running hundreds of threads all at the same time.

I think you have some confusion as to what threads are (Wikipedia). Processors execute instructions; threads are a design concept that is, for right or wrong, ever increasingly being used to manage and feed multiple processors instructions. You are exactly right that when a SIMD instruction is executed on a GPU it can kick off multiple execution units, which can easily be conceived of through the thread analogy, but to call both things a thread would be like me calling this tower next to my monitor a CPU.
 

Lord Banshee

Golden Member
Sep 8, 2004
1,495
0
0
Originally posted by: Modelworks

What I meant was that it is done on the fly to change how large supercomputers interact with each node. At Cray we used several FPGAs in between each set of nodes to change the way the data flowed, sending data to specific nodes based on criteria handled by the FPGA.

That still does not sound like the FPGA is being programmed on the fly for a specific task. That example sounds like an FPGA that is "already" programmed with logic that sees inputs and affects the outputs of switches for the nodes. Sounds like a typical circuit? What would be "on the fly" is if one of the CPUs sent a command to the FPGA to use completely different criteria than before and wrote a whole new hardware circuit to the FPGA in the middle of its operation. That is the idea I am talking about: every new application that loads reprograms the FPGA, or a portion of it, for its own purpose.

Actually it's for an FPGA fully in place on a PCI card.
Cost drops dramatically when you do large scale, 100K+ units.
Granted, it's not going to have a top-of-the-line FPGA at that price.

Are these custom boards? If not, do you have a link to buy one? And I assume there aren't huge datasets traveling to these cards; the latency of PCI would be kind of a showstopper unless there are some very good drivers and hardware to take care of it. The boards we use that are PCIe are pretty damn slow at transferring data from computer to card; you would only get a speedup on large datasets.

The reason for the high price of the boards you use isn't that the hardware is that expensive. It's because of the sector it's targeting. Development kits are always more expensive than the actual components would be.
Xilinx is a great company, but the prices they charge for kits have scared off quite a few developers.

True, true, but I am not sure what other hardware is on this board; it has some very specific stuff on it that isn't cheap.

 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Originally posted by: jonmullen
I think you have some confusion as to what threads are (Wikipedia). Processors execute instructions; threads are a design concept that is, for right or wrong, ever increasingly being used to manage and feed multiple processors instructions. You are exactly right that when a SIMD instruction is executed on a GPU it can kick off multiple execution units, which can easily be conceived of through the thread analogy, but to call both things a thread would be like me calling this tower next to my monitor a CPU.

No, unfortunately it is you who are confused. Maybe you should try reading a review of a modern GPU. The one in question handles THOUSANDS of threads with 320 stream processors, each of which is 5-way superscalar. This isn't a case of context switching or software; this is all handled in hardware and all done in parallel. The GPU issues hundreds of SIMD instructions per cycle from hundreds of different threads. This has nothing to do with the OS; it's all done in hardware. It's like 320 tiny little CPUs, not just a single vector CPU like you appear to think it is.
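For what it's worth, you can query roughly those numbers straight from the CUDA runtime; a small sketch using the cudaGetDeviceProperties() call (the figures printed obviously depend on the card, and this is just an illustration, not anything from the review above):

#include <cstdio>
#include <cuda_runtime.h>

// Ask the driver how many multiprocessors the card has and how many threads
// it keeps resident at once -- that ratio is what hides memory latency, and
// none of it involves the OS scheduler.
int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no CUDA device found\n");
        return 1;
    }
    int resident = prop.multiProcessorCount * prop.maxThreadsPerMultiProcessor;
    printf("%s: %d multiprocessors, up to %d threads resident in hardware\n",
           prop.name, prop.multiProcessorCount, resident);
    return 0;
}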
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Originally posted by: Lord Banshee
Originally posted by: Modelworks

What I meant was that it is done on the fly to change how large supercomputers interact with each node. At Cray we used several FPGAs in between each set of nodes to change the way the data flowed, sending data to specific nodes based on criteria handled by the FPGA.

That still does not sound like the FPGA is being programmed on the fly for a specific task. That example sounds like an FPGA that is "already" programmed with logic that sees inputs and affects the outputs of switches for the nodes. Sounds like a typical circuit? What would be "on the fly" is if one of the CPUs sent a command to the FPGA to use completely different criteria than before and wrote a whole new hardware circuit to the FPGA in the middle of its operation. That is the idea I am talking about: every new application that loads reprograms the FPGA, or a portion of it, for its own purpose.

It's not done in the middle of an operation but during switching from one task to another.
The way it worked is that some nodes were designed for specialized processing of things like sound data. The FPGA decided which node was best for the data present. If the data changed, then the FPGA would be reprogrammed to fit the new data so that the nodes would be used most efficiently. The FPGA could be reprogrammed several times during a session. It actually started to be a problem; we were exceeding the number of times the FPGA could be reprogrammed.

Actually it's for an FPGA fully in place on a PCI card.
Cost drops dramatically when you do large scale, 100K+ units.
Granted, it's not going to have a top-of-the-line FPGA at that price.

Are these custom boards? If not, do you have a link to buy one? And I assume there aren't huge datasets traveling to these cards; the latency of PCI would be kind of a showstopper unless there are some very good drivers and hardware to take care of it. The boards we use that are PCIe are pretty damn slow at transferring data from computer to card; you would only get a speedup on large datasets.

Yeah, those are custom boards.
It's not for something that would handle huge amounts of data, more for what a hobbyist would be interested in.
 

Lord Banshee

Golden Member
Sep 8, 2004
1,495
0
0
Originally posted by: Modelworks
Originally posted by: Lord Banshee
Originally posted by: Modelworks

What I meant was that it is done on the fly to change how large supercomputers interact with each node. At Cray we used several FPGAs in between each set of nodes to change the way the data flowed, sending data to specific nodes based on criteria handled by the FPGA.

That still does not sound like the FPGA is being programmed on the fly for a specific task. That example sounds like an FPGA that is "already" programmed with logic that sees inputs and affects the outputs of switches for the nodes. Sounds like a typical circuit? What would be "on the fly" is if one of the CPUs sent a command to the FPGA to use completely different criteria than before and wrote a whole new hardware circuit to the FPGA in the middle of its operation. That is the idea I am talking about: every new application that loads reprograms the FPGA, or a portion of it, for its own purpose.

It's not done in the middle of an operation but during switching from one task to another.
The way it worked is that some nodes were designed for specialized processing of things like sound data. The FPGA decided which node was best for the data present. If the data changed, then the FPGA would be reprogrammed to fit the new data so that the nodes would be used most efficiently. The FPGA could be reprogrammed several times during a session. It actually started to be a problem; we were exceeding the number of times the FPGA could be reprogrammed.

Actually it's for an FPGA fully in place on a PCI card.
Cost drops dramatically when you do large scale, 100K+ units.
Granted, it's not going to have a top-of-the-line FPGA at that price.

Are these custom boards? If not, do you have a link to buy one? And I assume there aren't huge datasets traveling to these cards; the latency of PCI would be kind of a showstopper unless there are some very good drivers and hardware to take care of it. The boards we use that are PCIe are pretty damn slow at transferring data from computer to card; you would only get a speedup on large datasets.

Yeah, those are custom boards.
It's not for something that would handle huge amounts of data, more for what a hobbyist would be interested in.

I see.

About the number of writes an FPGA can handle: I assume you are talking about some sort of EEPROM, not the FPGA itself? I'm not aware of anything in the FPGA that has a limit on the number of writes. From my understanding, all the control signals in the FPGA are just registers (of some sort) that get programmed through a serial stream of data, and a register is just a bunch of transistors. Are your FPGAs built differently, or am I forgetting something?
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Originally posted by: Lord Banshee

About the number of writes an FPGA can handle: I assume you are talking about some sort of EEPROM, not the FPGA itself? I'm not aware of anything in the FPGA that has a limit on the number of writes. From my understanding, all the control signals in the FPGA are just registers (of some sort) that get programmed through a serial stream of data, and a register is just a bunch of transistors. Are your FPGAs built differently, or am I forgetting something?

The FPGAs we used at the time had an early version of flash storage on the chip.
After enough writes it would no longer maintain the configuration data and would have to be replaced.
 

CanOWorms

Lifer
Jul 3, 2001
12,404
2
0
Originally posted by: Lord Banshee

I see.

About the number of writes an FPGA can handle: I assume you are talking about some sort of EEPROM, not the FPGA itself? I'm not aware of anything in the FPGA that has a limit on the number of writes. From my understanding, all the control signals in the FPGA are just registers (of some sort) that get programmed through a serial stream of data, and a register is just a bunch of transistors. Are your FPGAs built differently, or am I forgetting something?

Some FPGAs are one-time programmable (antifuses) and some are even flash-based instead of SRAM.
 

jonmullen

Platinum Member
Jun 17, 2002
2,517
0
0
Originally posted by: BrownTown
Originally posted by: jonmullen
I think you have some confusion as to what threads are (Wikipedia). Processors execute instructions; threads are a design concept that is, for right or wrong, ever increasingly being used to manage and feed multiple processors instructions. You are exactly right that when a SIMD instruction is executed on a GPU it can kick off multiple execution units, which can easily be conceived of through the thread analogy, but to call both things a thread would be like me calling this tower next to my monitor a CPU.

No, unfortunately it is you who are confused. Maybe you should try reading a review of a modern GPU. The one in question handles THOUSANDS of threads with 320 stream processors, each of which is 5-way superscalar. This isn't a case of context switching or software; this is all handled in hardware and all done in parallel. The GPU issues hundreds of SIMD instructions per cycle from hundreds of different threads. This has nothing to do with the OS; it's all done in hardware. It's like 320 tiny little CPUs, not just a single vector CPU like you appear to think it is.

Since you obviously don't bother to read anything I post, I don't know why I continue to bother with this, but in the link you posted I assume you are referring to the "Ultra-Threaded Dispatch Processor", as this is the only thing that seems relevant in the article. Not to mention the fact that you are citing a hardware review site, which is effectively posting a slightly technical MARKETING paper on new features. You will also see that the article says:

Ultra-Threaded Dispatch Processor controls execution of threads by execution units. It decides which task will be executed by this or that unit depending on requirements and priorities.

Hmm, "depending on requirements and priorities." Does that sound like marketing speak for a scheduler? And if you look at where the "Ultra-Threaded Dispatch Processor" is located in the block layout of the processor as a whole, its job is to queue up commands, be those vertex, geometry or pixel shader commands, and feed them to the actual processing units. Here we are once again victims of marketing speak: your precious "Ultra-Threaded Dispatch Processor" is nothing more than an instruction scheduler whose job is to make sure there are as few bubbles as possible in the "Stream Processing Units'" instruction pipeline. Once again, the ancient computer science term and analogy of the thread is being used to visualize a process unrelated to the technical meaning of the term.

And since you have such a firm grasp of hardware, you should know that operating systems are not just Windows/Linux/BSD running on top of your entire computer's hardware. As anyone with experience in the embedded market will tell you, all kinds of operating systems run on all sorts of hardware, especially hardware as advanced as GPUs. On that note, a big part of the problem you see with drivers in Linux, especially wireless card drivers, is that in order to lower the cost of the actual hardware, many of the functions that would coordinate the operation of the card are being moved off the card and into the driver. With the increased burden of work placed on the driver, it is no longer expected to just shuttle information but to process it as well, processing which was historically done in hardware. Thus driver complexity grows, and with that so do IP concerns, and thus the lack of open-source driver releases.
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Whatever, jonmullen. Until you can show some proof that you have an advanced degree on this topic, or a job in processor architecture, I am forced to assume you simply don't know what you are talking about.
 