Ok, here's the problem.
I've got a cluster of DEC Alphas connected by 100baseT switched ethernet.
The network throughput is good between the cluster nodes (about 94 Mbit/s half-duplex, about 88-89 Mbit/s each way full-duplex).
But the throughput from the server to the nodes is terrible: about 70 Mbit/s half-duplex, and between 20 and 50 Mbit/s each way full-duplex.
The benchmark numbers were obtained with netperf, using the tcp_stream_script.
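For anyone who wants to repeat the measurement, here is a minimal sketch of the kind of test involved. It is not the actual tcp_stream_script, only an illustration under some assumptions: netserver is already running on the target, the helper names (netperf_cmd, parse_throughput, measure) are made up for this example, and with the banner suppressed the throughput in 10^6 bits/s is assumed to be the last field of the result line.

```python
import subprocess


def netperf_cmd(host, seconds=10):
    """Build a netperf TCP_STREAM command line for one target host."""
    # -H: remote host running netserver
    # -t: test name
    # -l: test length in seconds
    # -P 0: suppress the banner so only the result line is printed
    return ["netperf", "-H", host, "-t", "TCP_STREAM",
            "-l", str(seconds), "-P", "0"]


def parse_throughput(output):
    """Return the throughput figure from netperf output.

    Assumption: the value is the last whitespace-separated field
    of the last non-empty line.
    """
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return float(lines[-1].split()[-1])


def measure(host, seconds=10):
    """Run one TCP_STREAM test against host and return the throughput."""
    result = subprocess.run(netperf_cmd(host, seconds),
                            capture_output=True, text=True, check=True)
    return parse_throughput(result.stdout)
```

Calling measure() once from the server against each node, and once between two nodes, gives figures that can be compared like the ones quoted above. The hostnames passed in would be whatever the nodes are called locally.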
All of the systems are running SuSE 7.3 with a 2.4.17 kernel.
All of the systems are using a Tulip-based network card. There are two different Tulip cards used on the cluster, but the problem doesn't correlate with that.
The server has two of these NICs, one to talk to the cluster, and one for talking to the rest of the world.
At this point, we've tried rebuilding the kernel with the NIC drivers as modules instead of building them into the kernel. We've also tried loading the alternate tulip driver (DE45?), but it wouldn't load properly.
Next on the list of things to try:
Try another cable between the server and cluster switch.
Try another port on the switch.
Try another network card.
The network cables (all of them) may be too close to the power cables. We could rearrange them, but if that were the cause, why wouldn't it also interfere with the node-to-node connections?
Any other ideas?