GPU server must be down?

Markfw · Oct 27, 2008

And it's not my fault! What is with Stanford? If it isn't EUE units, then it's their server down, not mine! You can't win for losing.

arg.....

jonesthewine · Oct 28, 2008

Yep, noticed the same thing since yesterday afternoon; lots of down time for the GPU client. Manually ending the process and restarting the app will usually result in a new WU download, for me anyway.

Insidious · Oct 28, 2008

Same at my place.

How am I supposed to keep this house warm with no crunching?

:sun:

Markfw · Oct 28, 2008

Originally posted by: jonesthewine
Yep, noticed the same thing since yesterday afternoon; lots of down time for the GPU client. Manually ending the process and restarting the app will usually result in a new WU download, for me anyway.

I tried that many times, to no avail....

Gravity · Oct 28, 2008

Somehow I knew that when my GPUs weren't phoning home, this thread would be written.

Keep folding,

Gravity

Insidious · Oct 28, 2008

Originally posted by: jonesthewine

Manually ending the process and restarting the app will usually result in a new WU download, for me anyway.

Pure coincidence. When the client doesn't make a successful connection, it tries again after a set time delay. With more failed attempts, the delay lengthens. All restarting the client does is make it revert to the more frequent initial checks. If the servers are back, then it connects; if they aren't, it doesn't, and the interval lengthens again.

-Sid
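For anyone who wants to see the retry behaviour described above in numbers, here is a rough simulation. It is only a sketch: the 60-second starting delay, the doubling factor, and the one-hour cap are made-up values for illustration, not the client's real settings, and `first_success` is a hypothetical helper name.

```python
def first_success(server_back_at, restart_at=None, initial=60, factor=2, cap=3600):
    """Return the simulated time (seconds) of the first successful connection.

    All timing values are illustrative assumptions, not the real client's.
    The client tries at time 0; each failure makes it wait `delay` seconds,
    and the delay grows by `factor` (up to `cap`) after every failed attempt.
    A manual restart at `restart_at` triggers an immediate attempt and
    resets the delay to `initial`.
    """
    now, delay = 0, initial
    while now < server_back_at:  # the attempt made at `now` fails
        next_try = now + delay
        if restart_at is not None and now < restart_at <= next_try:
            # User restarts the client before the next scheduled attempt.
            now, delay = restart_at, initial
            restart_at = None
        else:
            now = next_try
            delay = min(delay * factor, cap)
    return now


if __name__ == "__main__":
    print(first_success(1000))                   # no restart
    print(first_success(1000, restart_at=1000))  # restart once servers are back
    print(first_success(1000, restart_at=500))   # restart while servers are down
```

A restart only connects right away if the servers happen to be back at that moment; otherwise the client just goes back to short waits that grow again.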

Foxery · Oct 28, 2008

Nearly half of Stanford's servers are dead or overloaded today:
Server Status page
Including the main assignment server for classic CPU clients! It's a mess.

I'm normally very patient with them, but I'm losing it. I'm trying to contribute to a nonprofit org, and their office is falling apart around them.

Foxery · Oct 28, 2008

Oh, bleh. It turns out this was a planned outage, but the only notice they gave was on the news blog - and only yesterday. They've otherwise been ignoring the million reports and cries for help at the forum until now. /sigh

geokilla · Oct 28, 2008

If you restart your clients, it'll help you improve your chances of getting a GPU WU.

Gravity · Oct 28, 2008

I'm losing interest in keeping up with the gpu.

Denithor · Oct 28, 2008

Originally posted by: geokilla
If you restart your clients, it'll help you improve your chances of getting a GPU WU.

Kinda hard to do from work...

Originally posted by: Gravity
I'm losing interest in keeping up with the gpu.

:Q

Egad, I'm not...I just put together a 4xGPU box just to crank out more ppd...

Insidious · Oct 28, 2008

It gets even BETTER :roll:

They released a new project today..... 5801

It's crashing every installation except for a very few cards. The thing gives out so many EUEs in a short time that the client pauses itself for 24 hours. (4 of my 5 were paused when I came home today)

If you want, you can stop the client, empty the work directory (work folder, queue.dat, unitinfo.txt) and then restart it, but it will most likely just happen again in a very short time.
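The cleanup above could be scripted roughly like this. It is a sketch only: it assumes the client has already been stopped, that the `work` folder, `queue.dat`, and `unitinfo.txt` sit directly in the client's install directory, and `reset_work_state` is a hypothetical helper name. Any work in progress is thrown away.

```python
import shutil
from pathlib import Path


def reset_work_state(client_dir):
    """Delete the work folder, queue.dat and unitinfo.txt under client_dir.

    Assumes the client is NOT running. Returns the names that were removed.
    Everything else in client_dir is left untouched.
    """
    client_dir = Path(client_dir)
    removed = []

    work = client_dir / "work"
    if work.is_dir():
        shutil.rmtree(work)  # drops all downloaded and partial work units
        removed.append(work.name)

    for name in ("queue.dat", "unitinfo.txt"):
        target = client_dir / name
        if target.is_file():
            target.unlink()
            removed.append(name)

    return removed
```

Call it with the path to the client folder, then start the client again so it fetches a fresh WU.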

I cleaned it out an hour ago and already have one client on a 24 hour pause. I'm just going to let it stay paused and check on them tomorrow.

THIS SUCKS! :|

Insidious · Oct 28, 2008

They are claiming to have taken the 5801 WUs out of the servers now, so after the last strays get purged, we might be back to only waiting on the servers. :roll:

Cutthroat · Oct 28, 2008

Originally posted by: Insidious
It gets even BETTER :roll:

They released a new project today..... 5801

It's crashing every installation except for a very few cards. The thing gives out so many EUEs in a short time that the client pauses itself for 24 hours. (4 of my 5 were paused when I came home today)

If you want, you can stop the client, empty the work directory (work folder, queue.dat, unitinfo.txt) and then restart it, but it will most likely just happen again in a very short time.

I cleaned it out an hour ago and already have one client on a 24 hour pause. I'm just going to let it stay paused and check on them tomorrow.

THIS SUCKS! :|

Damn, I just came to post about this, it has happened to me. I've been getting 5801, and it can't even read the WU. It redownloaded it a bunch of times and then started saying my machine was unstable and pausing for 24 hours.

Here's some of the log in case you're interested.

[21:27:52] Folding@Home GPU Core - Beta
[21:27:52] Version 1.15 (Mon Oct 13 11:11:30 PDT 2008)
[21:27:52]
[21:27:52] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:27:52] Build host: amoeba
[21:27:52] Board Type: Nvidia
[21:27:52] Core :
[21:27:52] Preparing to commence simulation
[21:27:52] - Assembly optimizations manually forced on.
[21:27:52] - Not checking prior termination.
[21:27:52] - Expanded 42895 -> 246265 (decompressed 574.1 percent)
[21:27:52] Called DecompressByteArray: compressed_data_size=42895 data_size=246265, decompressed_data_size=246265 diff=0
[21:27:52] - Digital signature verified
[21:27:52]
[21:27:52] Project: 5801 (Run 9, Clone 64, Gen 0)
[21:27:52]
[21:27:52] Assembly optimizations on if available.
[21:27:52] Entering M.D.
[21:27:59] Working on p5801_supervillin_e1
[21:27:59] Client config found, loading data.
[21:27:59] Starting GUI Server
[21:28:10] mdrun_gpu returned
[21:28:10] NANs detected on GPU
[21:28:10]
[21:28:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:28:14] CoreStatus = 7A (122)
[21:28:14] Sending work to server
[21:28:14] Project: 5801 (Run 9, Clone 64, Gen 0)
[21:28:14] - Read packet limit of 540015616... Set to 524286976.
[21:28:14] - Error: Could not get length of results file work/wuresults_07.dat
[21:28:14] - Error: Could not read unit 07 file. Removing from queue.
[21:28:14] Trying to send all finished work units
[21:28:14] + No unsent completed units remaining.
[21:28:14] - Preparing to get new work unit...
[21:28:14] + Attempting to get work packet
[21:28:14] - Will indicate memory of 4094 MB
[21:28:14] - Connecting to assignment server
[21:28:14] Connecting to http://assign-GPU.stanford.edu:8080/
[21:28:15] Posted data.
[21:28:15] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[21:28:15] + News From Folding@Home: GPU folding beta
[21:28:15] Loaded queue successfully.
[21:28:15] Connecting to http://171.67.108.11:8080/
[21:28:15] Posted data.
[21:28:15] Initial: 0000; - Receiving payload (expected size: 43407)
[21:28:16] - Downloaded at ~42 kB/s
[21:28:16] - Averaged speed for that direction ~53 kB/s
[21:28:16] + Received work.
[21:28:16] Trying to send all finished work units
[21:28:16] + No unsent completed units remaining.
[21:28:16] + Closed connections
[21:28:21]
[21:28:21] + Processing work unit
[21:28:21] Core required: FahCore_11.exe
[21:28:21] Core found.
[21:28:21] Working on queue slot 08 [October 28 21:28:21 UTC]
[21:28:21] + Working ...
[21:28:21] - Calling '.\FahCore_11.exe -dir work/ -suffix 08 -checkpoint 15 -forceasm -verbose -lifeline 3952 -version 620'

[21:28:21]
[21:28:21] *------------------------------*
[21:28:21] Folding@Home GPU Core - Beta
[21:28:21] Version 1.15 (Mon Oct 13 11:11:30 PDT 2008)
[21:28:21]
[21:28:21] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:28:21] Build host: amoeba
[21:28:21] Board Type: Nvidia
[21:28:21] Core :
[21:28:21] Preparing to commence simulation
[21:28:21] - Assembly optimizations manually forced on.
[21:28:21] - Not checking prior termination.
[21:28:21] - Expanded 42895 -> 246265 (decompressed 574.1 percent)
[21:28:21] Called DecompressByteArray: compressed_data_size=42895 data_size=246265, decompressed_data_size=246265 diff=0
[21:28:21] - Digital signature verified
[21:28:21]
[21:28:21] Project: 5801 (Run 9, Clone 64, Gen 0)
[21:28:21]
[21:28:21] Assembly optimizations on if available.
[21:28:21] Entering M.D.
[21:28:27] Working on p5801_supervillin_e1
[21:28:28] Client config found, loading data.
[21:28:28] Starting GUI Server
[21:28:39] mdrun_gpu returned
[21:28:39] NANs detected on GPU
[21:28:39]
[21:28:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:28:43] CoreStatus = 7A (122)
[21:28:43] Sending work to server
[21:28:43] Project: 5801 (Run 9, Clone 64, Gen 0)
[21:28:43] - Read packet limit of 540015616... Set to 524286976.
[21:28:43] - Error: Could not get length of results file work/wuresults_08.dat
[21:28:43] - Error: Could not read unit 08 file. Removing from queue.
[21:28:43] EUE limit exceeded. Pausing 24 hours.

This sucks, now I guess I have to reinstall the client to get it to go again, hope that works anyway.
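To check how many times a given project has failed this way, a log like the one above can be scanned with a short script. This is a sketch that only relies on the two line formats visible in the log ("Project: N (Run ..., Clone ..., Gen ...)" and "Core Shutdown: UNSTABLE_MACHINE"); the function name `count_unstable` is made up, and other log layouts are not handled.

```python
import re
from collections import Counter

# Patterns taken from the line formats shown in the posted log.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def count_unstable(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per project number."""
    current_project = None
    failures = Counter()
    for line in log_lines:
        match = PROJECT_RE.search(line)
        if match:
            # Remember the most recently announced project.
            current_project = int(match.group(1))
            continue
        match = SHUTDOWN_RE.search(line)
        if match and match.group(1) == "UNSTABLE_MACHINE" and current_project is not None:
            failures[current_project] += 1
    return failures
```

Feed it the lines of the client's log file (for example `open(path).readlines()`, where `path` is wherever the log lives on your machine) and it returns a per-project tally.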

Insidious · Oct 28, 2008

Originally posted by: Insidious

If you want, you can stop the client, empty the work directory (work folder, queue.dat, unitinfo.txt) and then restart it

you don't have to do a full re-install

Foxery · Oct 28, 2008

Originally posted by: Insidious
They released a new project today..... 5801

It's crashing every installation except for a very few cards. The thing gives out so many EUEs in a short time that the client pauses itself for 24 hours. (4 of my 5 were paused when I came home today)

This exemplifies my recent frustration. Pande Group is full of brilliant people who have created some amazing things, which is why I am doubly baffled when they do incredibly stupid things.

(i.e. New projects are alpha tested. They didn't see this? FFS.)

Now, they have hired some sort of outside contractor to rewrite some of their server code. Let's hope this gives the project the help it needs.

Cutthroat · Oct 28, 2008

Originally posted by: Insidious

Originally posted by: Insidious

If you want, you can stop the client, empty the work directory (work folder, queue.dat, unitinfo.txt) and then restart it

you don't have to do a full re-install

Yeah, I did this and it started up OK again.
