SETI@Home Wow!-Event 2018


StefanR5R

Elite Member
Dec 10, 2016
Once Seti@home gets its servers back up, I'll be in
The userid required for registration can also be found locally on your host, in the BOINC data directory, in a file called sched_reply_setiathome.berkeley.edu.xml.

I could have switched over to PrimeGrid and crunched with them... But normally I try to use the SETI downtime for my own downtime (cleaning fans, dusting out the radiators, or doing upgrades), and oftentimes I'm not really around to check when the rig is thirsty. I feel a bit guilty about starting and stopping work on different projects whenever one goes down for a while, unless it were somehow automated and I didn't really have to do anything. I suppose I could split the difference, and I know some people do that. Though if you did split it 50/50 and SETI goes down, does that pool the resources to 100% for the other project?
Resource splits by percentages virtually never work as expected. The BOINC client has a complicated and less-than-intuitive algorithm for deciding which of the enabled projects to run, involving the resource percentages, the "recent" history of per-project credit, task deadlines, and whatnot.

But for the use case of running SETI@home whenever possible and a second project only during SETI's downtime, the following should work reliably:
  • Leave SETI@home's resource share at a non-zero value, e.g. 100 %.
  • Set the second project to a resource share of 0 %. (Yes, zero percent.)
  • Leave the client's queue settings (store at least __ days of work, etc.) at values which you normally use when running SETI@home.
The 0 % setting tells the client never to cache tasks for the second project in advance. It will fetch work for the second project only if it fails to get work from the primary project, or if the user suspends the primary project. And as soon as the primary project has work again (and is not suspended), the client will finish the currently running work of the second project and switch back to the primary project without the user needing to intervene.
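(For reference: the resource share is set in each project's web preferences; the client merely mirrors the value in the per-project account file in the BOINC data directory and, as far as I know, rewrites that file at the next scheduler contact, so change it on the website rather than on disk. A trimmed sketch of what it ends up looking like locally, with illustrative file names and values:)
Code:
<!-- account_setiathome.berkeley.edu.xml : primary project -->
<account>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <resource_share>100.000000</resource_share>
</account>

<!-- account_www.primegrid.com.xml : 0 % backup project -->
<account>
    <master_url>http://www.primegrid.com/</master_url>
    <resource_share>0.000000</resource_share>
</account>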

There are some additional considerations if you care for the primary project to resume as soon as possible after the server downtime:
  • The secondary project should be one whose tasks take a few hours at most. Multi-day tasks would obviously be a bad filler.
  • The client needs to continue requesting work from SETI@home's server during the downtime, at reasonable intervals between requests. Here I am not sure whether the client keeps asking, say, at ~1 hour intervals, or whether it automatically backs off to intervals of multiple hours or even days. (In the latter case, manually forced project updates or a little bit of automation are required in order to get back to SETI in a timely fashion.)
Sorry, I'm getting off topic here for this Wow! event. Hopefully they don't pull 2.5 days of downtime during it.
IMO it's perfectly on-topic. I am still undecided myself about how I will manage Maintenance Tuesday (Is it still on Tuesdays?) during the Wow!-Event. Let the hosts sit idle? Probably not, but maybe if the weather is hot. Set up multiple clients per host, as Tony suggested? If so, do I run them in parallel or one after the other? I am not sure, but I believe I had several clients running in parallel during maintenance downtime at last year's Wow!-Event. Or do I take it easier with SETI this time and run a 0 % backup project instead? ... Decisions, decisions.

Because the scheduler sees them as 1 device, no matter how many CPU cores there are. With GPUs, however, you can say you have 3 and you'll get 300 WUs.
Oh right, this would be another alternative: Patch the client to have a GPU equivalent of cc_config.options.ncpus.
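For comparison, the CPU-side option I mean is just this; a minimal cc_config.xml sketch (the count of 16 is only an example), which the client picks up at restart or via Options → Read config files in the Manager:
Code:
<cc_config>
  <options>
    <!-- report 16 logical CPUs to the scheduler, regardless of what the host really has -->
    <ncpus>16</ncpus>
  </options>
</cc_config>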
 

Kiska

Golden Member
Apr 4, 2012
Oh right, this would be another alternative: Patch the client to have a GPU equivalent of cc_config.options.ncpus.

You may specify this in cc_config using:
Code:
<coproc>
  <type>some_name</type>
  <count>1</count>
  <device_nums>0 2</device_nums>
  [ <peak_flops>1e10</peak_flops> ]
  [ <non_gpu/> ]
</coproc>
 
Reactions: StefanR5R

StefanR5R

Elite Member
Dec 10, 2016
If the client says it has 11 GPUs, it should get 1000+ tasks, right? (The client won't request more tasks once it has 1000 or more runnable tasks.)

My biggest GPU host has three 1080 Tis, which probably means that I still need to employ multiple client instances, or should look into patching the client.
 

ericlp

Diamond Member
Dec 24, 2000
IMO it's perfectly on-topic. I am still undecided myself about how I will manage Maintenance Tuesday (Is it still on Tuesdays?) during the Wow!-Event. Let the hosts sit idle? Probably not, but maybe if the weather is hot. Set up multiple clients per host, as Tony suggested? If so, do I run them in parallel or one after the other? I am not sure, but I believe I had several clients running in parallel during maintenance downtime at last year's Wow!-Event. Or do I take it easier with SETI this time and run a 0 % backup project instead? ... Decisions, decisions.


Oh right, this would be another alternative: Patch the client to have a GPU equivalent of cc_config.options.ncpus.

Yes, it is still Tuesdays. Lately they have been pretty quick and get the servers back up in a timely fashion. The problem is... you just never know when a surprise extended outage may occur. They don't really communicate well about downtime, so that makes it even harder to schedule around their maintenance.

Do other projects have these issues? Or is this just a seti thing?
 

StefanR5R

Elite Member
Dec 10, 2016
Do other projects have these issues? Or is this just a seti thing?
Of the big projects, World Community Grid never seems to have an outage. (I run it only occasionally, hence may have missed some.) Folding@home has work servers at several sites, and when some of them are down, the F@H client usually manages to switch to another, working one.

Of the smaller projects, PrimeGrid stands out as exceedingly well run and available all the time.

All other projects seem to have their occasional sudden downtimes, often just very brief, sometimes longer (even months). Their habits of communicating these things vary wildly. Projects which are run by small university departments typically rely on campus IT, and may not always be kept very well informed about what's going on in their IT department either.

Regarding SETI@home's regular maintenance downtime plus the occasional unpredictable, unexplained outage, two projects which are run in a strangely opposite mode come to mind:
  • Whereas SETI@home is down for some hours every Tuesday, XANSONS for COD is up for some hours only every other Saturday.
  • Whereas SETI@home is down for some days when nobody expects it, MindModeling@Home (Beta) is up for some days when nobody expects it.
Like SETI@home's twins from an evil mirror universe.
 

[H]Coleslaw

Member
Apr 15, 2014
PrimeGrid's servers get free hosting from Rackspace, which is pretty reliable. WCG is now also hosted on IBM's cloud infrastructure, and most of its downtime is planned outages. Both projects do really well at notifying donors in advance.
 

ao_ika_red

Golden Member
Aug 11, 2016
You may specify this in cc_config using:
Code:
<coproc>
  <type>some_name</type>
  <count>1</count>
  <device_nums>0 2</device_nums>
  [ <peak_flops>1e10</peak_flops> ]
  [ <non_gpu/> ]
</coproc>
I need to learn how to use this.
Now it says:

Code:
01/08/2018 21:05:26 |  | Unrecognized tag in cc_config.xml: <coproc>
 

Kiska

Golden Member
Apr 4, 2012
I need to learn how to use this.
Now it says:

Code:
01/08/2018 21:05:26 | | Unrecognized tag in cc_config.xml: <coproc>

Are you putting the coproc tag in the <options> tag?

i.e.
Code:
<options>
<allow_multiple_clients>1</allow_multiple_clients>
<allow_remote_gui_rpc>1</allow_remote_gui_rpc>
<coproc>
  <type>some_name</type>
  <count>1</count>
  <device_nums>0 2</device_nums>
  [ <peak_flops>1e10</peak_flops> ]
  [ <non_gpu/> ]
</coproc>
</options>
 

ao_ika_red

Golden Member
Aug 11, 2016
I tried it earlier today but BOINC Manager kept saying "gpu missing".
FYI: I replaced "some_name" in <type>some_name</type> with ATI.
 

Ken g6

Programming Moderator, Elite Member
Dec 11, 1999
I'm using the Zi3v CUDA 9.0 app. I heard it gets fewer invalid results than others. Are these apps newer and better?
 

ericlp

Diamond Member
Dec 24, 2000
Thanks. It seems there's quite a lot of development on the GPU app (I've already used it since the SETI Wow! event last year). I'll try it later.
One question: do we have to uninstall the old Lunatics app?

I would; it wouldn't hurt to just flush out the settings and redo them. Easy, fast and clean...
 

lane42

Diamond Member
Sep 3, 2000
There is a newer app for Volta-class GPUs (Titan V): x41p_V0.96 special, CUDA 9.2.
The guy's running 30-35 seconds a work unit, almost 3x faster than my 1080 Ti.
I think it's still in testing; just a few have it.
 
Reactions: TennesseeTony

StefanR5R

Elite Member
Dec 10, 2016
@lane42, is it known whether it makes any difference in throughput at all, compared to simply running multiple CUDA 9.0 jobs in parallel?
 

StefanR5R

Elite Member
Dec 10, 2016
I mean better card utilization via <ngpus>0.##</ngpus> in app_config.xml.
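For illustration, a minimal app_config.xml sketch of that idea, placed in the project's folder under the BOINC data directory; I use the equivalent <gpu_usage> form here, and the app name is an assumption (check client_state.xml for the exact one). A value of 0.5 lets two tasks share one card:
Code:
<app_config>
  <app>
    <name>setiathome_v8</name>        <!-- assumed app name; verify in client_state.xml -->
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>      <!-- two GPU tasks share one card -->
      <cpu_usage>1.0</cpu_usage>      <!-- reserve one CPU core per GPU task -->
    </gpu_versions>
  </app>
</app_config>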

BTW, it is faster, but not 3 times faster than a 1080 Ti:
petri33 at the setiathome forum said:
If and only if the current surge of Gbt vlar (blc162bit guppi) WUs keep coming on I'll be hitting more than or over 400 000 credit a day.
{
Titan V: 33 seconds
1080Ti : 58 seconds
1080 (*2): 77 seconds
}

A reduction of memory writes and subsequent reads in the first phase of pulse calculations has had an effect of a 25% speed upgrade in all vlar tasks.
To differentiate the good work from some development stages I have renamed the latest results to x41p_V0.93.

And yes. I know I have 2800 inconclusives. It will drop.
source, found via list of user posts, which I looked up after coming across another post

The ratio shown there is reasonable if we go by shader count: 5120 vs. 3584 vs. 2560. (Though chip clock matters too of course.)

PS: it is not clear to me whether petri33 refers to task run times or to the mean time between task completions.
 