Recent Changes in projects


mmonnin03

Senior member
Nov 7, 2006
248
233
116
The MW@H tasks are highly variable. Some last 1 second; I had another take 5,300 seconds (×16), as they use up to 16 threads.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Yes. The "Milkyway@home N-Body Simulation v1.83" tasks came in two types, one shorter and one longer, with rather little variability of CPU time within each type. In contrast, the few tasks of the new "Milkyway@home N-Body Simulation with Orbit Fitting v1.86" which I have completed so far (28) have CPU times all over the place. Tasks with less than a second of CPU time are quite frequent, too. It looks like one of those physics projects in which workunits with physically nonsensical starting values are generated often and need to be weeded out by the clients.

The longest one I have had so far took 10 hours of CPU time, and server_status.php agrees with that. (server_status.php says "runtime", not CPU time, but I am under the impression that the figure there is actually the latter.)
The MW@H tasks are highly variable. Some last 1 second; I had another take 5,300 seconds (×16), as they use up to 16 threads.
It may make sense to restrict the number of program threads to far fewer than those 16. On modern CPUs, even running them single-threaded looks feasible. I haven't taken the time yet to measure what a good thread count for host efficiency would be.

Edit: Oh, I just realized that I have yet to extend my app_config.xml to the new application. I run v1.83 with 4 threads for now. The new one executes with its default 16 threads on my computers, and the very low CPU time : run time ratios of the completed tasks show that it is just as inefficient as the previous application was.
 
Last edited:

Skillz

Senior member
Feb 14, 2014
970
999
136
YAFU is another one of those projects with "multi-threaded" apps, but multithreading doesn't make much sense with them.

I have a 64-core/128-thread EPYC running 1 task at a time on it. Checking top, I see that it's using as many as 100 threads at once at times; sometimes it dips to around 70, but it's always a good bit above the 64 it should be at.

My points are all over the place despite this.


I even started a second BOINC instance on the same host to run some ODLK on 32 threads. It's just wild how vastly the run times differ on YAFU, and how much the PPD varies as well.
 
Reactions: crashtech

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
SNAFU@Home uses CreditNew, but the CreditNew algorithm can never converge to a proper host performance estimate because of the dramatic and unpredictable differences between the workloads of the workunits. During the run time of the application, there are different code portions with different degrees of parallelization, from single-threaded to perfectly scaling n-threaded. But how much of each portion occurs during a run, and even whether some of these portions occur at all, differs between workunits and can't be predicted. (The application is actually pieced together from a few different programs, some serial and some parallel.)
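As a rough illustration of why a fixed high thread count hurts here, Amdahl's law bounds the speedup by the serial portion. A minimal sketch; the parallel fractions below are made-up examples, not measured values from the application:

```python
# Amdahl's law: speedup on n threads for a workload whose parallel
# fraction is p. The p values below are illustrative only; per the
# discussion above, the real mix varies unpredictably per workunit.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):
    print(f"parallel fraction {p}: 16 threads give {speedup(p, 16):.1f}x")
```

With a 50% serial portion, 16 threads yield less than a 2× speedup while still occupying 16 scheduling slots, which is exactly the kind of low CPU time : run time ratio discussed in this thread.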

A while ago when I ran YAFU in Formula Boinc, I got good host utilization and good RAC by 1.) restricting the maximum number of threads per task, and 2.) overcommitting the host's hardware threads. teamanandtech.org may have some info.
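For reference, that kind of tuning can be done with an app_config.xml. A sketch only; the app name "yafu" and the numbers are assumptions to be checked against the project's actual app name and your host:

```xml
<app_config>
    <app>
        <name>yafu</name>
        <!-- 1.) cap the number of concurrently running tasks -->
        <max_concurrent>4</max_concurrent>
    </app>
    <app_version>
        <app_name>yafu</app_name>
        <plan_class>mt</plan_class>
        <!-- 2.) declare fewer CPUs per task than the app actually spawns,
             so BOINC schedules more tasks and overcommits the hardware threads -->
        <avg_ncpus>24</avg_ncpus>
    </app_version>
</app_config>
```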
 
Last edited:
Reactions: crashtech

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
MilkyWay@Home
I just realized that I have yet to extend my app_config.xml to the new application.
Example syntax:
XML:
<app_config>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>4</avg_ncpus>
        <cmdline>--nthreads 4</cmdline>
    </app_version>
    <app_version>
        <app_name>milkyway_nbody_orbit_fitting</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>4</avg_ncpus>
        <cmdline>--nthreads 4</cmdline>
    </app_version>
</app_config>
 
Reactions: crashtech

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
MilkyWay@Home
– added a single-threaded app_version of "NBody Simulation" and "N-Body Simulation with Orbit Fitting"
– added these options to project preferences:
Max # of jobs for this project [default: No limit; options: 1…100]
Max # of CPUs for this project [default: No limit; options: 1…256]

I am not sure if the latter really means CPUs per project, or actually threads per task.

The single-threaded app_version supposedly performs slightly better if you run just 1 thread per task. According to the admin, the single-threaded app_version is sent to clients which set Max # of CPUs = 1.

Unfortunately, by default folks will now have a wild mixture of multithreaded and single-threaded tasks in their workqueues. It is an open question how to prevent the downloading of single-threaded tasks if a user prefers to run multi-threaded tasks only (apart from writing an app_info.xml).

(IOW, it's another fine mess.)
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Unfortunately, by default folks will now have a wild mixture of multithreaded and single-threaded tasks in their workqueues. It is an open question how to prevent the downloading of single-threaded tasks if a user prefers to run multi-threaded tasks only (apart from writing an app_info.xml).
On a quick look, it seems as if switching "Max # of CPUs for this project" from "No limit" to something greater than 1 causes the server to send only multithreaded tasks to the host.
 
Reactions: Skillz

Fardringle

Diamond Member
Oct 23, 2000
9,199
765
126
Is there a way to get ONLY the single-threaded tasks? The project preferences on the site don't seem to have that option.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Try setting "Max # of CPUs for this project" = 1. This should give you single-threaded tasks only. (But should still give you as many tasks at once as you possibly could want. Unless you dial down "Max # of jobs for this project".)

If the "Max # of CPUs for this project" setting at 1 does not work the way you want, add the app_config.xml from #232 and change <avg_ncpus> and <cmdline> of both appversions to 1 thread. But this really shouldn't be needed anymore.
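Changed that way, the #232 config would look like this:

```xml
<app_config>
    <app_version>
        <app_name>milkyway_nbody</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>1</avg_ncpus>
        <cmdline>--nthreads 1</cmdline>
    </app_version>
    <app_version>
        <app_name>milkyway_nbody_orbit_fitting</app_name>
        <plan_class>mt</plan_class>
        <avg_ncpus>1</avg_ncpus>
        <cmdline>--nthreads 1</cmdline>
    </app_version>
</app_config>
```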

And for completeness: As @Skillz pointed out, hosts which only have 1 active logical CPU should now receive (singlethreaded) tasks. Before the new singlethreaded appversion was added by the admin, such hosts did not receive any work at all.

________
OCD edit, the "for this project" words have been removed from the preferences page in the meantime. Or they never were there and I merely imagined them. — Edit 2: It's "Max # of CPUs" on the preferences page and "Max # of CPUs for this project" on the edit preferences page. :-)
 
Last edited:

Fardringle

Diamond Member
Oct 23, 2000
9,199
765
126
Try setting "Max # of CPUs for this project" = 1. This should give you single-threaded tasks only. (But should still give you as many tasks at once as you possibly could want. Unless you dial down "Max # of jobs for this project".)

If the "Max # of CPUs for this project" setting at 1 does not work the way you want, add the app_config.xml from #232 and change <avg_ncpus> and <cmdline> of both appversions to 1 thread. But this really shouldn't be needed anymore.

And for completeness: As @Skillz pointed out, hosts which only have 1 active logical CPU should now receive (singlethreaded) tasks. Before the new singlethreaded appversion was added by the admin, such hosts did not receive any work at all.

________
OCD edit, the "for this project" words have been removed from the preferences page in the meantime. Or they never were there and I merely imagined them.
I thought I had tried that previously and it didn't work, but I'll try it again when I'm in the mood to run MilkyWay again.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
gaia@home
The master URL changed from http://150.254.66.104/gaiaathome/ to https://gaiaathome.eu/gaiaathome/. I presume you can continue to use the old URL for the time being. But it's better to detach and re-attach the project (at a time when you don't have any gaia@home work or unreported results queued on the host).

After a long pause, the project is back with a new application and new work since yesterday. Credit is fixed to 200/result but CPU time is very variable. Minimum quorum is 1 (i.e. near instant validation).

@mmonnin03, I fetched one task and the workunit still has the 86,400 GFLOPs rsc_fpops_bound which caused some tasks to error out with "exceeded elapsed time limit". Are you working around it locally, or is there a fix in place at the project which I am missing?
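For context, the client derives that abort threshold roughly as rsc_fpops_bound divided by the host's benchmarked speed (the real calculation includes further correction factors). A sketch with an assumed 5 GFLOPS host:

```python
# Rough sketch of where "exceeded elapsed time limit" comes from:
# the limit scales like rsc_fpops_bound / benchmarked host speed.
# The 5 GFLOPS benchmark is an assumed example, not a measured host.
rsc_fpops_bound = 86_400e9   # 86,400 GFLOPs, as in the workunit above
host_flops = 5e9             # assumed Whetstone benchmark of the host
limit_s = rsc_fpops_bound / host_flops
print(f"abort after about {limit_s:.0f} s ({limit_s / 3600:.1f} h)")
```

A faster benchmark shortens the allowed wall time correspondingly, which would fit tasks aborting at different run times on different hosts.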

Edit, CPU time is not only variable, it is also grossly inconsistent for one and the same workunit. If you look at various workunits which had an error result on a host with very fast hardware, it is often the case that the subsequent successful result took less time on a host with weaker hardware. Perhaps something to verify with repeated offline tests of one and the same workunit. Maybe there are starting conditions randomly generated locally by each task.
 
Last edited:
Reactions: biodoc

mmonnin03

Senior member
Nov 7, 2006
248
233
116
Ah yeah, it did have an IP. I still have PCs attached via the old IP-based URL.

I had a batch early on where they all aborted at around 2:50 or 3 hr and some go on past 4hr.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
PS, according to the gaia@home message board, they are still working out various kinks with the new application and workunit batches. One among various problems is that run time may be quite long, yet checkpointing is not implemented. That is, in the unlikely event of a system reset or the more common case that tasks are suspended to disk rather than held in RAM, they lose their computing progress and start from scratch when resumed.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Amicable Numbers
Boinc Games sprint effect?
On 28 Oct 2024 Sergei Chernykh said:
WU size and credits are doubled now
First of all, thanks to everyone participating in the project!

I've noticed today that solved work units come in so fast that the server can't keep up. I had to double the size of each work unit (together with the credit for it) to reduce the load on the server.

Keep crunching! Thanks again!
Looks like a good change though independent of contest situations, IMO.
(Amicable Numbers message board)
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Nice! Perhaps the CPU-only stretch at the beginning of each task stayed the same? (I joined the sprint only after the change had already been made.)
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
MilkyWay@Home
The SSL certificate of milkyway.cs.rpi.edu expired. :-(
This makes HTTPS connections impossible.
Plain old HTTP connections don't work either (with a web browser at least; I haven't tried BOINC), since the site enforces HTTPS via HTTP Strict Transport Security (HSTS).

Edit:
I tried BOINC now: Set it to no new work, shut it down, edited MW@H's scheduler URL in client_state.xml from https:// to http://, restarted BOINC, forced a project update, and it succeeded reporting some results which were stuck in the queue.
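The core of that edit is a plain string replace, sketched here on a sample line. In practice, apply it to client_state.xml only while the client is shut down:

```python
# Demonstrate the scheduler-URL downgrade described above on a sample
# line; the real edit rewrites the same pattern inside client_state.xml
# while the BOINC client is stopped.
sample = "<scheduler_url>https://milkyway.cs.rpi.edu/milkyway/</scheduler_url>"
fixed = sample.replace("https://", "http://", 1)
print(fixed)
```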
 
Last edited:
Reactions: crashtech

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Should be fixed: The SSL certificate of MilkyWay@Home's server was renewed about a day ago.
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
World Community Grid
Right now I have all sorts of problems getting units from WCG, and a bunch are stuck in downloading and have been for hours.
They restarted the African Rainfall Project (arp1) last week. Each task has more than ten input files, several of them tens of megabytes in size. The result files of arp1 (seven files per result) are even larger, IIRC. This is probably just too much for Krembil's internet connection.

Each arp1 task takes on the order of half a day to complete; that is, hosts process arp1 tasks an order of magnitude slower than e.g. mcm1 tasks. This reduces the rate of HTTP transactions due to arp1 somewhat, but the data rate is nevertheless much higher. I read elsewhere that WCG, after the move to Krembil and for as long as the arp1 project was active, had about the same issues as now whenever they submitted a new arp1 work batch.
 
Last edited: