After months, I just tried Folding@Home, and it's desolate.


StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
It's not my Internet link at home alone which is at fault, as I noted in the other thread. It must be something with the whole route which makes F@H's work servers and collection servers (or F@H's gateways or whatever) break up my uploads randomly.
On third thought, it could also be a router within my ISP's network which cuts me off randomly. Or it could be a combination of some sort of traffic shaping by my ISP and the F@H servers reacting badly to that. I have no way to know. But so far I have never noticed other cases which would indicate that my ISP discriminates against certain traffic.

One of the two results from Sunday went through on Monday morning, BTW, after a large number of retries by the client, of course; I haven't counted. The other result still wasn't uploaded as of this morning.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
(Continuing this here rather than in the stats thread…)
CPDN's upload server (upload7.cpdn.org) is presumably crowded and I've got two trickle files for each task in the client's upload retry queue by now. (But there are also two successful trickles listed for each of my two tasks on the web site, so it's not completely hopeless.)
Uhm, no, it is hopeless for me: While the trickles are listed on the web site and got credit granted normally, the files of these trickles haven't been uploaded yet (one ~95 MB file per trickle). Meanwhile, I've got no such problems with PrimeGrid (never had). One big difference: PrimeGrid's server is located here in Germany.

traceroute www.primegrid.com shows me 13 hops, and ~41 ms roundtrip times of the TTL probes to the very last one.

traceroute -m 200 upload7.cpdn.org shows me 200 hops (obviously bogus), and ~270…350 ms roundtrip times to the last visible hop, which is the 15th at 141.223.253.61 (upload7.cpdn.org resolves to 141.223.16.156).

sudo traceroute -I upload7.cpdn.org shows me 18 hops, and ~330 ms ICMP roundtrip time to the very last one, which is eawah.postech.ac.kr (141.223.16.156).

That is, I've got issues shipping large files either to the US (Folding@Home) or to South Korea (CPDN, current WAH2 batch).

I'll have to look into whether client-side settings can reduce the connection losses.

Edit:
I took traceroutes to two work servers and one collection server of F@H while no transfer was going on. The traceroutes showed ~16…20 ms up to Frankfurt, then jumped to ~105…110 ms on the very next hop, at either Boston or Philadelphia. From there, latency practically didn't get any higher until the last responding hosts in the routes were reached.
I repeated this with traceroute -I now:
work server 131.239.113.97: 15 hops, ~110 ms
work server 158.130.118.24: 16 hops, ~135…150 ms
collection server 158.130.118.26: 16 hops, ~110 ms

However, neither 110 nor 150 nor even 330 ms seems so bad to me when all that needs to be accomplished is a large file transfer without quality-of-service requirements.

Edit 2:
I'll have to look into whether client-side settings can reduce the connection losses.
Hmm. The default values of the most relevant transfer-related options should already be quite resistant to high latencies during transfers:
<http_transfer_timeout>300</http_transfer_timeout> (Abort HTTP transfers if idle for 300 seconds. 300 is the default according to the documentation.)
<http_transfer_timeout_bps>10</http_transfer_timeout_bps> (An HTTP transfer is considered idle if its transfer rate is below 10 bits per second. The default value is not documented, but 10 seems to be it, judging by the initial population of cc_config.xml files.)
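For reference, a minimal sketch of how these two options sit in a cc_config.xml (both values shown are the defaults discussed above):

<cc_config>
  <options>
    <!-- abort an HTTP transfer after 300 seconds of idleness -->
    <http_transfer_timeout>300</http_transfer_timeout>
    <!-- consider a transfer idle while its rate is below 10 bits/s -->
    <http_transfer_timeout_bps>10</http_transfer_timeout_bps>
  </options>
</cc_config>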
 
Last edited:

Skillz

Senior member
Feb 14, 2014
970
999
136
What's your connection look like going to teamanandtech.org and 192.111.155.210? (Both are servers I own.)
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
I guess I'll check this tomorrow, because right now the link is fully utilized with mass uploads to PrimeGrid, and it's getting late in the day.
 
Reactions: Skillz

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Oh, and while several PrimeGrid attached hosts are (flawlessly) spamming PG with myriads of result files, the laptop is slowly but steadily working on two active uploads of the CPDN trickle files. The speeds of these two uploads are down to a few kB/s each now, but from the looks of it, they are not breaking up!

I am going to configure a transfer speed limit in the CPDN attached client tomorrow and see if it helps with connection stability to South Korea.

Edit:
Nope, the two uploads failed again eventually at different points in time, both with the usual nondescript "transient HTTP error".
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
What's your connection look like going to teamanandtech.org and 192.111.155.210? (Both are servers I own.)
teamanandtech.org: 19 hops, ~135…165 ms roundtrip times
192.111.155.210: 20 hops, ~135…200 ms roundtrip times
from three quick tests right now, with moderate PrimeGrid traffic in the background
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
I am happy to report that I succeeded in uploading two ~95 MB trickle files to upload7.cpdn.org. The conditions under which this happened:
– Circa half of my internet link is used by the PrimeGrid attached hosts for normal PPSE operation. I.e., I am currently not in recovery from a previous internet outage.
– The CPDN client is configured to the default of 2 simultaneous transfers per project, and to a non-default maximum transfer speed of 50 kB/s (see the sketch below). That is, the two files were uploaded in parallel at 25 kB/s each.
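(For the record, a sketch of how such a cap can be set, assuming the global_prefs_override.xml route; the same limit can also be set in the manager's computing preferences. The value is in bytes per second, so 50 kB/s = 50000:)

<global_preferences>
  <!-- limit upload speed to ~50 kB/s -->
  <max_bytes_sec_up>50000</max_bytes_sec_up>
</global_preferences>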
I am trying 60 kB/s for the next couple of pending files…
Edit: There still remains a chance of transfers failing with "transient HTTP error". But it's better than before, when no transfers succeeded at all.

I don't think FAHClient has got a similar option to throttle the upload speed.

As reported in #9, my internet link is performing at ~8 Mbit/s = ~1000 kByte/s for large transfers to German peers.
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
Overnight, eight more of CPDN's ~95 MB trickle files were uploaded. Just two files remain in retries, waiting for CPDN's upload handler to release file locks.

So it seems that if I keep my aggregate upstream network utilization somewhere below the limit which my ISP imposes, BOINC's HTTP errors are reduced to a level which gets me reasonably going with sites on other continents.

I still have no idea how I could test this theory with FAHClient. Maybe a VPN with a server in Germany — or just a SOCKS5 proxy, or even just an HTTP proxy located in Germany — could get me going with F@H instead. (Because, as mentioned, large and/or many transfers to PrimeGrid's BOINC server work for me.)

The central mystery remains: I did not have these issues when I was last active at F@H, which was in January 2023. What has changed since then?
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
TL;DR, latency of my internet link seems to be the troublemaker. I can work around it by limiting the transfer speeds on my hosts. This is directly supported by BOINC but not by FAHClient. For the latter, I can resort to external tools.

The story so far:
– Upload transfers always fail after a random percentage (anywhere between 3 % and 30 %, mostly at circa 10 %).
– Maybe it's the high latency of my connection (>400 ms average and >1000 ms max during uploads).
– The ISP seems to have fixed something in the meantime. Latency during uploads is now at 80 ms avg, 100 ms max. Transfers to PrimeGrid with 30 MB payload go through without issue. I haven't retried Folding@Home yet.
– It's no use: I made two results today but can't get them sent. The client repeatedly reports the transfer as failed after a random upload percentage.
I am now trying wondershaper. This is a reasonably convenient command-line tool which was developed to reduce internet latencies caused by overly large buffers at some ISPs or in consumer-level DSL and cable modems. It is implemented as a wrapper script around the tc command-line tool.

The computer on which I am testing runs Linux Mint 21.1. This, like a variety of other recent Linux distributions, no longer ships with the CBQ networking scheduler which is required by older versions of wondershaper, including the one available in Mint 21.1's stock software repositories. #-( Therefore I downloaded the most recent version of wondershaper from https://github.com/magnific0/wondershaper instead, which uses the HTB networking scheduler.

When I set wondershaper to limit download and upload speeds to about 90 % of my rated connection speeds, latency during upload goes down from ≈80 ms to ≈25 ms in the Cloudflare speed test vis-à-vis European Cloudflare servers.
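(The invocation looks roughly like this; a sketch, where enp3s0 and the two rates stand in for the actual adapter name and ~90 % of the rated down/up speeds, in kbit/s for this version of wondershaper:)

# shape the interface to ~90 % of the rated speeds (values in kbit/s)
sudo wondershaper -a enp3s0 -d 90000 -u 7200
# show the current shaping status
sudo wondershaper -s -a enp3s0
# remove the limits again
sudo wondershaper -c -a enp3s0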

Today I tried one F@H result upload without wondershaper; it needed several retries to eventually succeed. Then I proceeded with more F@H workunits with the 90 % speed limit in place (on one and the same dual-GPU host, without notable internet traffic caused by other hosts at the same time), and the first five results so far went up without retries.

Edit: And several more result files were uploaded without a need for retries.
 
Last edited:
Reactions: Skillz

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
TL;DR, latency of my internet link seems to be the troublemaker.
Throttling the networking traffic locally by means of wondershaper — thus circumventing the bad throttling/buffering performed somewhere along my internet link, whether by the modem router at home or by equipment further upstream — also fixed another "upload" problem of mine:

The BOINC client's option to limit upload speed affects file transfers, but evidently not scheduler requests. Yet when a host has a large number of tasks in progress at a single project, the message size of scheduler requests gets large too. Thus, scheduler requests get mangled by my internet link such that they time out.

Today I realized that I can avoid this problem with BOINC scheduler requests in the same way as my F@H upload troubles: by capping the host's upload speed using wondershaper.
(success report at the MilkyWay@Home message board)

PS:
A more thorough solution for when I am using several computers at once would be to set up a router of my own (in addition to or instead of the l4m3 modem router which my ISP supplied), to shape the traffic of all hosts combined, not per host.
 
Last edited:
Reactions: Skillz

Skillz

Senior member
Feb 14, 2014
970
999
136
PS:
A more thorough solution for when I am using several computers at once would be to set up a router of my own (in addition to or instead of the l4m3 modem router which my ISP supplied), to shape the traffic of all hosts combined, not per host.

That does sound like an overall better solution. Just set the config in a router that supports it and you shouldn't have to worry about it anymore.

Most modern consumer routers these days have QoS settings where you can easily set the upload/download cap per project URL/IP/hostname; that should suffice, I would think. This way it won't affect other things you do with the internet connection, such as downloading Linux ISOs, streaming videos, etc.
 

StefanR5R

Elite Member
Dec 10, 2016
6,056
9,106
136
My journey continues:

As my differing experiences with Folding@Home and PrimeGrid have already hinted, how much I need to throttle uploads differs with the target servers, or rather, with the network paths to the target servers and the overall latency.

I have one CPDN WAH2 task running right now. The trickle files need to go to upload7.cpdn.org = eawah.postech.ac.kr, that is, across quite a lot of network hops. I can only get this working if I restrict the upload speed to less than 1/10th of what worked with MilkyWay@Home and Folding@Home! :-(

My current workaround for CPDN is the simplest possible one: I set this low upload speed limit right in the BOINC client instance which runs this CPDN task. (This instance happens to run CPDN only, therefore no other project is impaired by this extraordinarily low speed limit. It's a Windows BOINC binary running on Linux by means of Wine.) In contrast, the wondershaper script, as it is, does not explicitly support different throttling rates based on target addresses. But it does support a kind of prioritization of outbound traffic to different targets (in the form of netmasks), showing that the underlying kernel interface could most likely be used to define different upload speed caps for different destinations; see the sketch below.
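(For illustration, an untested sketch of what such a per-destination cap could look like with the HTB scheduler via tc directly; interface name and rates are placeholders:)

# root HTB qdisc; unclassified traffic falls into class 1:20
tc qdisc add dev eth0 root handle 1: htb default 20
# parent class capped at ~90 % of the uplink
tc class add dev eth0 parent 1: classid 1:1 htb rate 7200kbit
# a slow class just for the far-away upload server
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 40kbit ceil 40kbit
# everything else may use the remaining bandwidth
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 7000kbit ceil 7200kbit
# steer traffic to upload7.cpdn.org (141.223.16.156) into the slow class
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dst 141.223.16.156/32 flowid 1:10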

However, this is all quite ridiculous.
– Experimenting which speed works towards which destination is not a good solution.
– Setting a general conservative speed cap which works even with the worst destination encountered so far is not a good solution.
– Testing a different Internet Service Provider (notably DSL instead of TV cable), of which I only know beforehand that it will offer less theoretical networking performance for the same monthly fee, or more typically for an even higher one, is not a good solution.
– Sitting it out until an undetermined future time at which the TV cable gets replaced by FTTH is not a good solution, as I still have no solid information on when this is going to happen.

This problem with my Internet link started (or maybe didn't actually start, but worsened a lot) at some unknown point in time during 2023. Perhaps my ISP changed some networking equipment on their side at that time. Or perhaps a worse firmware was installed on the modem router at my home around that time; I have no control over this.

Edit: *If* the modem router is at fault, then replacing it with a better cable modem router could be a solution. I would need a certain amount of cooperation from my ISP for this, but it might be doable. In contrast, adding a router which sits between my clients and the ISP's modem router is evidently not going to be a good solution, because it would be hampered by the same need to test which speed works towards which destination.
 
Last edited: