Geoff's Stats : To-Do List ** OPEN TO EVERYONE **

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
So... Sunday morning I get on a plane to head to a conference in Texas returning late Wednesday night. I'm going to have a bit of time on my hands! Here's what I want to do, and feel free to add to this list!

FAD
- DONE add some milestones to the daily team stats page (mimicking the certificates) (only on the summary page that gets used for the daily team stats... on everyone's team page tomorrow)
- change the update to 4 times/day to match the official stats
- STARTED overhaul the first page, and the first members and teams listings. I've added sort capabilities to the Teams-At-A-Glance page, and I suspect that will ultimately replace the Team page. I will be making the same changes to the Members-At-A-Glance page.
- DONE add selection criteria to the show nodes page (dropdowns for the date and report type)
- overtake values for within the team
- change the team overtake calc... drop data points in a 20-30 day range that are more than 1 standard deviation from the mean over that period of time.
- do the same for individuals... the threats & opportunities are grossly influenced by name changes, etc... Perhaps don't even include members that have fewer than 10 days of results in the list... then the outlier data should be ignored (meaning basically the first day after a name change)
- get on to a GMT schedule instead of EST to get in sync with the official stats
- account for commas and HTML characters in a member name... currently those get kicked out of the daily update
- on nodes pages, display last available node data (for example, today, Saturday, there have been no historyxxxx.csv files available)
- merge members with name;xx into name (there were 1154 in the 2005-03-18 stats)
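
A rough sketch of the name;xx merge from the last item, in Python. It assumes the suffix is whatever follows the first semicolon and that merging simply means adding the points together; the function name and the (name, points) row layout are hypothetical, not the actual stats schema.

```python
from collections import defaultdict


def merge_suffixed_members(rows):
    """Fold 'name;xx' rows into 'name' and add up their points.

    rows is an iterable of (member_name, points) pairs (assumed layout).
    """
    merged = defaultdict(int)
    for name, points in rows:
        # Keep only the text before the first ';' (assumed suffix marker).
        base, _, _ = name.partition(";")
        merged[base] += points
    return dict(merged)


if __name__ == "__main__":
    sample = [("alice", 100), ("alice;01", 50), ("bob;a", 7)]
    print(merge_suffixed_members(sample))
```

Names that legitimately contain a semicolon would be merged by mistake, so a real version would probably want a stricter rule for what counts as a suffix.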

DPAD
- make it work right :roll:

D2OL
- change hourly node update to run every 6 hours (the client only connects every 6 hours now). This will cut the delays encountered at the bottom of the hour from every hour to 4 times/day.

Other existing stats
- try to migrate some of the newer pages from the FaD stats to the existing stats pages

Ambitious? Probably, but I would like to get input and hope that everyone will participate with feedback

** Particularly what the next project should be **

Geoff
 

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
Actually, now that I've thought about it a bit more and chatted with a couple of people (including someone with a master's degree in statistics), the best way to do the overtake calc to make it immune to major fluctuations due to arrivals and departures is to take a larger period of time than a week (I've been told this in the past by members...), calculate the daily differences and toss the outliers (2-3 standard deviations from the mean), and then recalculate the mean without the outliers. A little more work, but I think it would be worth it.
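
The steps described here can be sketched in Python roughly as follows. This is only an illustration: the function names, the list-of-running-totals input, and the default cutoff of 2 standard deviations are assumptions, not the code behind the stats pages.

```python
from statistics import mean, pstdev


def daily_differences(totals):
    """Turn a list of running totals (one per day) into daily production."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def trimmed_daily_average(totals, k=2.0):
    """Mean daily production after dropping points more than k standard
    deviations from the mean (k = 2 is an assumed default)."""
    diffs = daily_differences(totals)
    if not diffs:
        raise ValueError("need at least two daily totals")
    mu = mean(diffs)
    sigma = pstdev(diffs)  # population standard deviation over the window
    kept = [d for d in diffs if abs(d - mu) <= k * sigma]
    # Fall back to the plain mean if the filter somehow removes everything.
    return mean(kept) if kept else mu


if __name__ == "__main__":
    # 20 ordinary days of 100 points plus one big arrival in the middle.
    diffs = [100] * 10 + [10000] + [100] * 10
    totals = [0]
    for d in diffs:
        totals.append(totals[-1] + d)
    print(trimmed_daily_average(totals))
```

With a short window, one huge day inflates the standard deviation enough that it may not count as an outlier, which is another reason to use a longer period than a week.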
 

BlackMountainCow

Diamond Member
May 28, 2003
5,759
0
0
Sounds more than fine to me! Ambitious? Yeah! If you want to accomplish all these tasks during that conference ... kudos!! :beer::beer:

I would have one addition to the milestones: I guess it would be a good thing if you also added milestones like "1,000,000 GFlops" and so on (2 million, 3 million). I don't know why the first official GFlops milestone is 5 million but I think that is just too high.
 

BadThad

Lifer
Feb 22, 2000
12,095
47
91
Man, I had some ideas for you....but I can't recall them. Going to have to look over the pages again to make suggestions.

Thanks for your hard work Geoff!
 

mrwizer

Senior member
Nov 7, 2004
671
0
0
A small one...but I like the cell phone stats on the BOINC projects. Any way to make another stats page with the same information but very simple HTML? Not sure how many would really use this, but for us stats freaks it is nice to be able to see progress when not at a computer.
 

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
mrwizer - if you can tell me the essential pieces of information you want, I can manage something... I do it for the TeAm daily stats that are posted on AT here:

http://fad.tastats.com/team_stats_ta.php

Note that the formatting codes in there are specific to TA... it's pretty easy to set up plain pages with no images/backgrounds/etc...
 

mrwizer

Senior member
Nov 7, 2004
671
0
0
I think it would be cool to have this page in a simple display in addition to the current one.

http://fad.tastats.com/show_nodes_1.php?userid=mrwizer&dayoffset=0

The main problem you face is fitting a lot of information in a little screen. But this may also be helpful to people with dialup???

To give you an idea of the BOINC pages, go here. I like the way that I can see my latest trickles. See what I mean? (for some reason the link will not work for me, although it works for my cell)

Thanks!
 

Wolfsraider

Diamond Member
Jan 27, 2002
8,305
0
76
Originally posted by: mrwizer
I think it would be cool to have this page in a simple display in addition to the current one.

http://fad.tastats.com/show_nodes_1.php?userid=mrwizer&dayoffset=0

The main problem you face is fitting a lot of information in a little screen. But this may also be helpful to people with dialup???

To give you an idea of the BOINC pages, go here. I like the way that I can see my latest trickles. See what I mean? (for some reason the link will not work for me, although it works for my cell)

Thanks!


XML news feed?
Taken from that link in dreamweaver.

<?xml version="1.0" encoding="iso-8859-1"?>
 

WizzardOfOzz

Junior Member
Jul 11, 2004
21
0
0
the best way to do the overtake calc to make it immune to major fluctuations due to arrivals and departures is to take a larger period of time than a week (I've been told this in the past by members...),

Let's say 20 million points joining your team would be a 10% increase. How long does it take for your team to produce that? Then multiply that length of time by 2 and you are still at 25% normalization and a 2-month average... For teams there is a way to calculate it much more precisely (if you keep the right data).

tt and I were debating the length of time to use. He was of the impression that a greater average would be best. I still feel that a 2 week average is NOT going to reflect new members' output, and it would take 2 weeks of no new joins or no new machines for it to be even close to accurate. I've chosen 5 days as an average (but I have an alternate data source for this). If the output was relatively consistent, then a greater average would work best, but it isn't, and people do join and leave teams. Just something for you to ponder..

For members, it's pretty much hopeless to provide anything but a ballpark: since a member can "split" their member and move partial points from member to member, there is no way to accurately reproduce what their output is.

Oh, another thing to add to your to do list, I'll try to explain this well.

When you click on a team that has negative production (a member split themselves) the only ones that get a Positive % for the daily output are the ones with negative output. hmm, don't know if that will read well.. perhaps

Team X Daily output = -500k points

Member y = -750k points (150% of team's production)
Member z = 250k points (-50% of team's production)

Since member z was working against the negative flow it gave him a negative output %.

I can see the math, it is correct, but it's not the way it should show. (at least IMHO)
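
The numbers in that example can be reproduced with a few lines of Python. The first function is just the division described above. The second is one possible alternative (an assumption, not something either stats site is known to do): dividing by the gross movement, i.e. the sum of absolute values, so the sign of each share follows the sign of the member's own output.

```python
def share_of_team_total(member_points, team_points):
    """Percent of the team's net daily output, as in the example above."""
    return 100.0 * member_points / team_points


def signed_share_of_gross(member_points, all_member_points):
    """Hypothetical alternative: percent of gross movement, keeping the
    member's own sign."""
    gross = sum(abs(p) for p in all_member_points)
    return 100.0 * member_points / gross if gross else 0.0


if __name__ == "__main__":
    members = {"y": -750_000, "z": 250_000}
    team = sum(members.values())  # -500_000
    for name, pts in members.items():
        print(name,
              share_of_team_total(pts, team),
              signed_share_of_gross(pts, members.values()))
```

With the alternative, member y shows a negative share and member z a positive one, which is closer to how the page "should" read.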

the joy of movable points, it never ends

Good luck with the mods.
 

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
Thanks guys!

WoO... I'm working through the math now of eliminating outliers in the data... I keep team_total and team_daily tables... both have date, points, and a bunch of other stuff... the daily file is the production for the day derived by the difference of today's total and yesterday's total. I'm thinking of eliminating from the average calculation any values outside one standard deviation... so the SQL would be something like...

select avg(points) as avgpts, stddev(points) as stddevpts
from team_daily
where date between x and y
and team=z

and then calculate a new team average eliminating the outliers...

select avg(points)
from team_daily
where date between x and y
and team=z
and points between (avgpts-stddevpts) and (avgpts+stddevpts)

For our team, over the last 30 days, that would eliminate 2 obvious outliers on Feb-27 and Mar-11

The negative thing is a whole different story! :roll:

Thanks for dropping by with your comments!
 

WizzardOfOzz

Junior Member
Jul 11, 2004
21
0
0
No Problem Geoff, I consider Our sites to be quite different, so I figure I help where I can..

I track a lot of extra information that up until this recent thing was useless, but I can get the raw production for a team, this excludes any movements etc..

If you want a better way, work with the Upload histories, ignore the standard CSV outputs they are tainted.. if you mark each uploaded file with the team# it was uploaded for, (you do this elsewhere, tho my method is probably different than yours) then you can do

SELECT Sum(Points) as Points FROM History WHERE (Date BETWEEN Today-5 AND Today) AND Team=z;

This will result in the raw produced points for that team for the last 5 days. If a member joins, then only their production from that point on will count. The points they brought will only close the gap by the relative difference..

Team a is calc'd to overtake Team b in 10 days.

team b is at 40mil points, team a is at 20 mil points, but a member joins with 10 mil points, that number is now calc'd @ 5 days. using any other method will result in 2 or less days. by far the best solution I could find.
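
The arithmetic in that example, as a small Python sketch. The per-day rates below are made up; the post only implies that team a closes on team b by about 2 mil points per day (20 mil gap, 10 days). The point is that with raw production the rates stay the same when points are moved in, and only the gap shrinks.

```python
def days_to_overtake(leader_total, chaser_total, leader_rate, chaser_rate):
    """Days until the chaser passes the leader at constant daily rates.

    Returns 0.0 if already ahead or level, None if the gap is not closing.
    """
    gap = leader_total - chaser_total
    if gap <= 0:
        return 0.0
    closing_rate = chaser_rate - leader_rate
    if closing_rate <= 0:
        return None
    return gap / closing_rate


if __name__ == "__main__":
    # Assumed raw production rates (points per day), for illustration only.
    team_b_rate, team_a_rate = 1_000_000, 3_000_000
    print(days_to_overtake(40_000_000, 20_000_000, team_b_rate, team_a_rate))
    # A member brings 10 mil existing points to team a: gap shrinks, rates don't.
    print(days_to_overtake(40_000_000, 30_000_000, team_b_rate, team_a_rate))
```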
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,284
3,905
75
I'd like a page that shows those averages (7-day, 30-day, outliers or no, whatever looks best) in a table. Yesterday's stats just don't give me a good impression of everyone's speeds, when jobs can take two or more days.
 

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
Originally posted by: Ken_g6
I'd like a page that shows those averages (7-day, 30-day, outliers or no, whatever looks best) in a table. Yesterday's stats just don't give me a good impression of everyone's speeds, when jobs can take two or more days.


The overall team stats page graphs them.... 14-day avg and overall avg...

http://fad.tastats.com/team_stats_history.php?Team=TeAm+Anandtech&t=AllTime

Would you like to see them here too?

http://fad.tastats.com/team_stats_history2.php?Team=TeAm+Anandtech
 

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
Added....

FAD - Run the jobs using GMT instead of EST to get in sync with the official stats!
 

JTWill

Senior member
Feb 2, 2005
327
0
0
Geoff, is there a way to integrate my stray into Rebel_Alliance? It still shows as RA 1. The info is right but it's still a stray.
 

JTWill

Senior member
Feb 2, 2005
327
0
0
I haven't received an answer to the problem yet, all info is correct and rechecked a dozen times.
 

Doc Brown

Senior member
Aug 28, 2004
627
0
0
Originally posted by: JTWill
Geoff, is there a way to integrate my stray into Rebel_Alliance? It still shows as RA 1. The info is right but it's still a stray.

You need to join the team (rebel_alliance)
 

JTWill

Senior member
Feb 2, 2005
327
0
0
Originally posted by: Doc Brown
Originally posted by: JTWill
Geoff, is there a way to integrate my stray into Rebel_Alliance? It still shows as RA 1. The info is right but it's still a stray.

You need to join the team (rebel_alliance)

I did, identical info all down the line, checked it a dozen times. Resaved it several times. BTW the PM wouldn't let me send a response.
 

Freewolf

Diamond Member
Feb 15, 2001
9,673
1
81
Originally posted by: Doc Brown
Originally posted by: JTWill
Geoff, is there a way to integrate my stray into Rebel_Alliance? It still shows as RA 1. The info is right but it's still a stray.

You need to join the team (rebel_alliance)


Hmmm When I do it I use Big R and Big A.
Rebel_Alliance not rebel_alliance
Don't know if it makes a difference.



 

Freewolf

Diamond Member
Feb 15, 2001
9,673
1
81
Originally posted by: GeoffS
So... Sunday morning I get on a plane to head to a conference in Texas returning late Wednesday night. I'm going to have a bit of time on my hands! Here's what I want to do, and feel free to add to this list!

Sure is going to be boring around here.
Sunday I load the family up in the car and drive eight hours or so to Disney World and we return home Friday.
 

mondobyte

Senior member
Jun 28, 2004
918
0
71
Geoff,

I see that I have 22 nodes ... and yes ... at one time or another they were all my nodes.

In reality, it is the terminology that is incorrect. I have had 22 member numbers during my tenure with FaD. For most folks, 1 member number = 1 node. A member number could always be assigned to a multiple processor computer which means that 1 member number = 2-3-4 ... nodes. Officially FaD terms nodes as instances of the client running. Each instance is a node. A multiprocessor computer could have 1-2-3-4 ... etc. nodes. A single HT processor usually runs 2 nodes.

Referring to member numbers, I only have 9? "ACTIVE" member numbers at present (i.e., ones that are active in receiving, processing and sending jobs). The remainder are "retired" or "inactive" due to re-installs or computers which have "quit" FaD for one reason or another, etc.

At minimum, I'd like to see the number be "Active Member Numbers", i.e., the computers that have reported jobs within some recent period (last 3-6 weeks ???) dunno ...

This will tell the number of actively crunching computers with "servers" but it will not really do anything for folks like me that have over a hundred computers crunching but only have 7-9 computers with servers.

Let's obfuscate the issue further. If I do a network FaD client running off a Queue Server and then install FaD as a stand-alone it gets the same member number as the Queue Server that it WAS running from. To complicate it further, let's suppose that this new computer becomes the nexus for a new cluster of clients as a Queue Server. It is technically possible that I might only have a single active member number and many computers that are running server using that single member number and they don't even need to be geographically close to one another.

Proposal: Show active member numbers ... but permit us to enter the number of "crunching" nodes (instances of FaD) for each member number (honor system) and show the sum of those entries (assume 1 if it is not entered through the node config screen).

I guess the real question is how do you represent someone like myself that makes any sense ...

For mondobyte:
22 nodes (member numbers) is bogus.
9 nodes (member numbers) understates my true power by more than an order of magnitude (which is not insignificant) --- I'd stack my 9 member numbers up against anyone else on the team with 9 member numbers and win every day of the week!!!

Your Node screen that shows processor ratings is interesting to me only if that member number consists of fairly uniform computers. i.e., the White Horde - Western Contingent consists of 30 PIII 550's, 2 PIII 450's, and 1 Celeron 633. On the other hand, Golden Horde consists of a mix of 19 instances of FaD running on processors ranging from a PIII-600 to an Athlon 3.0 GHz. White Horde - Toctamesh consists of 24 instances of FaD running on processors ranging from a Celeron 633 to a P4 3.2 HT.

Node Activity should technically be retermed CPU Ratings by Member Number or some such but not Node Activity. I'll leave the semantics to your discretion as I am not overly concerned with the name as long as it is recognized for what it is.

A rough approximation of the number of instances of FaD running might be arrived at by summing the number of jobs reported in a day that exceed a runtime of 24 hours. By default, this is the absolute minimum number of nodes that finished jobs in that day although it says nothing about a maximum limit. (One cannot use jobs running less than 24 hours in this presumption) ... One could also sum the time for all jobs running less than 24 hours (also add 24 hours for every job running 24 hours or longer) and divide by 24 (add 1 for any remainder) and get another minimum limit of the number of nodes. The larger of (> 24 hour reports) and (<24 hours/24) would give you a node count that could be valid but could also be understated.

So ... we have a realistic way of estimating the minimum number of nodes that must be crunching for a single member number in a day.
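
A quick Python sketch of that minimum-node estimate, assuming a list of job runtimes in hours for one member number on one day (the function name and input format are made up for illustration, and "24 hours or longer" is used as the cutoff for a long job).

```python
import math


def min_nodes(job_runtimes_hours):
    """Lower bound on the number of client instances that reported jobs
    in one day for a single member number."""
    # Bound 1: every job that ran 24 hours or longer needed its own node.
    long_jobs = sum(1 for h in job_runtimes_hours if h >= 24)
    # Bound 2: total node-hours in the day, counting long jobs as 24 hours
    # each, divided by 24 and rounded up for any remainder.
    total_hours = sum(min(h, 24) for h in job_runtimes_hours)
    by_time = math.ceil(total_hours / 24)
    # Take the larger of the two. Because long jobs are counted at 24 hours
    # in bound 2, it is never smaller than bound 1.
    return max(long_jobs, by_time)


if __name__ == "__main__":
    print(min_nodes([30, 26, 10, 10, 5]))
```

As noted above, this is only a floor: nodes whose jobs did not finish that day are invisible to it.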

Now to the more complex issue of the CPU Rating for those nodes. If one takes that sample used for the establishing minimum value and adds the individual CPU ratings of that sample together, that represents ... ostensibly the actual CPU Rating for that day ... so ... for example, you might see a CPU Rating of over 2,000 quite frequently for the Golden Horde. Now that would be an interesting statistic because it provides the minimum real CPU rating of each of my hordes. The average would also be meaningful to me too. Of course, you would need to add a total row! That would be much more interesting to me than what is shown right now ...

The User Node Listing is great just as it is except that I would like to be able to see all member numbers that I have ever used (and potentially name them) (and potentially, indicate that they are retired "more or less permanently"). (you could also artificially indicate "retired" status if nothing is reported in some time frame (see above)) ...

Perhaps a checkbox that permits me to expand the listing to show all instead of just those with current activity.

As I suggested before, append the team to the individual stats ... I hate searching to find out which member is on which team (like OF - OldGuy) ...

mondo




 