A program to count BOINC tasks

cecht
Joined: 7 Mar 18
Posts: 1516
Credit: 2850903014
RAC: 2023100

Well, "indiscriminant" is

Well, "indiscriminant" is such a harsh word; it's more like, "taken in order reported by boinccmd --get_old_tasks". But point well taken. My program bundles all BOINC output without regard to Project. I've tried to identify which Project is currently running for purposes of automating Project suspension and resumption, but with mixed success. The program structure just isn't well set up for that.

Your suggestion to analyze the job_log_einstein.phys.uwm.edu.txt file is worth pursuing; it has prompted me to start developing a new analysis and plotting program. My job_log file currently has 345794 entries dating back to April 20, 2019, so I'm going to need some new Python skills to handle large data arrays. How large, and how far back, do others' job_log files go?

I've identified the job log's data tags for each task's timestamp, CPU time, name (Project), and elapsed time. This would allow analysis of task times and counts, or their statistical metrics, over date-time intervals. So, what would folks like to see for a graphic or analytic data breakdown? What is the best way to represent or inspect useful and interesting data categories? Between Python's NumPy, Pandas, and Matplotlib packages, I think anything is possible. Whether my hobbyist programming skills are up to it, well... ¯\_(ツ)_/¯
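To give a concrete idea of what I'm aiming for, here is an untested sketch of reading the job_log into a Pandas DataFrame. The tag/value layout (epoch timestamp first, then tag/value pairs such as ct, nm, et) is just what I see in my own file, so treat the parsing details as assumptions rather than a spec:

import pandas as pd

def parse_job_log(path):
    """Read a BOINC job_log into a DataFrame: one row per reported task.
    Assumes lines of the form '<epoch> tag value tag value ...' with the
    task name under the 'nm' tag."""
    rows = []
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) < 3:
                continue
            rec = dict(zip(parts[1::2], parts[2::2]))  # tag/value pairs
            rec['time'] = int(parts[0])                # epoch seconds
            rows.append(rec)
    df = pd.DataFrame(rows)
    for col in ('ue', 'ct', 'fe', 'et'):
        if col in df:
            df[col] = pd.to_numeric(df[col], errors='coerce')
    return df

tasks = parse_job_log('job_log_einstein.phys.uwm.edu.txt')
print(len(tasks), 'tasks from', tasks['time'].min(), 'to', tasks['time'].max())

From a DataFrame like that, grouping by Project and computing interval statistics should be straightforward.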

I see two unknown job log data tags, ue and fe, whose values remain fairly constant over time. Does anybody know what they represent?

EDIT:  Ah! I missed Gary's reply. Got it! Thank you.

 

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45239572642
RAC: 63135272

Personally I think you should just grab all the data from the job log txt files. They are already being generated and written to by BOINC, and each task is recorded individually, so your program wouldn't even need to do any data collection, only analysis. The timed interval could be set to just grab the recent new data from the job logs rather than reported tasks. This would remove the need to have the program running constantly to collect data (unless the host is brand new with no data in the job log). Not only can you group project data based on which file the data came from, the name in the job log can even give you the ability to analyze the performance of different sub projects, or even different frequency ranges within the same subproject.
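Something like this (completely untested; the offset bookkeeping file name is made up for illustration) could pick up only the lines added since the last run:

OFFSET_FILE = 'job_log.offset'  # made-up name for the bookkeeping file

def read_new_lines(log_path):
    """Return only the job_log lines added since the last call."""
    try:
        offset = int(open(OFFSET_FILE).read())
    except (OSError, ValueError):
        offset = 0
    with open(log_path, 'rb') as fh:
        fh.seek(offset)                       # jump past what we already read
        new_lines = [line.decode() for line in fh]
        offset = fh.tell()
    with open(OFFSET_FILE, 'w') as fh:
        fh.write(str(offset))
    return new_lines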
 

You could give the option to plot all the raw data as-is, or to plot a sampled subset of the data (your interval selections: 10 min, 1 hr, etc.) with min/max/avg like you have now. Both would be useful, I think.
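Roughly like this with pandas (untested; it assumes a DataFrame of parsed job_log rows named tasks with an epoch 'time' column and an elapsed-time 'et' column, which are just my guesses at the column names):

import pandas as pd
import matplotlib.pyplot as plt

# 'tasks' = parsed job_log rows; the resample rule is the user's interval choice
et = tasks.set_index(pd.to_datetime(tasks['time'], unit='s'))['et']
binned = et.resample('1H').agg(['min', 'mean', 'max']).dropna()  # or '10min', '1D', ...

fig, ax = plt.subplots()
ax.plot(binned.index, binned['mean'], '.', markersize=3, label='avg')
ax.fill_between(binned.index, binned['min'], binned['max'], alpha=0.3, label='min/max')
ax.set_ylabel('task time (s)')
ax.legend()
plt.show()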
 

I also have about 350,000 entries (~40MB) in my Einstein job log for my 1-GPU system, and likely many, many more on my faster systems. One of my logs is nearly 200MB lol. You could probably handle that amount of data more easily by loading it all into a database (SQLite?) and querying that for making the plots.
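For example (again untested, and the database file, table, and column names are purely illustrative):

import sqlite3

def job_log_rows(path):
    """Yield (time, cpu_time, elapsed, name) tuples from a BOINC job_log,
    assuming '<epoch> tag value ...' lines with ct/et/nm tags."""
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) < 3:
                continue
            d = dict(zip(parts[1::2], parts[2::2]))
            if 'nm' not in d:
                continue
            yield int(parts[0]), float(d.get('ct', 0)), float(d.get('et', 0)), d['nm']

con = sqlite3.connect('job_log.db')
con.execute("CREATE TABLE IF NOT EXISTS tasks (time INTEGER, ct REAL, et REAL, name TEXT)")
con.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?)",
                job_log_rows('job_log_einstein.phys.uwm.edu.txt'))
con.commit()

# the plotting side can then query just the slice it needs, e.g. the last 90 days
recent = con.execute(
    "SELECT time, et FROM tasks "
    "WHERE time >= CAST(strftime('%s', 'now', '-90 days') AS INTEGER)").fetchall()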

_________________________________________________________________________

cecht
Joined: 7 Mar 18
Posts: 1516
Credit: 2850903014
RAC: 2023100

As an example of a slightly larger data representation, here is a plot of avg., max., and min. FGRP task times for 1 hr intervals over the past several months.
The plot has 2732 interval readings covering 37992 reported tasks.

Tasks were run at 3x on an RX 5600 XT. The uptick in average times at the end is when I switched Projects from FGRP to GW.

Areas of the plot can be zoomed in for a more granular inspection, but the point here is that there are limits to plotting lots of stuff. Given that typical job_logs will have 10 to 50 times more samples, I agree that options for users to limit sample sizes and time ranges are a must.

Any additional thoughts on data consolidation, or on representations other than scatter plots? Subproject analysis would be interesting, but I'll save that for after a program is up and running.

By the way, can anyone explain the ~2-week cyclic nature of the average completion times for gamma-ray binary pulsar tasks?

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45239572642
RAC: 63135272

cecht wrote:

By the way, can anyone explain the ~2-week cyclic nature of the average completion times for gamma-ray binary pulsar tasks?

This is where being able to compare the data by name could be helpful.

As a spot check, can you correlate these cycles to different frequency ranges?

_________________________________________________________________________

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5870
Credit: 116708784689
RAC: 36545235

cecht wrote:
By the way, can anyone explain the ~2-week cyclic nature of the average completion times for gamma-ray binary pulsar tasks?

It corresponds exactly with the ~2-week cycle for the life of each data file - eg LATeah3012L11.dat currently.

The first tasks in a given cycle always have the lowest frequency and these run more quickly than the higher frequency tasks at the end of the cycle.  You can see the frequency in the task name, eg. one of my current tasks is LATeah3012L11_764.0_0_0_0.0_16406964_0.  The frequency is 764Hz.  Later tasks will end up around 900 from past experience.

The final 8-digit parameter may also have a very minor effect when comparing at constant frequency.  The lowest values have only 6 digits.  The highest values reach well over 3nnnnnnn.  From rather limited (and very casual) observation, since the effect is quite minor, tasks with 6-digit values seem to run just slightly faster (maybe) than those with larger 8-digit values.  I certainly haven't looked at enough results to be sure about that, and there's enough 'noise' in result times to confuse the issue anyway.

Cheers,
Gary.

cecht
Joined: 7 Mar 18
Posts: 1516
Credit: 2850903014
RAC: 2023100

As mentioned, one feature will be to plot task metrics for a specified Project and its subcategories. I can read times and task names from the job_log file into a data array; Project and task information can then be derived from the task name, but I only have tasks from GW (h1_...) and FGRP (LATeah...) in my job_log.

Example: h1_0672.20_O3aC01Cl1In0__O3AS1a_672.50Hz_977_2
This name, for Gravitational Wave search O3 All-Sky #1, is underscore-delimited, allowing various parameters, like frequencies, to be parsed and analyzed.
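Something along these lines (untested, and only covering the two name styles I actually have) is what I have in mind for pulling a search frequency out of each task name:

def name_frequency(name):
    """Return (project, frequency in Hz) guessed from an Einstein task name.
    Only handles the GW (h1_...) and FGRP (LATeah...) styles in my job_log."""
    fields = name.split('_')
    if name.startswith('h1_'):
        hz = next((f for f in fields if f.endswith('Hz')), None)  # e.g. '672.50Hz'
        return 'GW', float(hz[:-2]) if hz else None
    if name.startswith('LATeah'):
        return 'FGRP', float(fields[1])  # e.g. '764.0' in LATeah3012L11_764.0_...
    return 'other', None

print(name_frequency('h1_0672.20_O3aC01Cl1In0__O3AS1a_672.50Hz_977_2'))  # ('GW', 672.5)
print(name_frequency('LATeah3012L11_764.0_0_0_0.0_16406964_0'))          # ('FGRP', 764.0)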

What are examples of task names for Projects other than GW and FGRP that can be expected in job_log files?

It seems that there is a fixed time span of records in the job_log. At least that's true of mine, because the earliest recorded timestamp (epoch time) has advanced over the past few days. That span seems to be ~1147 days, or ~99100800 seconds. If this is true for everyone, then there would be no need to consider Projects or task name structures that came and went more than 1147 days ago.

Thanks!

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45239572642
RAC: 63135272

The BRP4 tasks have a name that looks like this:

p2030.20180721.G56.62-02.96.N.b0s0g0.00000_416_0

 

Though I don't know what the different sections necessarily mean.

_________________________________________________________________________

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5870
Credit: 116708784689
RAC: 36545235

cecht wrote:
It seems that there is a fixed time span of records in the job_log. At least that's true of mine, because the earliest recorded timestamp (epoch time) has advanced over the past few days. That span seems to be ~1147 days, or ~99100800 seconds.

Are you sure about that?  My experience (admittedly just casual observation) has been that the file just continues to grow, unless disturbed - eg. a full reinstall would wipe it unless the BOINC tree was specifically saved.

I took a guess at what might be my longest running machine without a disturbance that might have cleared the BOINC directory.   The oldest record had 1440672822.  The latest record had 1654890674.  The difference represents ~2479 days.

I don't think anybody would be nutty enough to have extremely old data that they would be interested in processing :-).

A bigger issue might be how you handle different app versions processing the same type of tasks in quite different times - eg. Petri's new app.  There's no app version information in the job_log file.  I guess a user would know when they started using a different app and would therefore understand step changes that showed up at that time.

Cheers,
Gary.

cecht
Joined: 7 Mar 18
Posts: 1516
Credit: 2850903014
RAC: 2023100

Gary Roberts wrote:
Are you sure about that?

Ummm, no.

Gary Roberts wrote:
A bigger issue might be how you handle different app versions processing the same type of tasks in quite different times - eg. Petri's new app.  There's no app version information in the job_log file.  I guess a user would know when they started using a different app and would therefore understand step changes that showed up at that time.

Good to keep in mind. Thanks.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

cecht
Joined: 7 Mar 18
Posts: 1516
Credit: 2850903014
RAC: 2023100

This is just a tooting-my-own-horn progress update on a program for plotting data from job_log_einstein.phys.uwm.edu.txt. The basic functions are written to plot task times or search frequencies by E@H Project. Once it's in a form fit for public consumption, I will post it as a new repository on GitHub.

Here is a screenshot of GW and FGRP frequencies for ~350,000 tasks run over the past ~3 years:

And here is a zoomed-in portion covering ~3.5 months of just FGRP frequencies:

Ideas are not fixed, nor should they be; we live in model-dependent reality.
