Is there any "user-tasks" statistics API available?

Marcin Szydłowski 1984
Marcin Szydłows...
Joined: 6 Jan 22
Posts: 11
Credit: 1958700
RAC: 0
Topic 227482

Hello,

I'm curious whether there is any E@H statistics API for a user's tasks. For example, I want to know how my computer processes different tasks compared to other hosts, and whether it is efficient or not. Based on that I could configure my computing preferences better. I know that this page displays the main numbers for the entire project, and every user has access to their tasks on their account page, but it's cumbersome to open every workunit's link to check the run/CPU time or the status.

I have written a simple Python app which checks my account page and collects statistics about the execution of all my tasks. For example, here are the run-time statistics of my tasks for two projects, BRP and FGRP5.

Legend:

  • green dots - my tasks, completed and validated,
  • red big dots - my tasks, validation pending,
  • black big dot at the left of FGRP5 - my task, aborted,
  • cyan pluses - replicated by other hosts, completed and validated,
  • black pluses - replicated by other hosts, errors (aborted, timeout, computing/validating errors, completed but too late, etc.)

Each marker can be hovered over and clicked to open the workunit in the browser and check its details.

Based on this data I saw that in my case it was better to stop BRP, because other hosts do it much faster (two cyan pluses with a mean time of about 5 min on 7 and 8 April) than my Mac (mean time of about 40 min), so I can concentrate on FGRP5 only.

I store the statistics locally, so tasks that have already been visited don't trigger opening the workunits' pages again. The downside is that every time I run the script, it still has to follow every pagination link of my task list, download the HTML body of each page, and parse it to get the task info.

So my question is: is there any way to do this more efficiently, without parsing the bare HTML? Or is there perhaps a link to an existing, better solution?
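To make the local caching described above concrete, here is a minimal sketch. It is not the actual script: it assumes the statistics are kept in a JSON file keyed by task ID, and `fetch_task` is a hypothetical callable standing in for whatever downloads and parses one workunit page.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable


def load_cache(path: Path) -> Dict[str, dict]:
    """Return previously stored task stats, or an empty dict if none exist."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def update_cache(
    task_ids: Iterable[int],
    cache: Dict[str, dict],
    fetch_task: Callable[[int], dict],  # hypothetical page downloader/parser
) -> int:
    """Fetch only tasks missing from the cache; return how many were fetched."""
    fetched = 0
    for task_id in task_ids:
        key = str(task_id)  # JSON object keys are always strings
        if key in cache:
            continue  # already visited, so no page request is needed
        cache[key] = fetch_task(task_id)
        fetched += 1
    return fetched


def save_cache(path: Path, cache: Dict[str, dict]) -> None:
    """Write the cache back to disk as JSON."""
    path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
```

A cache like this only saves the per-workunit requests; the paginated task list still has to be walked to discover the task IDs. Tasks whose validation is still pending would probably need to be re-fetched rather than skipped, since their status can change.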

Marcin Szydłowski 1984
Marcin Szydłows...
Joined: 6 Jan 22
Posts: 11
Credit: 1958700
RAC: 0

I created a website for displaying server and host statistics in the form of charts. The page is available here. It looks best on laptop or tablet screens. Please let me know if it is worth developing further. A contact email address is available via the "Contact" link in the bottom right corner of the page.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 5002
Credit: 18872412356
RAC: 6157623

Just an FYI, but none of your links and graphs are visible to anyone but you.

Doubt you'll get any traction for this thread.

I don't know of any API like the one you are looking for.

The only solution is to scrape the HTML, which you are already doing.

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 4028
Credit: 47820905479
RAC: 39669907

Also, if you are just looking for your own data, there is a job log text file generated for each project. It contains run times and completion dates/times. You can get this data from each host and then compile it together, but that would require software running on the host to collect the data, or grabbing the files manually.
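A rough sketch of reading such a job log in Python follows. It assumes the layout I believe the BOINC client uses: one line per finished task, starting with a Unix timestamp and followed by key/value pairs, where `ct` is CPU time, `et` is elapsed time, `nm` is the task name and `es` is the exit status. The file name, the sample values and the function names are made up for illustration, so check them against a real log before relying on this.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Iterator, Union

Value = Union[str, float, datetime]


def parse_job_log_line(line: str) -> Dict[str, Value]:
    """Parse one job log line into a dict (assumed 'timestamp key value ...' layout)."""
    parts = line.split()
    record: Dict[str, Value] = {
        "completed": datetime.fromtimestamp(int(parts[0]), tz=timezone.utc)
    }
    # The remaining tokens are assumed to come in key/value pairs.
    for key, value in zip(parts[1::2], parts[2::2]):
        record[key] = value if key == "nm" else float(value)
    return record


def read_job_log(path: Path) -> Iterator[Dict[str, Value]]:
    """Yield one parsed record per non-empty line of a job log file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_job_log_line(line)


# Made-up sample line, not taken from a real host.
sample = "1649376000 ue 4500.0 ct 2380.5 fe 105000000000000 nm example_task_0 et 2400.25 es 0"
record = parse_job_log_line(sample)
print(record["nm"], record["et"], record["completed"].date())
```

Since the log lives on each host, this covers only your own tasks; it says nothing about how other hosts did on the same workunits.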

_________________________________________________________________________

Marcin Szydłowski 1984
Marcin Szydłows...
Joined: 6 Jan 22
Posts: 11
Credit: 1958700
RAC: 0

Ok... I'm back again and now I see that the attached link isn't visible so people cannot check it.

Here's the link once again: https://einsteinathomestats.org

I hope it works now.

Keith Myers
Keith Myers
Joined: 11 Feb 11
Posts: 5002
Credit: 18872412356
RAC: 6157623

Marcin Szydłowski 1984 wrote:

Ok... I'm back again and now I see that the attached link isn't visible so people cannot check it.

Here's the link once again: https://einsteinathomestats.org

I hope it works now.

Nope.

 

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 4028
Credit: 47820905479
RAC: 39669907

works for me, but pulling stats is painfully slow.

pulling ~52,000 tasks from my host is estimating that it'll take about 20 hours lol.

_________________________________________________________________________

Marcin Szydłowski 1984
Marcin Szydłows...
Joined: 6 Jan 22
Posts: 11
Credit: 1958700
RAC: 0

Ian&Steve C. wrote:

works for me, but pulling stats is painfully slow.

pulling ~52,000 tasks from my host is estimating that it'll take about 20 hours lol.

That's why I asked about an API at the beginning of this topic: as Keith said, the approach relies on visiting the task pages for a given host. In your case ~52k tasks (without fetching related tasks) means 2600 page visits (20 tasks per page). That is acceptable for small hosts (~1k tasks per host), but I have no good idea how to do it better in other cases.
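The arithmetic behind those numbers, as a small sketch. The function names are just for illustration; the 20 tasks per page and the 20 hour estimate are the figures quoted above.

```python
import math


def pages_needed(task_count: int, tasks_per_page: int = 20) -> int:
    """Number of task-list pages to visit for a host with task_count tasks."""
    return math.ceil(task_count / tasks_per_page)


def seconds_per_page(total_hours: float, pages: int) -> float:
    """Average time spent per page if the whole pull takes total_hours."""
    return total_hours * 3600 / pages


pages = pages_needed(52_000)          # 2600 pages
avg = seconds_per_page(20, pages)     # roughly 27.7 seconds per page
print(pages, round(avg, 1))
```

So a 20 hour estimate for ~52k tasks works out to nearly half a minute per page, which suggests the per-page cost matters at least as much as the page count.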

The site is hosted on AWS and is available only in the Europe zone so far. I'll try to improve performance next week, and if the problems are gone I'll also try to make it available everywhere.

Marcin Szydłowski 1984
Marcin Szydłows...
Joined: 6 Jan 22
Posts: 11
Credit: 1958700
RAC: 0

Some changes:

  • Getting tasks is about 10 times faster, but for more than 50k tasks it still takes about 20 to 30 minutes (results are cached, so subsequent requests will be much faster).
  • The site is hosted in the Europe zone but should be available everywhere. "Bad request" errors may occur a few times, but after a minute or two the site should be available.
