BRP6 times

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1622876913
RAC: 790457
Topic 198026

Has there been a recent change in this app? The reason I ask is when running 3 at a time on my GTX660 I was regularly seeing run times of 5h 15m and in the last couple of days it has gone up to 5h 30m. There have been no changes that I know of on the computer.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 584794775
RAC: 144748

No change in the standard 1.39 app. Einstein task completion times always depend somewhat on data and one may get mostly this or that batch due to the locality scheduling. Your observation could stem from other things, though. Did you recently upgrade the driver? Maybe the machine needs a reboot?

Besides: there has been a significant change to the app, providing a "nice" to "significant" performance boost, depending on your system. It's still in beta but works very well. You can enable it by allowing beta tasks in your profile. After that, newly downloaded tasks should be for application 1.52 instead of 1.39.

MrS

Scanning for our furry friends since Jan 2002

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1622876913
RAC: 790457

Thanx for the reply, the machine was rebooted a couple of days ago and I've been using the same driver for a couple of years, so for now I'll attribute it to the data. I will allow the beta and see if I can crunch more.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7310141689
RAC: 2310532

While data dependency is certainly an issue, even a major issue, for some Einstein applications, my own observation is of quite small data dependency for BRP6 work processed by the 1.39 application.

I think that for 1.39, a WU showing a longer elapsed time (ET) than is common on that host has probably been held back on the CPU support task, possibly because by bad luck it shared a core for some time with something else, which impaired the latency of response to requests. If multiple work units over an extended period collectively take more average ET than over another extended period, I think for 1.39 the places to look are:

1. the operating parameters of the GPU (core and memory clock rate, temperature limitation...)
2. the general congestion of CPU and I/O operations on the host (in other words, what else is it doing besides supporting 1.39 GPU work?)

If you like puzzles, you can chew away on these or other possibilities. If you like output, I'd suggest enabling beta applications for Einstein, and enjoying the considerable Einstein throughput increase that will give you at the moment.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1622876913
RAC: 790457

The core temps are around 64 °C, as they always have been, and the times I quoted were from overnight, when nothing but crunching is going on. During the day, when I use the machine, the times have always gone up, as you suggested. I do agree that the BRP times have always been consistent for me, which is why I noted the increase in run time.
As for beta I have downloaded 1 BRP6 already and I look forward to the increase in throughput.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118656267179
RAC: 18998796

Quote:
Has there been a recent change in this app? The reason I ask is when running 3 at a time on my GTX660 I was regularly seeing run times of 5h 15m and in the last couple of days it has gone up to 5h 30m. There have been no changes that I know of on the computer.


Thanks for asking this question - it's caused me to do quite a bit of thinking.

I realise you already have your answer but when I woke up this morning and read the overnight postings, my first thought was that surely everyone already realised there had been a whole saga of changes with the BRP5 -> BRP6 -> BRP6-beta transition, simply because of the sheer volume of messages that had been flying back and forth. That was quickly followed by the second thought that all this activity had been occurring in highly technical forums and nothing had been noted in 'General News'. So it's quite natural that the majority of 'normal' volunteers may be quite unaware of all these changes.

That led me to ponder about how the average volunteer could best investigate anything that seemed to be changing on a particular host when there was no standard news about a change that would explain the observation. Of course, you could tell the volunteer to go check the technical news or other technical boards but I can understand that many people might not have the time or might find the prospect a bit daunting. Perhaps the best solution would be a short 'closed' thread in General News announcing the change and linking to the 'details' thread in Technical News, etc.

The other thought that occurred to me is that it's quite easy to convince yourself that something must have changed when it really hasn't - at least at the project end. I know I've been guilty of doing just that, quite a few times. If you think there has been a change, there are at least four things you should consider (and check) in working out what to 'blame' or how to account for the change.

1. Has there been a change in the app and/or data? If there had been, there would likely be a new version number and/or task naming convention. In the absence of any specific news announcement, a quick look through the tasks list for your host on the website should reveal such changes.

2. Has anything changed in the hardware/software configuration on your computer? Things like OS updates, changed driver versions, security software, installed apps, etc, as well as the obvious things like physical hardware and environmental factors.

3. Has your machine been rebooted recently and are you sure that it is still running with the same settings/BIOS options as previously? Simple things like a change in CPU/GPU frequency (e.g. from thermal throttling) could be changing things.

4. Have you looked at a big enough sample size to really be sure there is a change? I know from personal experience that a 'quick look at the numbers' can easily convince you that something has changed when it really hasn't. Even if you take the trouble to browse through all the results available in the on-line database, you can still fool yourself into seeing a change that's not really there. The biggest problem here is that results expire quite quickly these days and so a very short term change over a few days might just be part of a long term cycle that you can't see anymore.

This last factor has a simple solution - it's the job_log_einstein.phys.uwm.edu.txt file which exists in your BOINC Data directory. It doesn't expire quickly so can contain thousands of results. Here are a few lines from one of mine:

1427350896 ue 22870.594655 ct 1534.430000 fe 590000000000000 nm PM0011_05071_300_1 et 28261.906620 es 0
1427356555 ue 60851.364031 ct 35468.760000 fe 105000000000000 nm LATeah0109E_400.0_2268_0.0_0 et 35705.626589 es 0
1427361223 ue 27416.553087 ct 15524.700000 fe 47307700000000 nm LATeah0109E_80.0_130_-8.21e-10_0 et 15617.178513 es 0
1427362016 ue 22870.594655 ct 1509.230000 fe 590000000000000 nm PM0011_05071_302_1 et 28076.950765 es 0
1427370571 ue 22870.594655 ct 1516.506000 fe 590000000000000 nm PM0011_05221_84_0 et 27927.323731 es 0
1427378706 ue 22870.594655 ct 1530.090000 fe 590000000000000 nm PM0011_051B1_360_0 et 27809.746032 es 0
1427389881 ue 22870.594655 ct 1527.316000 fe 590000000000000 nm PM0011_051B1_366_0 et 27864.332947 es 0
1427390137 ue 60851.364031 ct 33339.750000 fe 105000000000000 nm LATeah0110E_80.0_108_-2.08e-10_1 et 33580.924319 es 0
1427394441 ue 60851.364031 ct 33042.590000 fe 105000000000000 nm LATeah0111E_48.0_26_-7.33e-10_0 et 33217.250708 es 0
1427398424 ue 22870.594655 ct 1538.913000 fe 590000000000000 nm PM0011_051B1_90_2 et 27851.984694 es 0
1427406661 ue 22870.594655 ct 1517.891000 fe 590000000000000 nm PM0011_053D1_40_1 et 27953.407901 es 0
1427417939 ue 22870.594655 ct 1535.926000 fe 590000000000000 nm PM0011_05361_352_2 et 28056.844675 es 0

It's a simple text file with space-separated fields. The first field is a date stamp (Unix format) for when the result was returned. The 'ct' field is the CPU time component and the 'et' field is the elapsed time. For the other fields, the two-letter tags are ue=elapsed time estimate, fe=flops estimate, nm=actual task name, es=?? - I don't know this one.
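
For anyone who would rather script this, the layout described above (a timestamp followed by alternating tag/value pairs) can be read with a few lines of Python. This is only a minimal sketch: the function name and the choice to keep 'nm' as text and everything else as a number are my own assumptions, not anything defined by BOINC.

```python
from datetime import datetime, timezone

def parse_job_log_line(line):
    """Split one job_log line into a dict keyed by its two-letter tags."""
    parts = line.split()
    record = {"timestamp": int(parts[0])}
    # After the leading timestamp, the fields alternate: tag, value, tag, value...
    for tag, value in zip(parts[1::2], parts[2::2]):
        # Assumption: only the task name ('nm') is text; the rest are numeric.
        record[tag] = value if tag == "nm" else float(value)
    return record

sample = ("1427350896 ue 22870.594655 ct 1534.430000 fe 590000000000000 "
          "nm PM0011_05071_300_1 et 28261.906620 es 0")
rec = parse_job_log_line(sample)
# Convert the Unix date stamp to a readable UTC time.
when = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
print(when.isoformat(), rec["nm"], rec["ct"], rec["et"])
```

Because the fields are looked up by tag rather than by position, the varying width of the task names no longer matters.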

Browsing this file is quite painful because tasks from different runs are all mixed together and the task name fields in particular are of different widths so the numbers are not in nice lined up columns. So a little bit of pre-processing with freely available command line utilities soon fixes that.

It's pretty straightforward to select on the task name field to get just those lines for a particular run. This can be further processed to select the 'ct' and 'et' fields only so it's easy to see if anything is changing. The following are the times for the BRP6 tasks only from the above list:

1534.430000 28261.906620
1509.230000 28076.950765
1516.506000 27927.323731
1530.090000 27809.746032
1527.316000 27864.332947
1538.913000 27851.984694
1517.891000 27953.407901
1535.926000 28056.844675
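
The selection above can also be sketched in Python instead of grep and cut. This is a hedged example, not the exact commands used to produce the list: the function name is hypothetical, and matching on the "PM0011" prefix is an assumption based only on the sample lines shown, so other batches may need a different prefix.

```python
def select_times(lines, name_prefix):
    """Return (ct, et) pairs for tasks whose name starts with name_prefix."""
    rows = []
    for line in lines:
        fields = line.split()
        # Tags and values alternate after the leading timestamp.
        tagged = dict(zip(fields[1::2], fields[2::2]))
        if tagged.get("nm", "").startswith(name_prefix):
            rows.append((float(tagged["ct"]), float(tagged["et"])))
    return rows

# Three lines copied from the listing above; for real use, read them from
# the job_log file in the BOINC Data directory instead.
sample_lines = [
    "1427350896 ue 22870.594655 ct 1534.430000 fe 590000000000000 "
    "nm PM0011_05071_300_1 et 28261.906620 es 0",
    "1427356555 ue 60851.364031 ct 35468.760000 fe 105000000000000 "
    "nm LATeah0109E_400.0_2268_0.0_0 et 35705.626589 es 0",
    "1427362016 ue 22870.594655 ct 1509.230000 fe 590000000000000 "
    "nm PM0011_05071_302_1 et 28076.950765 es 0",
]

brp6_rows = select_times(sample_lines, "PM0011")
for ct, et in brp6_rows:
    print(f"{ct:.6f} {et:.6f}")
```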

So, for not much effort, you could look at a few hundred of the most recent results and really see if anything is changing - just by manual inspection. If the data was imported into a spreadsheet, a very accurate picture would be available.
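
If eyeballing the columns isn't convincing enough, one simple check is to compare the average elapsed time of the most recent results with the average of the earlier ones. The sketch below is just one way to do that, under my own assumptions: the function name and the split into "recent" and "older" are hypothetical, and the eight values are the 'et' figures listed above, which is far too small a sample to prove anything.

```python
from statistics import mean

def mean_shift(elapsed_times, recent_count):
    """Compare the mean of the newest recent_count values with the mean of the rest."""
    if not 0 < recent_count < len(elapsed_times):
        raise ValueError("recent_count must leave results on both sides")
    older = elapsed_times[:-recent_count]
    recent = elapsed_times[-recent_count:]
    older_mean = mean(older)
    recent_mean = mean(recent)
    # Positive percent means the recent tasks took longer on average.
    percent = 100.0 * (recent_mean - older_mean) / older_mean
    return older_mean, recent_mean, percent

# Elapsed times in log order (oldest first), taken from the list above.
et_values = [28261.906620, 28076.950765, 27927.323731, 27809.746032,
             27864.332947, 27851.984694, 27953.407901, 28056.844675]

older_mean, recent_mean, percent = mean_shift(et_values, recent_count=4)
print(f"older {older_mean:.0f} s, recent {recent_mean:.0f} s, change {percent:+.2f}%")
```

With a few hundred results instead of eight, a genuine slowdown of several percent should stand out clearly from the normal scatter.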

I know this sounds terribly complicated but it's really not. In Linux or OS X, it's trivial because the appropriate utilities (eg grep, cut, sed, etc) come as standard utilities. In Windows (when I last used it several years ago) there were freeware versions of these utilities that ran well on XP. I imagine those utilities would still be available, even if Windows doesn't have its own methods of easily massaging text files - I wouldn't have a clue about that. Someone with Windows experience might like to comment.

I wrote all the above in the hope that people, even only marginally interested in performance, might be encouraged to use the job_log file as a resource for information about what's happening on their host. The biggest hurdle is the mental one of believing it's too hard and too complicated to even give it a try.

Cheers,
Gary.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3568297050
RAC: 465366

If you have Excel, the Text to Columns feature (on the data tab of the ribbon) will quickly turn the file into a nice set of columnar data to play with. Due to variable WU name lengths, you'll want to ignore the default of fixed width and pick delimited, and then space as a delimiter to fully break it out.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2991169631
RAC: 701643

I've used Microsoft Access - the database component of MS Office Professional - to work on the job_log files in Windows. First small problem - the files have *nix line endings, and MS Access likes Windows for text import - but that's easily sorted: open a copy of the files with WordPad, re-save and close. You can then set up (and re-use) a text import specification for the space-delimited files, and even script the whole import process via a macro.

Once the log is properly organised in a data table, all the standard SQL operations can be used to knock it into shape. I use the technique to generate the SETI data distribution tables - the final step uses Excel to populate and colour the cells, and the Windows 7 snipping tool to screen-grab the area needed.
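
For those without MS Office, the same idea can be tried with SQLite, which ships with Python. To be clear, this is a swapped-in substitute and not the Access setup described above: the table name, the column names and the "batch" grouping (the part of the task name before the first underscore) are all illustrative assumptions.

```python
import sqlite3

# Three lines in the job_log format; for real use, read them from the file.
lines = [
    "1427350896 ue 22870.594655 ct 1534.430000 fe 590000000000000 "
    "nm PM0011_05071_300_1 et 28261.906620 es 0",
    "1427356555 ue 60851.364031 ct 35468.760000 fe 105000000000000 "
    "nm LATeah0109E_400.0_2268_0.0_0 et 35705.626589 es 0",
    "1427362016 ue 22870.594655 ct 1509.230000 fe 590000000000000 "
    "nm PM0011_05071_302_1 et 28076.950765 es 0",
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE job_log (returned INTEGER, name TEXT, cpu_time REAL, elapsed REAL)"
)
for line in lines:
    fields = line.split()
    tagged = dict(zip(fields[1::2], fields[2::2]))  # tag -> value
    conn.execute(
        "INSERT INTO job_log VALUES (?, ?, ?, ?)",
        (int(fields[0]), tagged["nm"], float(tagged["ct"]), float(tagged["et"])),
    )

# Count the tasks and average the elapsed time per batch.
query = """
    SELECT substr(name, 1, instr(name, '_') - 1) AS batch,
           COUNT(*) AS tasks,
           AVG(elapsed) AS avg_elapsed
    FROM job_log
    GROUP BY batch
    ORDER BY batch
"""
summary = conn.execute(query).fetchall()
for batch, tasks, avg_elapsed in summary:
    print(batch, tasks, round(avg_elapsed, 1))
```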

On the few occasions I've tried it, OpenOffice (or LibreOffice) seemed to be as fully featured as the MS equivalents, so maybe that could be used as well.
