Two BRP6 1.50 running now on
Two BRP6 1.50 tasks are now running on host ID 7163667 (in the past they crashed immediately after starting).
Expecting a remarkably shorter processing time (if they ever come to an end ...)
I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]
Six BRP6 1.50 (two GPUs each
Six BRP6 1.50 tasks (two GPUs, 3 tasks each) are now running on host ID 4546148 (in the past this resulted in 0% GPU usage).
GPU usage is now 98% - GPU temperature is high but OK and stable.
These ones could also finish earlier (the Remaining Time (estimated) shown is as far off as always).
I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]
A BRP6 1.50, Task 487746125,
A BRP6 1.50 task, Task 487746125, finished on a GTX 980 on 64-bit Ubuntu. Fast run. Seems OK. Bent :-)
Too good to be true ? Two
Too good to be true?
Two BRP6 1.50 tasks finished on host ID 7163667 (Win7 / i7 / 2× GTX 670).
Example:
BRP6 1.39 - CPU 02:38:45 / GPU 07:46:56
BRP6 1.50 - CPU 00:19:10 / GPU 03:34:41
Can't get new BRP6 1.50 tasks ...
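For what it's worth, the speed-up implied by that one example pair works out to roughly 2.2x in run time and more than 8x in CPU time. A quick sketch of the arithmetic (the numbers are copied from the example above, the "GPU" values are taken to be elapsed run times, and a single pair of tasks says little about the long-run average):

```python
# Speed-up implied by the example tasks quoted above (values copied from the post;
# the speed-up is data-dependent, so other workunits will differ).

def hms_to_seconds(t):
    """Convert an 'HH:MM:SS' string to seconds."""
    h, m, s = (int(x) for x in t.split(":"))
    return h * 3600 + m * 60 + s

cpu_139, run_139 = hms_to_seconds("02:38:45"), hms_to_seconds("07:46:56")
cpu_150, run_150 = hms_to_seconds("00:19:10"), hms_to_seconds("03:34:41")

print(f"run-time speed-up: {run_139 / run_150:.2f}x")  # ~2.2x
print(f"CPU-time ratio:    {cpu_139 / cpu_150:.2f}x")   # ~8.3x
```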
I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]
RE: Too good to be true
I'd say these results are probably better than what you would expect in the long run, averaging over more workunits (as the speed-up is data-dependent).
HBE
RE: Too good to be true
To extend Bikeman's comment on the data-dependent variability of both elapsed time and CPU time for the current beta application: if you are running more than one task on a GPU at a time, I believe you'll find that in a mismatched pair (say, a "fortunate" beta unit with a particularly unfortunate beta unit, or a fortunate beta unit with a 1.39 unit), the advantaged member of the pair will get more than half of the GPU resource. That gives an elapsed time result much better than would be seen were two units of the same degree of good fortune running simultaneously.
The mechanism, I suspect, is that each time the task currently using the GPU needs to wait for CPU service, it gives the GPU back to the other task. If the fortunate unit requests such service less frequently, then this switching will be unbalanced.
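As a toy illustration of that suspected hand-off (this is not the real application's scheduling code, and the burst lengths below are invented numbers): if the two tasks' GPU bursts simply alternate, each task's share of the GPU ends up proportional to how long it can run between CPU-service requests, so the fortunate unit gets well over half.

```python
# Toy model of two tasks sharing one GPU under the suspected hand-off mechanism:
# the task holding the GPU runs until it needs CPU service, then yields to the
# other task. With strictly alternating bursts, GPU time splits in proportion
# to burst length. The burst lengths here are invented purely for illustration.

def gpu_share(burst_a, burst_b):
    """Fraction of GPU time task A gets when the two tasks' bursts alternate."""
    return burst_a / (burst_a + burst_b)

fortunate_burst = 10.0   # seconds of GPU work between CPU-service requests (made up)
unfortunate_burst = 2.0  # the unfortunate unit asks for CPU service far more often

share = gpu_share(fortunate_burst, unfortunate_burst)
print(f"fortunate unit's GPU share: {share:.0%}")  # ~83%, well above an even 50/50 split
```

Under that toy model the advantaged task's elapsed time improves roughly in proportion to its extra GPU share, which is the mismatched-pair effect described above.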
But it sure is pleasant to see the remarkably short elapsed times reported from fortunate units on the current beta. I have a moderately overclocked GTX 970 which has been running Parkes 1.39 at 3X with elapsed times of about 4:26:00 and charged CPU times of about 1:43:00. In a first trial with 1.50 beta work, it got two fortunate units plus an unfortunate unit, running simultaneously. The two fortunate units took 2:34:07 elapsed time and were charged 0:20:30 CPU time. The unfortunate unit is not yet finished, and is finishing paired with a mix of 1.39 and 1.50 work, but it looks like taking about four and a half hours elapsed and over 3:00:00 of CPU. As these are my only samples of 1.50 beta on this particular host, I can't comment on how near the poles of good and bad fortune these particular units are.
I don't mean to imply there are just two flavors--the degree of good or bad fortune appears to have rather fine-grained gradation among the few dozen units I have observed so far on another host which was able to handle the 1.47 beta.
[edit: for simplicity I wrote as though one were running specifically 2X, but the same basic issues of course apply at higher multiples also]
RE: To extend Bikeman's
Thanks for trying to explain.
You speak of "mismatched pairs", of "fortunate" and "unfortunate" beta (work)units, of "good or bad fortune", and of "fine-grained gradation". You suppose an unbalanced switching process between CPU and GPU ...
Maybe I would better understand what happens if I could get more information about the mentioned data dependency.
I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]
The fast units seem to cause
The fast units seem to cause considerably more CPU load, or is it just the first one that I have?
The fast one is using 20% of a core, the normal one 3%.
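In case anyone else wants to check this, one rough way to read per-task CPU usage is with Python's psutil package; the process-name filter below is a guess on my part, so adjust it to whatever the BRP6 executables are actually called on your machine.

```python
# Rough per-process CPU usage for running Einstein@Home tasks.
# Needs the third-party psutil package (pip install psutil).
import time
import psutil

NAME_FILTER = "einsteinbinary"  # assumed substring of the app's process name; adjust as needed

tasks = [p for p in psutil.process_iter(["name"])
         if NAME_FILTER in (p.info["name"] or "").lower()]

for p in tasks:
    p.cpu_percent(None)   # prime the per-process CPU counters
time.sleep(10)            # measure over a 10-second window
for p in tasks:
    # 100% corresponds to one fully used core
    print(p.pid, p.info["name"], f"{p.cpu_percent(None):.1f}% of a core")
```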
RE: The fast units seem to
Within the 1.47 and 1.50 population, my observation is that CPU time and elapsed time are closely correlated, with the units which will take somewhat more elapsed time requiring far more CPU time.
Comparing the 1.47/1.50 beta to the preceding 1.39 non-beta application, the faster beta units take appreciably less CPU time on my hosts than did the 1.39, while the slowest beta units take somewhat more, I think.
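For anyone who wants to check that correlation against their own results, a minimal sketch (the numbers below are placeholders, not real task data; substitute CPU and elapsed seconds from your own completed tasks):

```python
# Pearson correlation between charged CPU time and elapsed time across tasks.
# The values below are placeholders only; replace them with your own task data.
from statistics import correlation  # available in Python 3.10+

cpu_seconds     = [1150, 1400, 5600, 9200, 11400]      # placeholder data
elapsed_seconds = [12881, 13000, 15200, 16500, 17900]  # placeholder data

print(f"Pearson r = {correlation(cpu_seconds, elapsed_seconds):.2f}")
```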
I was speaking of two WUs
I was speaking of two WUs running 1.50.
I have two running in parallel now; one is using 22% CPU and the other 3%.