Hi,
This host (GTX 1050 on Linux) has produced only invalid and inconclusive results for the GPU GW application so far, regardless of whether 1x, 2x or 4x simultaneous jobs were running.
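(Side note on the 1x/2x/4x multiplicity, in case anyone asks: one common way to set it is an app_config.xml in the Einstein@Home project directory, roughly like the sketch below. The app name here is just a placeholder and would have to match the actual GW GPU app name shown in BOINC Manager; the project's own "GPU utilization factor" web preference achieves the same effect.)

    <app_config>
      <app>
        <!-- placeholder name: use the GW GPU app name reported by BOINC Manager -->
        <name>einstein_O2AS20-500</name>
        <gpu_versions>
          <!-- 1.0 = one task per GPU, 0.5 = two, 0.25 = four -->
          <gpu_usage>0.5</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>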
Is there any point in keeping it running (i.e. helping to optimize the verification process) or should I disable the beta test?
Other projects run fine; FGRP had a long-term invalid rate of <1%, IIRC.
That host also has eight recently generated inconclusive results on V1.01 GW work, which is CPU-only. I don't know what fraction of those will eventually be found invalid (for GPU V1.07 work, an inconclusive result usually predicts an eventual invalid with high confidence, but CPU may be different).
While we have seen quite a range of host invalidation rates and patterns on V1.07 GPU work, I don't think the results on that machine for the CPU-only V1.01 work are in the expected range. Possibly the machine is unhealthy in some way relevant to both CPU and GPU GW work on the current applications.
Bernd has advised that invalid (and, on the way there, inconclusive) results are currently valuable to the GW GPU beta. I'm unclear on whether he values long-continued production of invalid results from a machine already shown to produce them. Personally, I've chosen to move a machine which produced 100% GPU GW invalids back to GRP, but I keep running GW on a machine which bursts on and off: a dozen or more valid tasks in a row, then a half dozen invalid in a row, and back again.
Hm. I never had an invalid CPU task, as far as I remember. I'll keep an eye on that. Unfortunately, I can't see the wingman's results.
Keep an eye on it, but also remember that any beta GPU task will be paired with a non-beta CPU task, and if the pair is declared inconclusive then both will show that status until the tiebreaker is in. So it might turn out that your CPU is still doing just fine.
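To picture why both copies sit in that state, here is a toy sketch of quorum-style validation (an illustration only, not Einstein@Home's actual validator, which compares full result files rather than single numbers):

    def quorum_status(results, tolerance=1e-5):
        # results: one summary number per returned task (a toy stand-in
        # for the real result files the validator compares)
        if len(results) < 2:
            return "pending"                  # quorum not yet filled
        a, b = results[0], results[1]
        if abs(a - b) <= tolerance:
            return "both valid"               # the initial pair agrees
        if len(results) == 2:
            return "inconclusive"             # disagreement: wait for a tiebreaker
        c = results[2]                        # tiebreaker task
        if abs(a - c) <= tolerance:
            return "first valid, second invalid"
        if abs(b - c) <= tolerance:
            return "first invalid, second valid"
        return "still inconclusive"           # no two results agree yet

    # GPU result disagrees with its CPU wingman: both sit as "inconclusive"
    print(quorum_status([0.9, 1.0]))          # -> inconclusive
    # the tiebreaker matches the CPU result, so the GPU copy ends up invalid
    print(quorum_status([0.9, 1.0, 1.0]))     # -> first invalid, second valid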
Holmis wrote: any beta-GPU task will be paired with a non-beta CPU task...
Good point. This is where the invisibility of quorum partners until final resolution impairs our ability as users to judge. For example, it could be that all of the CPU tasks from this machine currently showing inconclusive status have as quorum partner one single GPU machine that is producing 100% invalid results on GW V1.07 (such machines exist).
But because this GW search is configured to allow single-task quorums with known reliable CPU hosts (not GPU), the quorum invisibility is in effect.
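Roughly, the effect of that configuration can be pictured like this (a toy model only, not the real scheduler; the reliability criterion and threshold here are invented for illustration):

    def initial_copies(consecutive_valid_results, is_beta_gpu_task, reliable_threshold=10):
        # Toy model of "adaptive replication": beta GPU work always gets a
        # second copy for cross-checking, while a CPU host with a long run of
        # valid results may be trusted with a quorum of one (no visible wingman).
        if is_beta_gpu_task:
            return 2
        if consecutive_valid_results >= reliable_threshold:
            return 1
        return 2

    print(initial_copies(50, is_beta_gpu_task=False))   # -> 1 (single-task quorum)
    print(initial_copies(3, is_beta_gpu_task=False))    # -> 2 (paired with a wingman)
    print(initial_copies(50, is_beta_gpu_task=True))    # -> 2 (beta work is always checked)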
NVIDIA GeForce GTX 1050 (1999MB) driver: 430.40
You might try a different version of the driver. 375.39 shows as current on the NVIDIA website; I'm wondering if your driver is a beta release, as I don't remember seeing a production driver with that high a revision number. While I was running the GW beta tasks, my NVIDIA validation rate was between 40 and 60%; of course, the devs were still tweaking the validation process. Another note: my NVIDIA card is in a Windows box, so mileage will vary.
I'm using the NVIDIA driver supplied by Debian testing (non-free).
nvidia.com lists version 430.50 as the current driver release:
https://www.nvidia.com/Download/driverResults.aspx/151568/en-us
As of now, all 22 GPU-GW tasks have turned out invalid, while all 11 CPU-GW tasks have turned out valid.
I've disabled beta for now. There are still many inconclusive tasks yet to be verified.
For slightly more than a week I have been getting invalid results for almost all GPU-GW V1.07 tasks, whereas before that most of them validated successfully. CPU tasks are mostly okay. Has something happened to the data or to the GPU app? I've noticed that my GPU tasks don't validate against CPU results, but CPU-to-CPU validates okay.
Alexander Favorsky wrote: For slightly more than a week I have been getting invalid results for almost all GPU-GW V1.07 tasks...
This is a common issue, discussed at length in this link.
Was there any resolution to this?
Jacob Klein wrote: Was there any resolution to this?
If by "this", you are referring to lots of invalid results for previous versions of the GW GPU app, then the answer is "yes". The app is still under test but current versions seem to be working OK and giving valid results.
If you are referring to something else, please specify.
This thread that you have posted in refers to the V1.07 app for the O2AS (All Sky) search. V1.09 of the app was the one that finally achieved a high proportion of valid results. That search was abruptly terminated and a new search, O2MD1 (Multi Directed), was started, whose aim was to use the now successful GPU app to 'target' some known pulsars rather than do a much lengthier 'all sky' search (which I guess they'll come back to at some later point). The app is now at V2.0x and has had some more tweaks. Results seem to be valid, but there are some oddities with how the scheduler distributes work.
As conditions will change and problems will show up from time to time, if you want to participate when new apps are being tested, you really need to follow the discussion threads for each new search as it comes along. That way you will always know of problems and fixes as they happen.
BTW, the link in the comment by Matt White (immediately before yours) points to a non-existent thread. That is a deficiency in the forum software, which is not able to fix links to threads whose titles have subsequently changed. When it became apparent that GW searches could change quickly (e.g. from O2AS to O2MD), I changed the title of that thread so that there could be two discussion threads, one for each different search. So the link to the thread that Matt was actually pointing to at the time he posted should now be this link. This is ancient history for now, but it wouldn't surprise me to see O2AS resume once O2MD finishes, which probably won't be all that far away :-)
EDIT: After posting the above, I've just now caught up with the latest in the O2MD1 discussion thread. It seems there may be a significant problem with the very recently released V2.02 app, such that lots of results are giving validate errors when eventually presented for validation. Anyone reading this would be wise not to run this new version until Bernd has had time to comment on the problem.
I started running this version on a couple of hosts late yesterday, 14hrs ago. There are 3 attempted validations so far, all of which are validate errors, so I've suspended crunching of GW tasks and returned the machines to FGRPB1G until further advice from Bernd.
Cheers,
Gary.