Firebird
Joined: 10 Mar 05
Posts: 7
Credit: 90,397,577
RAC: 0
16 Jan 2010 19:19:57 UTC
Topic 194734
I just changed my video card from a 220gt to a 9800gt, and now every workunit is showing a validate error. The driver is version 195.62. Any ideas on how to fix it?
Validate Errors
Someone at Folding@home has released an nVidia GPU memory checker. The results were, and still are, an eye-opener for nVidia, and Fermi-based cards come with ECC memory...
This problem is not limited to nVidia; AMD/ATI is affected too.
Most likely, the validation problems are not caused by a hardware problem, but by problems that surfaced in the transition phase from the ABP1 to the ABP2 searches.
Details can be found here, for example.
New GPU apps have been distributed automatically, so the problem should be resolved in a few days, after all new results are computed with the new apps.
CU
HB
Well, that got my hopes up, so I put the card in a different computer. Now it's erroring everything it runs. The workunits run normally, everything looks fine until they validate, or rather don't.
I think you were unlucky enough to still catch some of those transition workunits that had some of the flawed initial ABP2 apps as "wingmen".
All the recent results (lots of them) here http://einsteinathome.org/host/2263082/tasks seem to validate just fine, and with good performance.
CU
HB
Those are after I switched the card back to the 220gt. The 9800 gt went in a 3.0 gig pentium 4 box, then that one started running junk. I detached it from einstein, it's trying seti now. We'll see how that works for it. The card for sure doesn't run einstein though.
It might be the individual card that has a problem; there's no fundamental problem with the 9800 GT, as I have one myself and it's doing fine.
Again, the validation errors I could find for your host were with workunits that had earlier ABP2 hosts as wingmen, and you can see that even some of your 220GT crunched units had a validation problem with them:
e.g. http://einsteinathome.org/task/155515818
I suspect it was a coincidence that you switched cards at almost the exact moment when your host ran out of the "problematic" workunits. I'm quite sure your 9800 GT will be doing great on E@H.
CU
HB
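That one started with the 220, then switched to 9800 in the middle.
[19:43:43][4552][INFO ] Using CUDA device #0 "GeForce 9800 GT" (462.00 GFLOPS)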
Ok, I see, but still it's one of these units that have a high initial replication and were practically bound to fail for some wingmen:
http://einsteinathome.org/workunit/65820105
CU
HB
I'm sure it's just that particular card, not 9800 in general. For some reason though it REALLY doesn't like this project. It's running Seti fine, everything validating OK there.
FWIW,
http://www.bfgtech.com/bfgr981024gtge.aspx
That's the exact card in question.
I'm not sure what you mean by "running junk" but I've had a good look at your 3.0GHz Pentium 4 host with the 9800GT and it is showing 6 recently completed tasks, 2 with the outcome of 'success' and 4 with the outcome of 'validate error'. These 4 are not necessarily 'invalid' - 'validate error' means that there was a problem on the server when the validator tried to perform a validation on the 2 members of the quorum. It is possible for this error to be caused by the files not being available on the server in the expected location when the validator tried to do the validation. Check out the explanatory link under the 'outcome' heading.
I'm only guessing here but the ABP2 task load is having a noticeable effect on server performance. In a day there are many short periods where the server response is woeful. The thought occurred to me that you might be trying to 'return results immediately'. If so, the 'reporting' phase might be happening too quickly after the 'uploading' phase and this with a bogged down server might explain what you are seeing. So, are you trying to 'return results immediately'?
You say you have detached from E@H. That host's task list shows a swag of tasks which are still assigned to it. Why don't you give it another try with NNT (no new tasks) set so the host will crunch and upload the results but will not be asking for new work and so won't be trying to report immediately. Once you have a few done, try reporting them well after they upload and see if that gets rid of the 'validate error'.
Cheers,
Gary.
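For anyone who wants to try the NNT suggestion above from the command line rather than the BOINC Manager, here is a minimal sketch. It assumes the `boinccmd` tool that ships with the BOINC client is on the path, and it takes the project URL from the links in this thread; the helper names `boinccmd_args` and `nnt_plan` are made up for illustration. It only builds and prints the commands, it does not run them.

```python
import shlex

# Assumption: the project URL as it appears in the links in this thread.
PROJECT_URL = "http://einsteinathome.org/"


def boinccmd_args(operation, project_url=PROJECT_URL):
    """Build the argument list for one boinccmd project operation."""
    allowed = {"nomorework", "allowmorework", "update"}
    if operation not in allowed:
        raise ValueError(f"unsupported operation: {operation}")
    return ["boinccmd", "--project", project_url, operation]


def nnt_plan(project_url=PROJECT_URL):
    """Commands for the suggestion above: set NNT now, contact the project later."""
    return [
        # Step 1: no new tasks, so the host stops asking for work.
        boinccmd_args("nomorework", project_url),
        # Step 2: run this by hand well after the uploads have finished,
        # so the scheduler contact happens long after the upload phase.
        boinccmd_args("update", project_url),
    ]


if __name__ == "__main__":
    for args in nnt_plan():
        print(shlex.join(args))
```

Run the first printed command once, let a few tasks finish and upload, then run the second one some time later and check whether the 'validate error' outcome still shows up. `allowmorework` undoes NNT when you are done.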