Validate error - What this really means!

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7228681575
RAC: 1124846

wolfman1360 wrote:

I was just going through my tasks. Started crunching last night and I already have 10 invalid.

https://einsteinathome.org/host/12775393/tasks/5/0

 

I'm not sure what to look for here. GPU has the mildest of overclocks, so I reverted to stock settings in hopes that would help. I can't find any error in any of the logs but again, not sure exactly what to look for / what would actually be causing the errors in the first place.

You already have almost as many validate errors as you have of successful validations.  That is very much worse than normal results.  Worse yet, so far your invalid tasks are all "validate error", and not "Completed marked as invalid".  This means that when a quorum partner had also returned a result against this workunit, each partner was subjected to a basic sanity check before comparison.  Your result flunked the sanity check, so comparison did not occur.

Most of us get very roughly 1% marked as "Completed marked as invalid" where the first comparison was not close enough for both returns to be judged valid--then a third task sent out turned out to be better matched to the "other fellow" than ours.  Your outcome is far, far worse than normal.
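To make the difference between the two outcomes concrete, here is a minimal sketch of a two-stage check like the one described above. It is not the project's actual validator: the sanity rule, the tolerance, and all function and host names are assumptions made up for illustration.

```python
import math
from enum import Enum
from itertools import combinations


class Outcome(Enum):
    VALID = "Completed and validated"
    INVALID = "Completed, marked as invalid"
    VALIDATE_ERROR = "Validate error"
    INCONCLUSIVE = "Inconclusive, another task is sent out"


def passes_sanity_check(result):
    # Hypothetical sanity rule: non-empty and made of finite numbers only.
    return len(result) > 0 and all(math.isfinite(x) for x in result)


def close_enough(a, b, tolerance=1e-3):
    # Hypothetical comparison: same length, element-wise agreement within a tolerance.
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def validate_workunit(results):
    """Map each host name to an Outcome for one workunit."""
    outcomes = {}
    sane = {}
    # Stage 1: results that flunk the sanity check are never compared.
    for host, result in results.items():
        if passes_sanity_check(result):
            sane[host] = result
        else:
            outcomes[host] = Outcome.VALIDATE_ERROR

    # Stage 2: compare the surviving results pairwise.
    agreeing = set()
    for (host_a, res_a), (host_b, res_b) in combinations(sane.items(), 2):
        if close_enough(res_a, res_b):
            agreeing.update((host_a, host_b))

    for host in sane:
        if not agreeing:
            outcomes[host] = Outcome.INCONCLUSIVE  # no quorum yet
        elif host in agreeing:
            outcomes[host] = Outcome.VALID
        else:
            outcomes[host] = Outcome.INVALID
    return outcomes


first_round = validate_workunit({"mine": [1.0, 2.0], "partner": [1.0, 2.5]})
second_round = validate_workunit(
    {"mine": [1.0, 2.0], "partner": [1.0, 2.5], "third": [1.0, 2.5002]}
)
broken = validate_workunit({"mine": [float("nan")], "partner": [1.0, 2.5]})
```

In this sketch, `first_round` leaves both results inconclusive, `second_round` marks "mine" as invalid once the third result sides with the partner, and `broken` gives "mine" a validate error without any comparison taking place.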

Something is wrong.  In case it is clock rate related, I'd suggest you move the core clock down by at least 10%.  Yes, if your "mildest" overclock is only 4%, I suggest you take it down to -6% (there is nothing magic about 0).  If it does not get noticeably better, it is not so likely to be clock rate related.  If it gets better, you can fiddle to find a better operating point.

If you have not already done so, perhaps a full cold-iron reboot is in order, followed by a full driver removal with clean install.

I'm not promising any of this will help, just suggesting things you might try.

wolfman1360
Joined: 17 Feb 17
Posts: 19
Credit: 33664141
RAC: 0

archae86 wrote:

That is very much worse than normal results.  Worse yet, so far your invalid tasks are all "validate error", and not "Completed marked as invalid".  This means that when a quorum partner had also returned a result against this workunit, each partner was subjected to a basic sanity check before comparison.  Your result flunked the sanity check, so comparison did not occur.

Most of us get very roughly 1% marked as "Completed marked as invalid" where the first comparison was not close enough for both returns to be judged valid--then a third task sent out turned out to be better matched to the "other fellow" than ours.  Your outcome is far, far worse than normal.

Something is wrong.  In case it is clock rate related, I'd suggest you move the core clock down by at least 10%.  Yes, if your "mildest" overclock is only 4%, I suggest you take it down to -6% (there is nothing magic about 0).  If it does not get noticeably better, it is not so likely to be clock rate related.  If it gets better, you can fiddle to find a better operating point.

If you have not already done so, perhaps a full cold-iron reboot is in order, followed by a full driver removal with clean install.

I'm not promising any of this will help, just suggesting things you might try.

What I have done so far:

Updated my display driver. I didn't realize it doesn't auto-update, and since Lenovo ships a ridiculously outdated driver from 2017, it jumped from 22.19.128.0 to 25.20.15031.1000.

Rebooted multiple times, just to make sure things stuck and Windows Update didn't attempt to reinstall a fresh driver. Lenovo Vantage similarly left it alone.

I haven't touched clock speed yet since I just now brought it back to stock settings (clicking reset to defaults in the AMD software).

The CPU has more than enough headroom (I think). This is a Ryzen 1800x with hyper threading enabled, so I have the CPU set to 93% usage.

This is making me very hesitant to bring other machines onto the project. I haven't completed a CPU task yet (though will be shortly) and it will be quite the frustration if the 16 currently in progress have failures.

Right now, the project settings in regards to the GPU are all set to one task at once, too.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117754145437
RAC: 34826516

wolfman1360 wrote:

The CPU has more than enough headroom (I think). This is a Ryzen 1800x with hyper threading enabled, so I have the CPU set to 93% usage.

This is making me very hesitant to bring other machines onto the project. I haven't completed a CPU task yet (though will be shortly) and it will be quite  the frustration if the 16 currently in progress have failures.

Right now, the project settings in regards to the GPU are all set to one task at once, too.

I haven't checked, but I think a Ryzen 1800x is 8/16 (cores/threads).  If you are running 16 CPU tasks, it would really be a problem for GPU crunching.  By default, a GPU task should 'reserve' a CPU thread for support.  If BOINC is only allowed to use 93% of threads for CPU crunching, there should be only 0.93 × 16 = 14.88 threads, minus a further 1 for the GPU, leaving 13.88 threads available.  BOINC can use a partial thread so you should have 14 CPU tasks only that are actually running.  If there really are 16 CPU tasks running (none waiting to run), that might explain your GPU behaviour.  How have you managed to achieve that with the settings you say you are using?

If you look at your tasks list for your machine, check out the crunch times for completed GPU tasks.  I saw one example of 760s which is probably about right but a bit on the slow side for what your GPU is capable of.  Look at the others - between 1100s and 1900s - classic signs of inadequate CPU 'support' and maybe this is related to your validate errors.
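As a rough illustration of that crunch-time check, the sketch below flags GPU run times that sit well above a baseline. The run times are the ones mentioned above; the function name and the 1.25 threshold are assumptions chosen only for the example, not a project rule.

```python
def flag_slow_tasks(run_times_s, baseline_s, tolerance=1.25):
    """Return the run times (in seconds) that exceed baseline_s * tolerance."""
    return [t for t in run_times_s if t > baseline_s * tolerance]


# Run times quoted in the post: one task at 760s, others between 1100s and 1900s.
observed = [760, 1100, 1900]
slow = flag_slow_tasks(observed, baseline_s=760)
print(slow)  # [1100, 1900]
```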

You really need to start slowly and gradually build up.  Your most efficient way to contribute with that machine is to keep the GPU happy.  It should happily run 2 concurrent tasks so, eventually, you should plan to get to that state.  I would suggest you allow BOINC to use 50% of threads for starters.  Leave everything else at default values.  That should allow 7 CPU tasks and 1 GPU task to be running concurrently.  I'm assuming you are NOT running a non-default GPU utilization factor or an app_config.xml file?  You haven't mentioned it and you would be wise to become comfortable with stock standard behaviour before ever going down that route.
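The thread arithmetic above can be sketched as follows. This only reproduces the reasoning in this post, including the rounding up to a whole task; it is not taken from the BOINC scheduler, and the function and parameter names are made up for illustration.

```python
import math


def cpu_task_slots(threads, cpu_percent, gpu_tasks=1, threads_per_gpu_task=1):
    """Estimate how many CPU tasks run alongside the GPU task(s)."""
    allowed = threads * cpu_percent / 100          # e.g. 16 * 93 / 100 = 14.88
    remaining = allowed - gpu_tasks * threads_per_gpu_task
    # Assumption from the post: a partial thread still allows one more task.
    return max(0, math.ceil(remaining))


print(cpu_task_slots(16, 93))   # 14
print(cpu_task_slots(16, 50))   # 7
print(cpu_task_slots(16, 100))  # 15
```

On a 16-thread machine this gives 14 CPU tasks at 93%, 7 at 50%, and 15 at the default 100%, each alongside one GPU task.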

If you have any difficulty getting to 7 CPU tasks plus 1 GPU task, please tell us what settings you are using and someone will respond as quickly as possible.  Once you have it running at those settings, you should be able to see stable GPU crunch times perhaps around 650-700s.  Leave it that way until you have at least 20-40 GPU tasks completed showing nice stable crunch times and hopefully no validate errors.  We can then make further suggestions.

If validate errors continue, the next thing to think about (if you are definitely not overclocking your GPU) is the PSU.  Make, model and 12V rating would be handy to know.

 

Cheers,
Gary.

wolfman1360
Joined: 17 Feb 17
Posts: 19
Credit: 33664141
RAC: 0

Gary Roberts wrote:

I haven't checked, but I think a Ryzen 1800x is 8/16 (cores/threads).  If you are running 16 CPU tasks, it would really be a problem for GPU crunching.  By default, a GPU task should 'reserve' a CPU thread for support.  If BOINC is only allowed to use 93% of threads for CPU crunching, there should be only 0.93 × 16 = 14.88 threads, minus a further 1 for the GPU, leaving 13.88 threads available.  BOINC can use a partial thread so you should have 14 CPU tasks only that are actually running.  If there really are 16 CPU tasks running (none waiting to run), that might explain your GPU behaviour.  How have you managed to achieve that with the settings you say you are using?

If you look at your tasks list for your machine, check out the crunch times for completed GPU tasks.  I saw one example of 760s which is probably about right but a bit on the slow side for what your GPU is capable of.  Look at the others - between 1100s and 1900s - classic signs of inadequate CPU 'support' and maybe this is related to your validate errors.

You really need to start slowly and gradually build up.  Your most efficient way to contribute with that machine is to keep the GPU happy.  It should happily run 2 concurrent tasks so, eventually, you should plan to get to that state.  I would suggest you allow BOINC to use 50% of threads for starters.  Leave everything else at default values.  That should allow 7 CPU tasks and 1 GPU task to be running concurrently.  I'm assuming you are NOT running a non-default GPU utilization factor or an app_config.xml file?  You haven't mentioned it and you would be wise to become comfortable with stock standard behaviour before ever going down that route.

If you have any difficulty getting to 7 CPU tasks plus 1 GPU task, please tell us what settings you are using and someone will respond as quickly as possible.  Once you have it running at those settings, you should be able to see stable GPU crunch times perhaps around 650-700s.  Leave it that way until you have at least 20-40 GPU tasks completed showing nice stable crunch times and hopefully no validate errors.  We can then make further suggestions.

If validate errors continue, the next thing to think about (if you are definitely not overclocking your GPU) is the PSU.  Make, model and 12V rating would be handy to know.

 

I'm starting to see why you say this now. Thank you for the detailed post; I'll attempt to address your questions as best I am able.

It has been a while, so please bear with me.

Yes, you are correct. 8 cores, 16 threads.

On the website, I separated this machine from all the others (the Core 2 Duo should have no issues with CPU-only tasks) and set it to the home profile. I then changed the 'home' profile to use 50% of the CPU. I have it set not to do work while the machine is in use, eliminating yet one more factor. Yet when navigating to CPU preferences in BOINC, I don't see a "use web preferences" button. I must be missing something plainly obvious here; I'm just not sure what. I changed it locally, but it would be nice if I could get web preferences to stick as I was able to do with WCG. Since that was CPU only, it was set-it-and-forget-it, with the occasional change to various projects via the website. I also clicked Update in case that might transfer the web preferences, but after waiting 15 minutes with no result I resorted to changing them locally. As a side note, I also have WCG attached (though it isn't running any active tasks, and isn't set to). Would this be conflicting? From what I vaguely remember reading, the site you changed settings on last is the one BOINC grabs its changes from?

Note: After more testing, it appears that WCG is where BOINC thinks it should grab the preferences from. I'm unsure why, so I'm detaching it for now in hopes that will fix things. Frustrating.

Apparently, I miscounted: there are currently 15 CPU tasks in progress. I started out with everything at the default setting of 100% CPU usage because I forgot the GPU takes one CPU core (in this case, a thread) to crunch.

No, I haven't touched GPU utilization. I do not have an app_config.xml file set up and the GPU is not overclocked or undervolted.

Not sure of the make or model of the PSU. I have HWiNFO installed. Would I be able to get the 12V rating from there?

Hopefully I provided the necessary information. Please let me know if I've left anything out.

Again, thank you.

Looks like an error happened this time. https://einsteinathome.org/host/12775393/tasks/6/0

I'm really hoping this sorts itself out at some point. A lot of CPU cycles lost there.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

wolfman1360 wrote:
Yet when navigating to CPU preferences in BOINC, I don't see a "use web preferences" button. I must be missing something plainly obvious here; I'm just not sure what. I changed it locally, but it would be nice if I could get web preferences to stick as I was able to do with WCG.

Local preferences always override web prefs for computing; project prefs, on the other hand, can only be set on the project website. To revert to using web prefs, open "Computing prefs" in BOINC and click the "Use web prefs" button at the top right (it's labeled "Clear" in older versions of BOINC).
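A minimal sketch of that precedence rule, under the assumption that a local setting simply wins over the web value for the same key. The function name and the dictionary keys are illustrative only and are not read from any BOINC file.

```python
def effective_computing_prefs(web_prefs, local_override=None):
    """Return the computing prefs in force: local values win over web values."""
    if not local_override:
        # No local prefs set (or they were cleared): web prefs apply unchanged.
        return dict(web_prefs)
    return {**web_prefs, **local_override}


web = {"cpu_usage_percent": 50, "run_while_in_use": False}  # hypothetical keys
local = {"cpu_usage_percent": 93}

print(effective_computing_prefs(web, local))  # {'cpu_usage_percent': 93, 'run_while_in_use': False}
print(effective_computing_prefs(web))         # {'cpu_usage_percent': 50, 'run_while_in_use': False}
```

Clearing the local prefs corresponds to calling the function without `local_override`, after which the web values apply again.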

Quote:
As a side note, I also have WCG attached (though it isn't running any active tasks, and isn't set to). Would this be conflicting? From what I vaguely remember reading, the site you changed settings on last is the one BOINC grabs its changes from?

This should not interfere or conflict. BOINC should use the prefs from the project where you last edited and saved them.

Quote:
Looks like an error happened this time. https://einsteinathome.org/host/12775393/tasks/6/0

The error for that task was "194 (0x000000C2) EXIT_ABORTED_BY_CLIENT". Why BOINC decided to abort the task is unfortunately not reported.
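The decimal and hexadecimal numbers in that message are the same exit code written two ways. A small sketch to convert between them (the helper name is made up for illustration):

```python
def format_exit_code(code):
    """Format an exit code as decimal plus zero-padded 32-bit hexadecimal."""
    return f"{code} (0x{code:08X})"


formatted = format_exit_code(194)
print(formatted)                # 194 (0x000000C2)
print(int("0x000000C2", 16))    # 194
```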

Pushkin
Joined: 12 Mar 07
Posts: 15
Credit: 33187685
RAC: 0

Hi,

I have also had problems with Ryzen in the last few weeks. Please see https://einsteinathome.org/task/858100371

Got many of them on this PC: https://einsteinathome.org/host/12649345

According to my score, something happened on May 17th. Since then it has dropped continuously (from approx. 14,000 points to less than 4,000 points).

Any hints?

Thank you!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117754145437
RAC: 34826516

Pushkin wrote:

I have also had problems with Ryzen in the last few weeks. Please see https://einsteinathome.org/task/858100371


This thread was for the purpose of reporting a specific type of error - a validate error.  With a validate error, tasks crunch successfully but the results contain absolute rubbish.  In your case, your tasks mostly fail to crunch.

The current O2AS GW search has been running for a while now and I've not seen any widespread mention of errors or problems due to the app or the data.  It's most likely that your problems are specific to your machine.  It could be a hardware problem, a power or excess heat problem or perhaps the way you have configured it.

Are you overclocking or have you made voltage or similar mods?  Have you tested the RAM or tried a different PSU to see if you can identify any dodgy components?  Unfortunately, it's probably only you who can figure out the cause of the problem.

Cheers,
Gary.

Moises Cardona
Joined: 2 Jul 10
Posts: 2
Credit: 128955
RAC: 0

I'm getting "Marked as Invalid" with a Radeon RX 570 GPU :( 

https://einsteinathome.org/host/12781036/tasks/0/0

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Moises Cardona wrote:

I'm getting "Marked as Invalid" with a Radeon RX 570 GPU :( 

https://einsteinathome.org/host/12781036/tasks/0/0

The advice would be more or less the same as what Gary gave in the post before yours.
The difference is that you have a GPU, rather than a CPU, that's not working as it should, and you manage to complete results that don't match what others get when they process the same task. It's still not the "Validate Error" that this thread is supposed to be about.

Gary wrote: "It could be a hardware problem, a power or excess heat problem or perhaps the way you have configured it." and "Are you overclocking or have you made voltage or similar mods?  Have you tested the RAM or tried a different PSU to see if you can identify any dodgy components?"

For anyone else seeing errors and feeling this thread might be a good place to post about them, please make sure the error shown on the tasks page reads exactly "Validate Error" and nothing else. If you're experiencing anything else, then start a new thread and we (volunteers or the staff) will look into it.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117754145437
RAC: 34826516

Moises Cardona wrote:

I'm getting "Marked as Invalid" with a Radeon RX 570 GPU :( 

https://einsteinathome.org/host/12781036/tasks/0/0

As is clearly stated in the opening post for this long running thread, "Completed, marked as invalid" is NOT a "Validate error".  They are two totally different things.  None of your completed work seems to agree with that of your quorum partners so your answers are seen as wrong - maybe only slightly wrong - but they are NOT seen as total rubbish (which is what a validate error is).

You need to start a new thread and give all the operating conditions / driver details etc., you are using.  Without a lot more information about how you have set up and are using your GPU, it's impossible to know what might be causing the problem.  If I had to guess, I'd suggest you may be operating your card outside of its standard operating settings.  If so, set everything back to default and see if your results then start agreeing with those of the other computers which complete the same tasks and do agree about the answers.

If you give a lot more of the details, it may be possible for someone to give you better suggestions.

EDIT: Thanks Holmis, I hadn't seen your response when I posted mine :-).

Cheers,
Gary.
