It’s starting to look like certain WUs won’t validate no matter how many times they are crunched and, because they are constantly circling around again and again, they are becoming a plague. Could admin turn off the re-issue of WUs that fail validation until the problem is fixed? dAVE
I agree. We are wasting a lot of time trying to recover the same errors for teammates, and vice versa, as if we are going round in circles. It would be nice to hear from admin whether they know what is wrong and whether they are doing anything to solve the problem.
Two of my invalid results are already back and valid. The cruncher was one of Bruce Allen's hosts. ;)
Btw, this problem is/was pretty massive, as our team lost about 30,000 credits during the last 24 hours.
http://einsteinathome.org/workunit/18338522
http://einsteinathome.org/workunit/18329162
cu,
Michael
Jumping the gun a bit, I fear:
I can only imagine what you mean,
but I fear you're right. ;)
Message from Dr. Bruce Allen on this subject.
I've had my first ever one as well. Oh well, all is back to normal.
> Had a workunit that was only progressing every 3 to 4 seconds (dual-core AMD 4800+; the adjacent workunit accumulates time in 5-second blocks, but this one ticked over 1 second for every 3 to 5 seconds of the other WU).
This helped blow out the processing time from the normal 19000 to 20000 seconds to 27974 seconds.
When finished, it failed validation even though it completed successfully.
http://einsteinathome.org/task/57646715
Yes, there have been a few reports coming in lately of results running unusually long on a host (up to nearly twice as long), even though there didn't seem to be a reason why they should (i.e. no indication of losing the checkpoint file along the way) and the result partner(s) ran them within their normal limits.
However, IIRC, the cases I looked at still validated, so yours is even more unusual in that regard.
Was there anything else going on with the host, that you recall, that may have been bogging that core down?
Alinator
Thanks Alinator. The computer in question only does BOINC, with occasional use for internet browsing, eBay and such. Other than that it does BOINC only.
It is possible, I suppose, that a virus/worm/trojan got through my defenses and caused a problem that I am unaware of, but my current virus programme (ZoneAlarm) has not detected anything.
One core is definitely running differently to the other (and still is). It does not matter if it is Einstein, Rosetta or Ralph: when one core ticks over 3 to 5 seconds, the other core ticks over 1 second. So BOINC Manager will show a normal time, but the task has in fact run for at least half as long again as it should have.
It looks like I have a computer problem then, but I have no idea how to fix it.
As a last resort I may stop BOINC on this computer due to the huge processing times but low output. I have gone from 500 to 600 credit outputs to 150 to 250.
Any ideas, anyone?
Have you checked the system monitor (I think that's what Windoze uses)? From my experience with dual core, it sounds like you have a task using a large amount of CPU time. It's only going to affect one core because it's on one thread. I can almost guarantee something is running that shouldn't be.
Thanks Clownius. I had another look through the startup items in the System Configuration Utility (the ones that load when Windows boots) and found a process that had no name but was being loaded and run.
I unticked it, applied and rebooted, and now everything is running normally again.
I have no idea what the programme is, but it was taking about 75% of my CPU resources, as that CPU ticked over one second after the other core had ticked over 4 seconds.
Very strange. Also, now that I have unticked the programme, instead of staying on the list to be ticked again if needed, it has disappeared altogether. So possibly a virus/worm/trojan??
No wonder I prefer Linux.
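For anyone wanting to sanity-check the 75% figure, here is a minimal sketch of the arithmetic, assuming the unaffected core's counter tracks wall-clock time and the slow core's counter only advances while the task actually gets the CPU. The function name `cpu_share_lost` is made up for this illustration and is not part of BOINC.

```python
def cpu_share_lost(task_seconds: float, wall_seconds: float) -> float:
    """Return the fraction of one core's time that did NOT reach the task.

    task_seconds: CPU seconds the task's counter advanced.
    wall_seconds: seconds that passed over the same interval (assumed here
                  to be what the unaffected core's counter shows).
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 1.0 - task_seconds / wall_seconds


# The thread reports 1 second of progress for every 3 to 5 seconds elapsed.
for wall in (3, 4, 5):
    share = cpu_share_lost(1, wall)
    print(f"1 s of progress per {wall} s elapsed: {share:.0%} of the core lost")
```

With 1 second of progress per 4 seconds elapsed this gives 75%, which matches the estimate above; the 3 and 5 second cases bracket it at roughly 67% and 80%.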