I'm only on about my 4th day, and believe it or not I was able to solve the other 2 "problems", but thanks for the help on the other one. :)
I have by now completed several jobs on machines ranging from a 1.2 Athlon to an Athlon 64 3200 with completion times running between roughly 4 and 18 hrs. I have one running on the 1.2 Athlon that reached 80% within about 6 hrs but is now only 87.2% complete after 36 hrs. Is this just a slow job or does it sound like there could be a problem with the job?
F. Prefect
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.....Douglas Adams
Copyright © 2024 Einstein@Home. All rights reserved.
Is this in the range of "normal"?
Ford,
Well done on your progress to date and congratulations for solving your own problems. That always feels good, doesn't it :).
You have 4 machines connected to the project and whilst we can see the results lists of each of these, we don't know which particular one is the one you call the "1.2 Athlon". We can tell the Athlon 64 but the other three just say "Athlon". Have you worked out how to use the website to "drill down" into all the information that is available about your own machines and the results they crunch on as compared to the machines and the results of others who are crunching on the same work unit?
I have a 1.2 Duron which does a result every 11 hours. The crunch rate with EAH is relatively uniform over the period of crunching so there is something funny with 80% in 6 hours as that is much faster than I would have expected. Also 87% at 36 hours is way too slow!! Have you done a rough check that the reported time at various stages agrees with wall clock time? Is there a screensaver kicking in and stealing lots of cpu cycles?
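As a rough sanity check on the figures quoted so far, the average progress rate before and after the 80% mark can be worked out from the two checkpoints. This is only a sketch: the function name is made up for illustration, and the numbers are the ones reported in this thread.

```python
def progress_rate(pct_start, pct_end, hours_start, hours_end):
    """Average percentage points completed per hour between two checkpoints."""
    elapsed = hours_end - hours_start
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return (pct_end - pct_start) / elapsed

# Checkpoints reported for the slow job: 80% at about 6 hrs, 87.2% at 36 hrs.
early = progress_rate(0.0, 80.0, 0.0, 6.0)
late = progress_rate(80.0, 87.2, 6.0, 36.0)

print(f"early: {early:.2f} %/hr, late: {late:.2f} %/hr, slowdown: {early / late:.0f}x")
```

For comparison, a machine that crunches uniformly and finishes in 11 hours averages roughly 9 %/hr the whole way through, so both phases of this job look out of line.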
There are a number of things to check out but first of all we need more info, such as the CPUID of the machine in question (is it the Win98 box - 438453?) and details of any preference changes from default values that you may have made on the website. Also, do you support other projects or is it EAH only, and does the machine run 24/7 or do you shut down and restart from time to time?
Cheers,
Gary.
There have been issues reported with BOINC 5.2.X's "time to completion" estimates. However, what you report does sound a little "stranger" than usual. Is this happening on the same box you had the other problems? And, are you saying that the 36 hours is CPU time working on the same Einstein WU?
EDIT: I see Gary is "on the case" now - so I'll defer to him.
I have 3 Athlons (Thunderbird core) running in the 1.1 to 1.3 range. The machine in question has only completed 1 other job, in 9:57, with the other 2 machines completing jobs in roughly the same time frame.
Job in question is now 89.2% complete with a running time of 42+ hours. I hate to delete it after this much time, but if it gives me some error message when I eventually try to upload, I've wasted even more time.
F. Prefect
Using this "clue", I am guessing you are talking about CPUID 438453 as it is the only one of your machines to have completed just one result (in 38825 seconds or 10.8 hours) which is close enough to the time you quoted.
I doubt it will give error messages while uploading. You really need to identify what is using the cpu cycles. It would be helpful if you could give more information along the lines of the questions in my previous message. Now that I know it's Win98, I can tell you that you need a good equivalent of XP's Task Manager, such as Process Explorer from SysInternals, to be able to see exactly what is running and consuming your cpu. It's a free download and a very useful little tool to see what is really going on.
Do you run firewall/antivirus software? Would you be prepared to shut them down completely for a couple of minutes and see if, magically, EAH starts clocking up the percentage completed more rapidly? How up-to-date are you with Win98 fixes? There is a really good unofficial 98SE service pack that makes 98 basically as stable as XP.
Please give us more details, details, details .....
Cheers,
Gary.
Gary,
Check result #2560473 on box 438543. I think that's the culprit, even though it has 3 completed results, the time matches.
(edit)If I email to the gaming place, will it get through to you? ws, or sc?
Ford, details, brother, we (mainly Gary) need specific info to diagnose and help. Also, it would be better if you use just one thread for all these "getting started" problems, instead of opening a new one for each question.
Regards,
Michael
microcraft
"The arc of history is long, but it bends toward justice" - MLK
Michael, you are absolutely right about the time - well spotted!! However I'm still struggling with the fact that there is "supposed" to be only one completed result. Notice also the two numbers 438453 and 438543. My advanced dyslexia really choked for a while on that little combo :).
Cheers,
Gary.
Yes, noticed that # thing, too. It's not a sure thing, due to the time/# of results discrepancy, so the confusion may have originated on the other end. He might be referring to either machine.
my edit, preceding post
microcraft
I apologize for starting more than one thread and appreciate all of your help.
The job finally finished in something like 47 hrs, and attempts to upload produced a computation error message. I attempted to just delete the job (BTW, how is that done?) but was unsuccessful and was also unable to download any additional jobs. I took the quick and dirty approach and uninstalled/reinstalled, and now it appears to be back to running at a pace similar to my 2 other machines with a similar CPU and clock speed.
Thanks again for the help and sorry about all the threads.
Gary
F. Prefect
Ford,
A real horror show in all, huh? In answer to your question, navigate to your Boinc folder. Inside is another folder, "Projects", and inside that is another one, "einstein.phys.uwm.edu". Inside the last one, you will find a file w1_1340.5. This is the large data file from which Einstein has been slicing off workunits, so it contains 10 or more WUs in all. The only way to delete the workunit, if I understand your intent correctly, is to delete the entire w1_1340.5 data file. You can only delete it when Einstein is not running, because Windows will not let you delete a file that is in use. The next time that Einstein runs, it will recognize that it doesn't have a datafile to work from, error-out the current WU, and immediately contact the server to download another datafile. The datafile is about 8 megabytes, so I hope that your internet connection is fresh, because that will probably take 2-3 hours to download (you're on dial-up, correct?).
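If it helps to see what is actually sitting in that project folder before deleting anything, a small script along these lines can list the large files there. It is only a sketch: the install path below is an assumption (adjust it to wherever your Boinc folder lives), the function name is made up for illustration, and it only lists files; it does not delete anything.

```python
from pathlib import Path

def find_data_files(project_dir, min_bytes=1_000_000):
    """Return (name, size_in_bytes) for files of at least min_bytes, largest first."""
    folder = Path(project_dir)
    if not folder.is_dir():
        # Folder not found (wrong path or different install location).
        return []
    found = [(p.name, p.stat().st_size) for p in folder.iterdir()
             if p.is_file() and p.stat().st_size >= min_bytes]
    return sorted(found, key=lambda item: item[1], reverse=True)

# Assumed install location, not confirmed anywhere in this thread.
PROJECT_DIR = r"C:\Program Files\BOINC\projects\einstein.phys.uwm.edu"

for name, size in find_data_files(PROJECT_DIR):
    print(f"{name}: {size / 1_000_000:.1f} MB")
```

A data file of roughly 8 megabytes should stand out at the top of the list. Remember to stop Einstein before removing it by hand.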
BTW, the current Boinc is 5.2.7, which has a far more accurate estimation of time to completion for in-process work, among other subtler improvements. As accurate as its estimation is on the initial WU, it gets better: it learns and adapts on subsequent WUs until eventually it zeroes in on virtually point-blank. My rig consistently takes 5 hrs 3 min or so to process a WU. Since updating to 5.2.7, it's completed 8 WUs. The initial estimate of To Completion time was 5 hr 45 min, and currently that estimate is down to 5 hr 18 min, so the correction gets closer pretty quickly; I expect that after another couple days' work it will be within 7-8 minutes of actual time. There will always be at least 1% inaccuracy, because the final 1% happens extremely quickly (within a few seconds), probably just assembling the result data to send to the server, instead of involving computation.
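To put rough numbers on that convergence, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each finished WU shrinks the estimate's error by the same factor; that is a simplification chosen to fit the figures above, not a description of how Boinc actually adjusts its estimates.

```python
import math

actual = 5 * 60 + 3     # observed run time per WU, in minutes
initial = 5 * 60 + 45   # first To Completion estimate under 5.2.7
current = 5 * 60 + 18   # estimate after 8 completed WUs
done = 8

# Assumption: the error (estimate minus actual) shrinks by a constant factor per WU.
factor = ((current - actual) / (initial - actual)) ** (1 / done)

# How many WUs in total until the error is down to about 7.5 minutes?
target_error = 7.5
total_needed = math.ceil(math.log(target_error / (initial - actual)) / math.log(factor))

print(f"error shrinks by about {1 - factor:.0%} per WU")
print(f"roughly {total_needed - done} more WUs to get within {target_error} minutes")
```

Under that assumption the estimate needs only a handful more WUs to get within 7-8 minutes, which at about 5 hours per WU is in the same ballpark as the "couple days" guess.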
I hope this has been informative and helps you understand better.
Regards,
Michael
(edit) - the computation error on that long WU was not due to the upload process, but because the computation time far exceeded what it should have been. Ironic!
microcraft