I just reported 8 resent tasks manually, with NNT selected, by clicking the Update button. The tasks reported fine, but it's now downloading another 12 resent tasks even with NNT selected. Here's the log. The downloading goes on for miles as it fetches the necessary files. (We have an unlimited data cap.)
23/04/2011 3:19:10 p.m. Einstein@Home Sending scheduler request: Requested by user.
23/04/2011 3:19:10 p.m. Einstein@Home Reporting 8 completed tasks, not requesting new tasks
23/04/2011 3:19:21 p.m. Einstein@Home Scheduler request completed
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1478.40_S5R4__595_S5GC1HFa_1
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1481.25_S5R4__552_S5GC1HFa_1
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1490.95_S5R4__640_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1478.65_S5R4__577_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1478.65_S5R4__576_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1464.20_S5R4__461_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1441.25_S5R4__316_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1484.80_S5R4__559_S5GC1HFa_1
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1484.80_S5R4__558_S5GC1HFa_1
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1479.10_S5R4__633_S5GC1HFa_1
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1447.90_S5R4__134_S5GC1HFa_0
23/04/2011 3:19:21 p.m. Einstein@Home Message from server: Resent lost task h1_1484.80_S5R4__557_S5GC1HFa_0
23/04/2011 3:19:23 p.m. Einstein@Home Started download of skygrid_1480Hz_S5GC1.dat
23/04/2011 3:19:23 p.m. Einstein@Home Started download of h1_1478.40_S5R4
23/04/2011 3:20:08 p.m. Einstein@Home Finished download of skygrid_1480Hz_S5GC1.dat
23/04/2011 3:20:08 p.m. Einstein@Home Started download of h1_1478.40_S5R7
23/04/2011 3:20:14 p.m. Einstein@Home Finished download of h1_1478.40_S5R4
23/04/2011 3:20:14 p.m. Einstein@Home Started download of l1_1478.40_S5R4
23/04/2011 3:21:02 p.m. Einstein@Home Finished download of l1_1478.40_S5R4
23/04/2011 3:21:02 p.m. Einstein@Home Started download of l1_1478.40_S5R7
23/04/2011 3:21:21 p.m. Einstein@Home Finished download of h1_1478.40_S5R7
23/04/2011 3:21:21 p.m. Einstein@Home Started download of h1_1478.45_S5R4
23/04/2011 3:22:00 p.m. Einstein@Home Finished download of l1_1478.40_S5R7
23/04/2011 3:22:00 p.m. Einstein@Home Started download of h1_1478.45_S5R7
23/04/2011 3:22:13 p.m. Einstein@Home Finished download of h1_1478.45_S5R4
23/04/2011 3:22:13 p.m. Einstein@Home Started download of l1_1478.45_S5R4
23/04/2011 3:22:45 p.m. Einstein@Home Finished download of h1_1478.45_S5R7
23/04/2011 3:22:45 p.m. Einstein@Home Started download of l1_1478.45_S5R7
23/04/2011 3:22:51 p.m. Einstein@Home Finished download of l1_1478.45_S5R4
23/04/2011 3:22:51 p.m. Einstein@Home Started download of h1_1478.50_S5R4
23/04/2011 3:23:19 p.m. Einstein@Home Finished download of l1_1478.45_S5R7
23/04/2011 3:23:19 p.m. Einstein@Home Started download of h1_1478.50_S5R7
Is this meant to happen?
I'm using Win 7 64 Ultimate on an i7 980X, with BOINC 6.10.58.
Cheers,
Gary.
Help!! I want to abort tasks but E@H keeps resending them to me
Yes, resending lost tasks seems to be independent of NNT.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
RE: Yes, resending lost
Thanks. So does this mean that if I let BOINC report tasks automatically, I will get more of its own accord, even if I don't want any more?
RE: Thanks so does this
No, to the contrary. You will get sent the tasks again that were not (yet) reported. If you don't want to run Einstein@home anymore, set "no new tasks", then abort the tasks you already got and update the project to report these. You may reset the project after that if you want.
BM
RE: RE: Thanks so does
Thanks. I've aborted all tasks and detached E@H from BAM for the time being. I now have 19 tasks showing as in progress. Sorry, these will have to time themselves out, unless admin (BM) can speed this along?
RE: I just reported 8
OK, because you are reporting your experiences in this particular thread, I'm assuming you are interested in understanding what is going on in order to assist in some way with the cleanup of the remaining tasks in the old run, particularly any resends. I'll try to explain it all as best I can. Please note that these comments and explanations are NOT designed for people who don't want to micromanage and just want a 'set and forget' experience. If you're in that category, please ignore what follows. Please don't take ANY action if what follows is not perfectly clear to you. BOINC will handle it automatically, but rather inefficiently, if left alone.
Your log snippet starts with the reporting of completed tasks but it doesn't say what sort of tasks were reported. I very much doubt you are reporting resent tasks but you could be reporting resends or primary tasks - you can't tell without looking at the actual task names.
I think you may be confusing what the scheduler is calling a "lost task" that is being "resent" with what I call a "resend" task. When tasks are first issued, I call them "primary" tasks. They always have an _0 or _1 suffix. If a primary task fails for whatever reason, the scheduler (when it becomes aware of this failure) will issue a further copy of the task. I call this extra copy a "resend" and you can always distinguish these from primary tasks because they have suffixes like _2, _3, _4, etc, as many as are required to eventually complete the quorum - with a max limit of 20 tasks.
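The suffix rule above can be put into a few lines of Python. This is only a sketch of the naming convention as described in this post: the function name `classify_task` is made up for illustration, and the `_2` example name is hypothetical (the log only shows `_0` and `_1` tasks).

```python
import re

def classify_task(name: str) -> str:
    """Classify a task by its trailing numeric suffix.

    Per the description above: _0 and _1 are "primary" tasks,
    _2 and higher are "resend" tasks.
    """
    match = re.search(r"_(\d+)$", name)
    if match is None:
        raise ValueError(f"no numeric suffix in task name: {name!r}")
    return "primary" if int(match.group(1)) <= 1 else "resend"

# A task name taken from the log, and a hypothetical _2 copy of it.
print(classify_task("h1_1478.40_S5R4__595_S5GC1HFa_1"))
print(classify_task("h1_1478.40_S5R4__595_S5GC1HFa_2"))
```

Note that a "resent lost task" message from the scheduler says nothing about this suffix: a lost task can be a primary task or a resend.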
The snippet above simply shows that your client reported 8 tasks and wasn't asking for any new ones - after all you did have NNT set. You would have to check the suffix on each of those tasks to see if any were resends and it really doesn't matter because it's not connected with what comes next.
The "Resent lost task" messages in your log tell you there was a discrepancy between the scheduler's idea of things and your client's idea of things. There are (at least) 12 tasks (and they are all primary tasks - check for yourself) that the scheduler thinks you should have and that your client doesn't have. As you could imagine, it's extremely important that both sides of the client/server relationship should have identical views of the world. Your client has sent a list of the tasks it has and the server thinks that the client should have more. So the scheduler (irrespective of your NNT setting) will force the client to take these extra tasks (the "lost" tasks) so that each side will agree. It will do this 12 at a time, so it could well be that you will be sent more on the next exchange between client and scheduler - if there are any more lost tasks.
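As a rough illustration of that reconciliation, here is a minimal sketch in Python. It is not the actual BOINC scheduler code: the function name `lost_tasks` and the data structures are assumptions made for the example, and only the batch size of 12 comes from the behaviour described above.

```python
def lost_tasks(server_view, client_view, batch_size=12):
    """Return the tasks the server has on record for a host but the
    client did not list, capped at batch_size per scheduler contact.

    Hypothetical helper: the real scheduler works against its database,
    not plain lists. Note that the NNT setting plays no part here.
    """
    client = set(client_view)
    missing = [task for task in server_view if task not in client]
    return missing[:batch_size]

# Made-up task names: the server thinks the host holds 20 tasks,
# the client only knows about 5 of them.
server = [f"task_{i}" for i in range(20)]
client = [f"task_{i}" for i in range(5)]

first_batch = lost_tasks(server, client)
print(len(first_batch))   # 12 resent on this contact

# On the next contact the client lists what it now holds.
second_batch = lost_tasks(server, client + first_batch)
print(len(second_batch))  # the remaining 3
```

The second call shows why more lost tasks can arrive on a later contact: anything beyond the first 12 is simply held over.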
So how do tasks become lost? The most common reason is simply bad luck. If your client happens to ask for work at a time when the server is under heavy load, it may not receive a reply within a timeout interval. The client stops waiting and reports (on the messages tab of BOINC Manager) a problem talking to the server. It will usually retry after a further interval but this could easily have been prevented if NNT had been set in the interim. In the meantime, the server would have eventually got around to answering your request and would have issued tasks and recorded that fact in the database. Your client is no longer listening so it wouldn't have received the tasks. So the discrepancy now exists. This really isn't much of an issue because the server will notice and make good the discrepancy on the very next contact and your client can't refuse the lost tasks even if NNT is set. So agreement is going to be restored eventually.
There are a number of things in the above snippet that are important to understand. Firstly, these are not tasks that are being downloaded. They are data files, namely skygrid files and lots of LIGO data files - in fact all the data needed to support the 12 lost tasks that the scheduler had just resent to you. Secondly, there would have been about 2GB of data, which would have taken quite a long time to download - and it was for just 12 tasks! Is it any wonder I'm on a crusade to improve this? Take a good look at all the different and unrelated frequencies being sent to you. In other words, check the frequency value included as part of the task name of each lost task being resent. The scheduler is not very smart in the way it assigns work - particularly from now on with so little work left in the current run.
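If you'd rather not eyeball the names, the frequency check can be done with a short script. This sketch assumes the frequency is the second underscore-separated field of the task name, as it appears in the log; the function names are made up for illustration. It counts distinct frequency values, which is only a rough proxy for the number of frequency sets involved.

```python
import re
from collections import Counter

# The 12 "Resent lost task" names from the log in the first post.
RESENT = [
    "h1_1478.40_S5R4__595_S5GC1HFa_1", "h1_1481.25_S5R4__552_S5GC1HFa_1",
    "h1_1490.95_S5R4__640_S5GC1HFa_0", "h1_1478.65_S5R4__577_S5GC1HFa_0",
    "h1_1478.65_S5R4__576_S5GC1HFa_0", "h1_1464.20_S5R4__461_S5GC1HFa_0",
    "h1_1441.25_S5R4__316_S5GC1HFa_0", "h1_1484.80_S5R4__559_S5GC1HFa_1",
    "h1_1484.80_S5R4__558_S5GC1HFa_1", "h1_1479.10_S5R4__633_S5GC1HFa_1",
    "h1_1447.90_S5R4__134_S5GC1HFa_0", "h1_1484.80_S5R4__557_S5GC1HFa_0",
]

def task_frequency(name: str) -> str:
    """Extract the frequency field (kept as a string) from a task name."""
    match = re.match(r"[hl]1_(\d+\.\d+)_", name)
    if match is None:
        raise ValueError(f"unexpected task name format: {name!r}")
    return match.group(1)

def frequency_counts(names):
    """Count how many tasks share each frequency value."""
    return Counter(task_frequency(n) for n in names)

counts = frequency_counts(RESENT)
for freq, n in sorted(counts.items()):
    print(f"{freq} Hz: {n} task(s)")
print(f"{len(counts)} distinct frequencies for {len(RESENT)} tasks")
```

For this log it reports 9 distinct frequencies across the 12 tasks, with only 1478.65 and 1484.80 shared by more than one task.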
Earlier on in this thread I discussed these sorts of issues with Oliver Bock, who (in time) will be able to fix the problems. This was a couple of weeks ago and there were more tasks available then. What happened to you is exactly what I described to Oliver in this message that I posted on April 7. Here's the critical bit.
Note in the last paragraph above, I commented that 10 tasks would be supplied from about 7 different frequency sets. In your case you got about 12 tasks from about 9 different frequency sets. This is the sort of inefficiency that I'm concerned about and (hopefully) Oliver will vastly improve.
Any participant can make a big improvement by their own actions. To prevent this unnecessary jumping to multiple frequency sets, just make sure you don't ask for a big number of tasks in one single hit if you don't have the necessary blocks already in your state file to support it. Always make it your business to know how many frequency bands you have available above the particular frequency that your most recent task was for. If you know you don't have many, just follow the next procedure.
It's a bit painful to do, but if you want 20 tasks, start by asking for just 1 (e.g. work out what cache size you need to get just (say) a single extra task). In the worst case there may be no tasks for your current frequency set and you will get a frequency jump to a new set along with either 48 or 52 LIGO files. Once you have them all, you could then ask for at least 10 more tasks without risking another frequency jump. The scheduler can now see all the 12 or 13 frequency bands you have and will supply tasks for these bands without doing a complete frequency jump. You will download 4 extra LIGO files for each 0.05Hz frequency shift to the next band but this is far less than what happens with a jump to a different frequency set. When you get those 10 tasks (and the extra LIGO data as described) you can keep asking for 10 (or even more) at a time and keep getting more tasks from the same frequency bands with perhaps the occasional server delete request and the 4 extra LIGO files.
The rule to apply is, "Don't ask for more tasks than what could be supplied for the frequency bands for which you already have blocks in your state file." You can estimate this by allowing say 2 tasks per frequency band. So if you've just downloaded a single task, you would have at least 12 current frequency bands in your state file and you could estimate that quite a lot of extra tasks could be asked for next time.
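The arithmetic behind that rule of thumb is simple enough to write down. This is a sketch only: the function names are made up, the "2 tasks per band" figure is the rough allowance suggested above rather than anything the scheduler guarantees, and reading 48 and 52 files as 4 files for each of 12 or 13 bands is an inference from the numbers quoted.

```python
TASKS_PER_BAND = 2   # rough allowance suggested above, not a guarantee
FILES_PER_BAND = 4   # LIGO files per 0.05 Hz band, as quoted above

def safe_request_size(bands_in_state_file: int) -> int:
    """Estimate how many tasks can be requested without a frequency jump."""
    return bands_in_state_file * TASKS_PER_BAND

def ligo_files_for_bands(bands: int) -> int:
    """Number of LIGO data files needed to cover the given number of bands."""
    return bands * FILES_PER_BAND

# After a jump to a new frequency set with 12 or 13 bands:
for bands in (12, 13):
    print(bands, "bands:", ligo_files_for_bands(bands), "files,",
          "about", safe_request_size(bands), "tasks safe to request")
```

With 12 bands this gives 48 files and an estimate of about 24 tasks; with 13 bands, 52 files and about 26 tasks. Shifting up by a single band costs only 4 more files, which is the point of staying within the bands you already hold.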
If you're keen enough you can also remove the tags from the state file to get more tasks from the bands that have been marked prematurely for deletion.
Yes, everything in that log snippet is compatible with the way I know that the scheduler works.
Cheers,
Gary.
Thanks Gary. All tasks
Thanks Gary. All tasks returned on the 23rd were the 8 tasks I was referring to in my earlier post. I still can't understand how you can tell E@H to work in a frequency band, i.e. 1430Hz to 1440Hz. If you are going to explain, you'll need to put it step by step so I can understand. As I said in my earlier post, I've detached from E@H. How long do you think the resends will be around for?
Thanks for taking time to explain.
RE: Thanks I've aborted all
Did you report the aborted tasks before detaching? If not, they'll just be resent to you when you re-attach to Einstein@home.
If you want to speed this along yourself, re-attach to Einstein, set NNT and accept resent lost tasks. Then abort and report them. Don't forget to abort the download of the data files.
Rinse and repeat until your Tasks (and Transfers) tab show no more Einstein tasks and your online task list has no more "in progress" tasks. ;-)
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
RE: RE: Thanks I've
Thanks, I've aborted all my tasks and detached. Thanks again for the help.
RE: RE: RE: Thanks I've
But did you report them before you detached?
Claggy
RE: RE: RE: RE: Thank
Yes, they show as "Aborted by user" in my account. Sorry for any misunderstanding.