Video card not seen?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117680372605
RAC: 35166989

Here are some thoughts on what you might see and what you need to do.

You have the following tasks on board:-

BRP6 - 21 GPU tasks, names start with PM0102...
BRP4G - 99 GPU tasks, names start with p2030...
FGRP4 - 8 CPU tasks, names start with LATeah0159E...
FGRPB1 - 40 CPU tasks, names start with LATeah0001L...
O1AS20-100T - 117 CPU tasks, names start with h1_0049.95... or h1_0050.00...

When you have the GPU at its proper speed, you shouldn't have any problem completing all GPU tasks within the deadline. I estimate the BRP6 tasks should take 1-2 days at most and the BRP4G tasks a further few days, say 7 days at most for both types combined. The deadline is a full 14 days.

For the CPU tasks it's quite different. The 117 O1AST tasks will probably take 8-10 hours each. Maybe you'll get lucky and they'll take less, but the risk is that they'll take even longer. With 6 cores running you may be able to get about 15 done per day, but it could well be less than that. Let's say you get 12 per day, which means you can do 72 in the remaining 6 days to deadline. You may well need to abort more than 40 of these to get a workable balance remaining.

These should be running first because of their shorter deadlines. To modify my previous comments (having thought some more), please make sure that all tasks whose names start with LATeah... (i.e. all those with 14 day deadlines) are currently suspended on the Tasks tab of BOINC Manager. This will make sure that we get as many GW tasks as possible done before the deadline. Make sure you have at least 12 h1_... tasks ready to start in addition to the 6 currently running. This should represent around a full day's work, and you should get in the habit of checking (say twice daily) and freeing new tasks from suspension as current ones complete.
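The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration using the figures quoted in this post (6 cores, 117 tasks, 6 days left); the function names are made up for the example and the 12 hour case is included as a pessimistic guess, not a measured value.

```python
import math

def completable_tasks(cores, hours_per_task, days_left):
    """Rough count of tasks that can finish before the deadline."""
    per_day = cores * 24 / hours_per_task  # tasks finished per day across all cores
    return math.floor(per_day * days_left)

def surplus(in_progress, cores, hours_per_task, days_left):
    """Tasks that cannot finish in time and are candidates for aborting."""
    return max(0, in_progress - completable_tasks(cores, hours_per_task, days_left))

# Figures from the post: 6 cores, 117 tasks on board, 6 days to deadline.
for hours in (8, 10, 12):
    done = completable_tasks(6, hours, 6)
    print(f"{hours} h/task: about {done} completable, {surplus(117, 6, hours, 6)} surplus")
```

At 12 hours per task this works out to 12 tasks per day, 72 before the deadline and 45 surplus, which is where the "more than 40" figure comes from.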

As you complete tasks, the estimates should return to normal. This will be driven mostly by GPU tasks completing quickly. When you have at least 5-10 completed h1_... tasks, make an estimate of the average crunch time and work out how many can complete before the deadline. You will most certainly need to abort quite a few.

Both types of LATeah... tasks should be quite a bit faster and the 48 in total should be able to be completed in maybe 2-3 days at worst. This is why you will be able to leave these suspended while the h1_... tasks are running. They have 14 day deadlines and there will be sufficient time to crunch them once the deadline for the h1_... tasks has passed.

Please have a think about the above and ask about anything that is unclear. Please report anything that seems to be not behaving as predicted. There could easily be discrepancies in the estimates given but these will be easily adjusted when the real values are known.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117680372605
RAC: 35166989

John,

With the problem of your work cache settings, I thought I'd better check other computers on your account. I found this particular one that has 335 tasks in progress, all of which are now within a day or two of deadline. The machine would appear to be not running at all since nothing has been returned for a very long time.

It's quite OK to change your mind and withdraw a machine from crunching. If you do that, would you please have the courtesy to abort and return the surplus at that time rather than just letting them 'wither on the vine', so to speak. The tasks are cluttering up the database and can't be reissued to someone else while you keep them 'locked up'.

You actually have over 100 FGRP4 tasks. These are for a run that has essentially finished. You are delaying the final cleanup by not releasing them for someone else to complete.

Cheers,
Gary.

John
Joined: 1 Nov 13
Posts: 59
Credit: 573081286
RAC: 0

It seems the 75% wasn't considered. And I updated 2 times the settings. The 8 CPU tasks are still running and range from 80 to 89%. I manually stopped 2 of them, to see if it speeds up.

The increments have grown, from 5-6 to 8-9-10, not for those tasks specifically but in general, in the 'simple view'.

John
Joined: 1 Nov 13
Posts: 59
Credit: 573081286
RAC: 0

OK, I didn't know about this, I didn't know it worked that way. Will take care from now on :)
I have a machine with Ubuntu and find it hard to install a driver for the video card, that's why I had to take it out. I admit I forgot about the deadlines.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117680372605
RAC: 35166989

Quote:
It seems the 75% wasn't considered. And I updated 2 times the settings. The 8 CPU tasks are still running and range from 80 to 89%. I manually stopped 2 of them, to see if it speeds up.


If you made the change on the website it should affect all computers on your account, not just the one where you need the change. In this case it would be better to leave the website settings where they were and make the change locally through BOINC Manager (Advanced view). Click the 'Tools' menu item and select 'Computing preferences' on that menu. A new window will open. There are different tabs in this new window. Click the one labeled 'processor usage'. You will see an item near the bottom '... use at most 100.00 % ...' Change that to 75.00 % and save the change. The effect will be immediate and will only apply to the current machine. Other machines will still use the website preferences.
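For anyone who prefers not to use the Manager window (a headless Linux box, say), the same local setting can be sketched as follows. As far as I know, BOINC keeps local preferences in a file called global_prefs_override.xml in its data directory, and the "use at most N % of the CPUs" item corresponds to the max_ncpus_pct element. The helper name below is invented for this example, and writing the result to the right directory is left to you.

```python
import xml.etree.ElementTree as ET

def build_override(max_cpu_pct):
    """Return the text of a minimal local preferences override.

    Assumption: only the CPU percentage is overridden; every other
    preference is left to the website settings.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "max_ncpus_pct").text = f"{max_cpu_pct:.6f}"
    return ET.tostring(root, encoding="unicode")

# 75% of 8 threads leaves 2 threads free to feed the GPU.
print(build_override(75.0))
```

After saving the output as global_prefs_override.xml, running `boinccmd --read_global_prefs_override` should make the client pick it up without a restart.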

Perhaps the reason why the change didn't seem to occur is that we have different understandings of what you mean by "I updated 2 times". I took it to mean that you changed the website settings and then you clicked 'update' in your local BOINC Manager. This is the usual way to ensure that the local BOINC client is forced to contact the server to become aware of the preference change. Perhaps you are referring to something on the website only and your client didn't change because it wasn't forced to contact the server. On the Projects tab of BOINC Manager advanced view, select the Einstein project and on the left, the 'update' button will become available to click. This is what will allow the client to become aware of the change without having to wait for a scheduler contact to otherwise occur.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117680372605
RAC: 35166989

Quote:
OK, I didn't know about this, I didn't know it worked that way. Will take care from now on :)
I have a machine with Ubuntu and find it hard to install a driver for the video card, that's why I had to take it out. I admit I forgot about the deadlines.


It's OK, don't be too worried :-).

However, the machine running Linux is not the one I was referring to. It only has a few tasks and they are not due until 07 March. If you're not going to crunch them, it would be kind of you to fire it up and just abort those tasks and return them before shutting it down again. When you fire it up, make sure you set 'No new tasks' in BOINC Manager before you abort the tasks. Otherwise it's likely to just ask for more :-). Once aborted, you can return the tasks immediately by clicking 'update' before you shut the machine down.

The machine with the real problem is running Windows and has an i7-3537U CPU. It hasn't contacted the servers since 18 Feb. Is it possible to start that machine and abort the 335 tasks that are going to expire soon anyway?

I just looked at your machine with the new AMD GPU. You seem to have aborted a whole bunch of the BRP4G tasks?? These are tasks that could easily have been crunched before the deadline when you get your GPU operating correctly. Please don't abort anything more until you get the 75% CPU cores setting working correctly which should allow your GPU to crunch at the proper speed. You now have 8 completed O1AS20-100T CPU tasks (names start with h1_...) and they have taken over 24 hours each in elapsed time - much slower than hoped for. This should improve a bit when you actually have only 6 CPU tasks running but you will probably need to abort the bulk of those that remain. Don't abort any yet. Wait until we see the crunch time when the GPU is running correctly.

Cheers,
Gary.

John
Joined: 1 Nov 13
Posts: 59
Credit: 573081286
RAC: 0

The change took place only with the 'Update' from the local, detailed view, not from the online option. Did it now and it worked just fine.
You were right about '2 times' :)

John
Joined: 1 Nov 13
Posts: 59
Credit: 573081286
RAC: 0

Thanks for specifying the machine, I had trouble finding it by that number.
I uninstalled BOINC Manager completely from that machine. Should I install it back and abort all tasks?

OK, stopped aborting new tasks on the AMD machine. I guess the next 24-48h will show some change in speed.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117680372605
RAC: 35166989

Quote:
Thanks for specifying the machine, I had trouble finding it by that number.


OK, sorry. I gave you a link to its complete list of tasks. Do you know how to go to your account page on the website and click the 'view computers' link to see a list of all your machines? That's a useful page because for each of your assigned host IDs, you can see the CPU, GPU, OS, etc and there are links to a details page and a tasks list for each one. You should play with those and become familiar with what you can see.

Quote:
I uninstalled BOINC Manager completely from that machine. Should I install it back and abort all tasks?


No need to worry anymore :-). All the tasks are past deadline and the scheduler is probably busy sending them all out again :-). For future reference, if you are going to withdraw a machine from crunching, please use the controls in BOINC Manager to set NNT (No New Tasks), to mark the tasks on board as 'Aborted' AND to tell the BOINC client to report all the aborted tasks to the server by clicking the update button. These should always be the last actions you take before switching off the lights and locking the doors, so to speak :-).
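The same three steps can also be done from the command line with boinccmd. Here is a minimal sketch that only builds and prints the commands in the right order; it doesn't run them. The project URL and task names below are placeholders I've made up for illustration, so use the URL your client reports (`boinccmd --get_project_status`) and the real names from `boinccmd --get_tasks`.

```python
import shlex

def withdraw_commands(project_url, task_names):
    """Build the boinccmd calls for retiring a machine, in order:
    No New Tasks first, then abort each task, then report to the server."""
    cmds = [["boinccmd", "--project", project_url, "nomorework"]]
    cmds += [["boinccmd", "--task", project_url, name, "abort"] for name in task_names]
    cmds.append(["boinccmd", "--project", project_url, "update"])
    return cmds

# Placeholder URL and task names, not taken from any real host.
url = "https://einsteinathome.org/"
for cmd in withdraw_commands(url, ["h1_example_task_0", "h1_example_task_1"]):
    print(shlex.join(cmd))
```

The order matters: if the update happens before No New Tasks is set, the client is likely to just ask for more work.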

Quote:
OK, stopped aborting new tasks on the AMD machine. I guess the next 24-48h will show some change in speed.


You now have proper GPU crunch times for both BRP6 and BRP4G. They are both going very fast. You will need to abort more O1AST CPU tasks but PLEASE make sure you know what you are about to abort before confirming the action. You will soon have a shortage of GPU work, not a surplus.

There is possibly some small improvement in the crunch times for GW tasks (O1AST). Several recently completed tasks (at the time I looked) took around 80K secs. Before that there were a number that took around 90-95K secs. I think it's safe to assume that you will be able to complete 6-7 tasks per day (6 cores at around 80K secs per task). The deadline is currently about 4.7 days away, so that means about 30 more tasks. You currently have 77 in progress.

At this stage you need to confirm that newly crunched tasks are taking around 80-82K secs. If so, you could consider aborting say 40 more of the GW tasks. In BOINC Manager advanced view -> tasks tab, you will clearly see the name of these starting with "h1_...". In the Application column you will see the first word "Gravitational". Abort 40 of these with the shortest deadline once you are sure of the ongoing crunch time. You may need to abort a couple more when you get closer to the deadline. If a task can well and truly get started before the deadline, I'd let it run since there's always a delay in generating a replacement task and you may be able to return yours before that happens.
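If you'd rather work from a list than pick tasks by eye, the selection described above can be sketched like this. The task names and dates are invented for illustration; in practice they would come from the Tasks tab or from `boinccmd --get_tasks`.

```python
from datetime import date

def pick_for_abort(tasks, prefix, count):
    """Return the names of up to `count` tasks whose names start with
    `prefix`, taking those with the shortest deadline first."""
    candidates = [t for t in tasks if t[0].startswith(prefix)]
    candidates.sort(key=lambda t: t[1])  # earliest deadline first
    return [name for name, _ in candidates[:count]]

# Hypothetical (name, deadline) pairs, not real task names.
tasks = [
    ("h1_example_a", date(2016, 3, 9)),
    ("LATeah_example_b", date(2016, 3, 14)),
    ("h1_example_c", date(2016, 3, 7)),
    ("h1_example_d", date(2016, 3, 8)),
]
print(pick_for_abort(tasks, "h1_", 2))
```

Filtering on the name prefix first is the safeguard here, so that no LATeah... or GPU task ends up in the abort list by mistake.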

You have been burning through the BRP4G and BRP6 tasks at such a rate that you only have a few BRP6 left now. All the BRP4G are done. It sure would be nice to have all those aborted ones right now :-). If you have 'suspended' CPU tasks (which you will have if you followed earlier advice) they will prevent BOINC from downloading more GPU tasks of any type. You should make sure you have a low cache setting (1 day max for the moment) and then carefully 'resume' any suspended tasks. BOINC will then be able to ask for more GPU work but not CPU work as you already have far more than 1 day's worth of that.

Please read the above carefully and ask if anything is not clear.

Cheers,
Gary.

John
Joined: 1 Nov 13
Posts: 59
Credit: 573081286
RAC: 0

Did some suspending, some aborting and now it looks like:
1. the CPU tasks move up by 4-5 increments, like before; there are 6 running, with another 4 waiting to run (2 are LAT... and 2 h1...);
2. for the CPU+GPU tasks, it increases by about 0.04 or 0.05. So a big difference compared to the previous steps. One of them did about 6% in 60 seconds. Pretty fast I guess.
3. I suspended some CPU tasks, deadline 9.03, to let it 'breathe'. I guess it has to have a balance between cpu/gpu tasks and the cpu/gpu speed.
