Bernd, Bruce, Gary, Mike, a question for you!

Mac-Nic

Joined: 25 Feb 05

Posts: 71

Credit: 547461

RAC: 0

RE: Any other way with

27 Jun 2006 17:41:22 UTC

Message 40777 in response to message 40772

Quote:

Any other way with official 5.4.9?

Try Process Explorer from Sysinternals to set the affinity.
WWW.sysinternals.com

Best regards

"The FUTURE is only a PARTICLE away from the PRESENT and the PAST."

J Langley

Joined: 30 Dec 05

Posts: 50

Credit: 58338

RAC: 0

RE: I really don't

27 Jun 2006 18:59:41 UTC

Message 40778 in response to message 40767

Quote:

I really don't understand why people get upset with not seeing huge numbers of results flash past.

Well, if you don't run your computer 24x7 and/or you run multiple projects, it's harder for slower computers to meet the deadlines for the long S5 WUs.

I understand that the load on the servers would rise if the WUs were shorter, and I see the need to avoid the 'curse of 32' again for fast dedicated crunchers, but I'd be happier if these long WUs were half their current size.

robert stone

Joined: 19 Mar 05

Posts: 4

Credit: 16971

RAC: 0

I see a problem with resource

27 Jun 2006 21:11:12 UTC

Message 40779

I see a problem with resource sharing between projects. I have several Rosetta WUs that will push the Einstein WU (I only have 1, with 35 hours projected to finish) past the deadline unless I manually suspend Einstein, let Rosetta finish, and hope there is time for Einstein to finish, or flip it around and let Einstein finish first.

I suppose I can go tweak all my preferences on different projects, again.

These large WUs are going to create havoc for people running multiple projects.

Just my opinion

"Don't take life too seriously ..... after all, none of us will make it out alive."

Joachim Schmidt

Joined: 19 Feb 05

Posts: 35

Credit: 391050

RAC: 0

then just set your "Connect

27 Jun 2006 21:21:20 UTC

Message 40780

Then just set your "Connect to network about every" preference to 0.1 days, so that you have a small cache and won't have problems with deadlines.

greets

Odysseus

Joined: 17 Dec 05

Posts: 372

Credit: 20945996

RAC: 3897

RE: I see a problem with

27 Jun 2006 21:30:37 UTC

Message 40781 in response to message 40779

Quote:

I see a problem with resource sharing between projects. I have several Rosetta WUs that will push the Einstein WU (I only have 1, with 35 hours projected to finish) past the deadline unless I manually suspend Einstein, let Rosetta finish, and hope there is time for Einstein to finish, or flip it around and let Einstein finish first.

Your BOINC Manager should be able to deal with all that, without intervention: when it notices that a WU is at risk of missing the deadline, given the computer's up-time and the project's resource share, it will preëmpt (or refuse) other work to make sure the deadline is met. This may put it into ‘panic mode’ for a while, but once it's become accustomed to the larger WU sizes it’ll avoid overfilling its cache.
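As a rough illustration of the kind of check described above, here is a minimal sketch in Python. It is not BOINC's actual scheduler code: the function name `at_risk`, the simple formula, and every number except the 35 hours from the quoted post are assumptions made up for the example.

```python
def at_risk(remaining_cpu_hours, hours_to_deadline, on_fraction, share_fraction):
    """Return True if the task would miss its deadline under normal sharing.

    on_fraction:    fraction of wall-clock time the computer is up and crunching
    share_fraction: this project's resource share as a fraction of the total
    """
    # CPU hours the task can expect to receive before the deadline.
    available = hours_to_deadline * on_fraction * share_fraction
    return remaining_cpu_hours > available


# 35 hours of work left, a hypothetical deadline 14 days away, 50/50 share.
deadline_hours = 14 * 24
print(at_risk(35, deadline_hours, on_fraction=0.25, share_fraction=0.5))
print(at_risk(35, deadline_hours, on_fraction=0.20, share_fraction=0.5))
```

With the machine crunching 25% of the time the task gets about 42 CPU hours and is safe; at 20% it gets about 33.6 and is flagged, which is the point at which the client would start favouring it over other work.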

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5883

Credit: 119069164639

RAC: 24252443

RE: RE: ...I manage to

27 Jun 2006 22:36:41 UTC

Message 40782 in response to message 40772

Quote:

Quote:
...I manage to keep 1 project on each virtual CPU most of the time....

Gary, how do you obtain this configuration? Which Manager do you use? AFAIK only TruXoft Manager can set the cpu affinity... Is this your case? Don't tell me that you manually suspend every n-1 wu in the cache... Any other way with official 5.4.9?

I use the official manager. What I do (with BOINC stopped) is edit client_state.xml to set both long term debts (LTD) to zero, the short term debt (STD) for EAH to +20 (seconds) and the STD for Seti to -20. STD controls which project will run if a decision has to be made. The STDs have to balance to zero.

When BOINC is restarted, both threads will be EAH (+20) but after a very short time (probably 10 seconds per thread) the internally calculated values for STD will have reversed sign. At this point, if I create a reason for the reschedule_cpus routine to be called, one of the EAH threads will be preempted and a Seti thread will take over. Once this has been established, the machine seems quite happy to continue through multiple results for days at a time without losing the desired one-project-per-CPU arrangement. I also have the preference "Leave tasks in memory when preempted" set, and this might be important to getting it to work. I haven't tried without this setting.

The key seems to be to get request_reschedule_cpus called at just the right time. I achieve this by picking the newest result in my cache and suspending and reenabling just that one result. This forces the rescheduling request without interfering with any of the actually running tasks. Most of the time the desired goal is achieved immediately but sometimes if I mistime it, I can catch it "on the way back" as the STDs are now heading in the other direction.
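For anyone who would rather script the debt edit than do it by hand, here is a minimal sketch in Python. Treat it as an illustration only: it assumes client_state.xml parses as ordinary XML and that each `<project>` block carries `<master_url>`, `<short_term_debt>` and `<long_term_debt>` elements, so check your own file first. The helper name `set_debts` and the example URLs are made up. As above, BOINC must be stopped, and it would be wise to keep a backup copy of the file.

```python
import xml.etree.ElementTree as ET


def set_debts(xml_text, std_by_url):
    """For each listed project, zero the LTD and set the STD (in seconds)."""
    # The short term debts have to balance to zero across the projects.
    if abs(sum(std_by_url.values())) > 1e-9:
        raise ValueError("short term debts must balance to zero")
    root = ET.fromstring(xml_text)
    for project in root.iter("project"):
        url = project.findtext("master_url", "").strip()
        if url not in std_by_url:
            continue  # leave projects we were not asked about untouched
        for tag, value in (("long_term_debt", 0.0),
                           ("short_term_debt", std_by_url[url])):
            node = project.find(tag)
            if node is None:  # assumed tag missing: create it
                node = ET.SubElement(project, tag)
            node.text = f"{value:.6f}"
    return ET.tostring(root, encoding="unicode")


# Tiny stand-in for a real client_state.xml; the URLs are placeholders.
sample = """<client_state>
<project><master_url>http://einstein.example/</master_url><short_term_debt>3.5</short_term_debt><long_term_debt>-120.0</long_term_debt></project>
<project><master_url>http://seti.example/</master_url><short_term_debt>-3.5</short_term_debt><long_term_debt>120.0</long_term_debt></project>
</client_state>"""

edited = set_debts(sample, {"http://einstein.example/": 20.0,
                            "http://seti.example/": -20.0})
print(edited)
```

This only covers the file edit. The timing trick of suspending and re-enabling the newest result still has to be done by hand in the manager after BOINC is restarted.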

Someone mentioned using Process Explorer. The problem with this (I think) is that you would have to keep reapplying the affinity every time a process exited and was respawned. With my way the affinity seems to survive quite happily through many consecutive results. The machine is right at my desk so I just check it once in the morning and once in the evening (if I even remember). At the moment (I've just checked it now) there is still one project per CPU and this has been going on for several days without interference from me. Of course it would be a bit tedious if the machine isn't running 24/7 :).

Quote:

Thanks for all the other answers!

You're welcome!!

Cheers,
Gary.

Nuadormrac

Joined: 9 Feb 05

Posts: 76

Credit: 229259947

RAC: 0

RE: RE: Luca, if you set

28 Jun 2006 2:11:59 UTC

Message 40783 in response to message 40774

Quote:

Quote:
Luca,
if you set your resource share to the same value for both projects (50/50) they will both run on one CPU each.
I had a server with 3 CPUs and had E@H at 66% and another project at 33%. So I had 2 WUs for E@H and 1 WU of the other project running at the same time.
Udo

I'm not so lucky, Udo :-(
I've got 3 long Einstein WUs and 14 Rosetta WUs. The resource share is 100 for both. The short and long term debts are the same for both projects... they are even... BUT... Rosy is running both tasks and Einstein is preempted...

The only thing that worked for me in the past was TruXoft tx36, but now I'm using official 5.4.9... and nothing to do!!

Thanks!

As a temporary fix, you could always set Rosetta to no new work and decrease its time setting in the preferences... That way its WUs will clear out sooner, and the machine will get through these long E@H units. Afterwards, once you set it back to normal, BOINC should have learned to predict the time to completion better, and not pull quite so much E@H work on the next connect...

Oh, and on the last Rosetta WU, reset the time setting back to your desired value, so it estimates that value for the next download :D

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4343

Credit: 252801358

RAC: 41215

Back to the original

28 Jun 2006 5:35:48 UTC

Message 40784

Back to the original question:

Roughly speaking, anything you see in your cache (number of tasks, movement etc.) would be the same in the database, multiplied by the number of users (or actually CPUs). If you cut the current WUs in half, you have twice the number of results the database needs to keep track of. The database size is still our limiting factor; we're currently running a server with 24GB main memory and it's already tight.
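The scaling argument above can be put into a few lines of Python. This is only back-of-the-envelope arithmetic: the host count and the results per host are made-up figures, not project statistics, and `results_in_database` is a hypothetical helper name.

```python
def results_in_database(hosts, results_per_host, split_factor=1):
    """Rough count of results the database has to keep track of.

    split_factor: how many shorter WUs each current WU is cut into.
    """
    return hosts * results_per_host * split_factor


hosts = 100_000          # hypothetical number of CPUs attached
results_per_host = 4     # hypothetical cache size per CPU

print(results_in_database(hosts, results_per_host))                   # current WU size
print(results_in_database(hosts, results_per_host, split_factor=2))   # WUs cut in half
```

Whatever the real figures are, halving the WU length doubles the second number relative to the first, and that is the number that has to fit into the server's memory.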

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6592

Credit: 331812263

RAC: 314689

RE: we're currently running

28 Jun 2006 5:50:47 UTC

Message 40785 in response to message 40784

Quote:

we're currently running a server with 24GB main memory

Pedal to the metal! Wowie! I want one ...... :-)
Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Ivan Minkov

Joined: 1 Mar 05

Posts: 3

Credit: 228829

RAC: 0

RE: Which would have the

28 Jun 2006 8:41:58 UTC

Message 40786 in response to message 40767

Quote:

Which would have the most value to you: (a) a 1 kg gold bar, or (b) 100 x 10 g gold mini-bars?

Picture the scenario:

you start counting your 1000 bucks,

1, 2, 3 ..... 55, 56, *telephone rings*, blah, blah, erm where was I, 55 !, 55, 56, 57, .... 88, .... *insert some other crap here*, 85, 86 ... DAMN I got that wrong, start over !

Which would you prefer: getting a BSOD on a 1-hour WU, or getting it on an 18-hour WU ...
