Gravity Wave search on GPUs: do we have a problem?

mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3412216540
RAC: 3441823

I got some tasks that only had 3 minutes until their deadline according to my task list here. It takes longer than that to complete the last 1%.

Original estimates are far too low, at just over a minute. On a 1080Ti, run time went from 5 min to 35-37 min.

On a 980Ti, it went from 7:33 (min:sec) to 21:03 (min:sec), which seems to be the lowest increase from GRPB.

Another PC went from 15:20 (min:sec) to 1:25:34 (hr:min:sec).

A Radeon VII in Windows uses a full CPU thread the entire time, whereas GRPB used about 1/3 of a CPU thread. The core clock is only about 1/3 of what it could be, with a low power draw of 64W, as the task is CPU bound.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250579411
RAC: 34635

Thanks. I reduced the "speedup" (the factor by which the GPU version is assumed to be faster than the CPU version) to 1/3 for GW. This should be roughly in line with what Richard reported.

BM

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2959099505
RAC: 707721

My GPUs are mostly hammering WCG for their current server 'stress test', but I'm letting Einstein run overnight in case something dries up while I'm asleep. I'll knock off the app_config.xml temporarily and do a work fetch this evening, so I can check what the server response looks like after that change.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47024602642
RAC: 65053943

Ian&Steve C. wrote:

Richard Haselgrove wrote:

Einstein GPU app - still O2MDF in this case - is shown as 'base priority: below normal', whereas pure CPU apps are 'base priority: low'. So the logic seems to be "if any one (or more than one) of these cases apply, increase the priority".

being that these are GPU tasks, it seems that case 1 (uses coprocs) will always be satisfied regardless of the others, no?

this may have been missed on the last page.

with my app_config set at 1CPU-1GPU, I can confirm that the GPU tasks are running at the same priority as the CPU tasks. (priority = 20+0)

is this what is desired or not?

_________________________________________________________________________

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2959099505
RAC: 707721

Ian&Steve C. wrote:

Ian&Steve C. wrote:

Richard Haselgrove wrote:

Einstein GPU app - still O2MDF in this case - is shown as 'base priority: below normal', whereas pure CPU apps are 'base priority: low'. So the logic seems to be "if any one (or more than one) of these cases apply, increase the priority".

being that these are GPU tasks, it seems that case 1 (uses coprocs) will always be satisfied regardless of the others, no?

this may have been missed on the last page.

with my app_config set at 1CPU-1GPU, I can confirm that the GPU tasks are running at the same priority as the CPU tasks. (priority = 20+0)

is this what is desired or not?

Sorry, I did indeed miss that - busy yesterday.

I would expect the priorities to come from this list:

#define PROCESS_PRIORITY_UNSPECIFIED    0
#define PROCESS_PRIORITY_LOWEST     1
    // win: IDLE; unix: 19
#define PROCESS_PRIORITY_LOW        2
    // win: BELOW_NORMAL; unix: 10
#define PROCESS_PRIORITY_NORMAL     3
    // win: NORMAL; unix: 0
#define PROCESS_PRIORITY_HIGH       4
    // win: ABOVE_NORMAL; unix: -10
#define PROCESS_PRIORITY_HIGHEST    5
    // win: HIGH; unix: -16

Using that convention, I would expect:

CPU tasks - my Windows priority 'low' - unix 19
GPU tasks - my Windows priority 'below normal' - unix 10

I don't see where 20 would come from?
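
As a minimal sketch of the convention in the list above, the mapping can be written as a lookup in C. The constants and nice values are copied from the quoted comments; the function name `unix_nice_for_priority` is hypothetical and not taken from the BOINC source, and the fallback for an unspecified or unknown value is an assumption.

```c
#define PROCESS_PRIORITY_UNSPECIFIED    0
#define PROCESS_PRIORITY_LOWEST         1
#define PROCESS_PRIORITY_LOW            2
#define PROCESS_PRIORITY_NORMAL         3
#define PROCESS_PRIORITY_HIGH           4
#define PROCESS_PRIORITY_HIGHEST        5

/* Hypothetical helper (not a BOINC function): returns the unix nice value
   that the comments in the list above associate with each constant. */
int unix_nice_for_priority(int priority) {
    switch (priority) {
    case PROCESS_PRIORITY_LOWEST:  return 19;
    case PROCESS_PRIORITY_LOW:     return 10;
    case PROCESS_PRIORITY_NORMAL:  return 0;
    case PROCESS_PRIORITY_HIGH:    return -10;
    case PROCESS_PRIORITY_HIGHEST: return -16;
    default:                       return 0;  /* assumption: leave nice unchanged */
    }
}
```

Under this convention, `unix_nice_for_priority(PROCESS_PRIORITY_LOWEST)` gives 19 and `unix_nice_for_priority(PROCESS_PRIORITY_LOW)` gives 10, which are the two nice values expected above for CPU and GPU tasks.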

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47024602642
RAC: 65053943

i think the values in the comments no longer represent what a modern linux OS applies.

I'm just reporting the priority and nice values that I can actually see in htop. they are all PRI=20, NI=0.

but this goes back to my earlier question/comment about the values being non-intuitive. lower numbers mean higher priority? "high" priority = less CPU time allocated? it's all very confusing. at the end of the day, we want the tasks to have the resources they need to run at full speed without being hindered. so which values do we want?

at the same time, from my quick research, it seems that process priority doesn't make any difference unless the CPU is pegged at 100% and overcommitted (which I personally never do, I always leave 1-2 threads free). otherwise a process just uses the free resources.

_________________________________________________________________________

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2959099505
RAC: 707721

http://manpages.ubuntu.com/manpages/trusty/man1/nice.1.html doesn't say anything different, unless you run a custom shell. In which case, you're on your own.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47024602642
RAC: 65053943

Richard Haselgrove wrote:

http://manpages.ubuntu.com/manpages/trusty/man1/nice.1.html doesn't say anything different, unless you run a custom shell. In which case, you're on your own.

I don't run a custom shell. but i think you misunderstood my post.

overall process priority is a *combination* of the PR and NI (nice) values. NI itself maxes out at 19, but overall PR goes higher.

see here: https://askubuntu.com/questions/656771/process-niceness-vs-priority

htop reports the tasks are running at PR = 20, with NI (nice) = 0. this is identical between CPU and GPU tasks. my question was whether it is expected for CPU and GPU tasks to have the same nice and priority levels. I got the impression from earlier posts that maybe they would be different.

but again, do these values even matter if you're not maxing out the CPU to 100% utilization? I think the answer is probably no.
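
To make the PR/NI relationship concrete, here is a small C sketch. For ordinary (non-real-time) Linux processes, the priority column in top/htop is 20 plus the nice value, so NI = 0 shows up as PR = 20. The function names `displayed_priority` and `current_nice` are made up for illustration; `getpriority()` is the standard POSIX call, and the out-of-range return value 100 is just a sentinel chosen for this sketch.

```c
#include <errno.h>
#include <sys/resource.h>

/* Priority as shown by top/htop for an ordinary (non-real-time) process. */
int displayed_priority(int nice_value) {
    return 20 + nice_value;
}

/* Returns the nice value of the calling process, or 100 (a sentinel chosen
   for this sketch, outside the valid -20..19 range) on error. */
int current_nice(void) {
    int ni;
    errno = 0;  /* -1 is a legitimate nice value, so errno must be checked */
    ni = getpriority(PRIO_PROCESS, 0);
    if (ni == -1 && errno != 0) {
        return 100;
    }
    return ni;
}
```

Calling `displayed_priority(current_nice())` from inside a task would reproduce the number htop shows for it, assuming the task runs under the default scheduling policy.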

_________________________________________________________________________

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47024602642
RAC: 65053943

found this tidbit too: http://osr507doc.xinuos.com/en/PERFORM/calc_proc_priorities.html

Quote:
The default nice value of a user's process is 20. An ordinary user can increase this value to 39 and in so doing reduce a process' chance of running on the CPU. Processes with low nice values will on average get more CPU time because of the effect the values have on the scheduling algorithm.

 

so it seems a lower value actually grants more priority (logically). this is what i meant by the numbers being counterintuitive. you'd think that a bigger number gets more CPU time, but it's the inverse if I'm reading the above correctly.

_________________________________________________________________________

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47024602642
RAC: 65053943

another addition. I just remembered that some time ago, I set my systems to use higher priority in the cc_config.xml file. So that explains why my Nice value was set to 0.

I removed those options.

  • overall priority gets set to 39 for CPU tasks; 20+19(NI)
  • overall priority gets set to 30 for GPU tasks; 20+10(NI)

 

changing the CPU use estimate in the app_config.xml has no effect on these values. GPU tasks will run at 30 (nice 10) whether their CPU use estimate is 0.9 or 1.0. this supports my comment that GPU tasks get increased priority just for being GPU tasks.
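
The observation above can be summarised in a few lines of C. This is only a model of what htop showed on my systems, not the BOINC client's actual logic: the names `observed_nice` and `htop_priority` are hypothetical, and the CPU use estimate is passed in only to show that it has no influence on the result.

```c
#include <stdbool.h>

/* Hypothetical model of the observed behaviour (not BOINC code):
   the nice value depends only on whether the task uses a coprocessor. */
int observed_nice(bool uses_coprocs, double cpu_use_estimate) {
    (void)cpu_use_estimate;  /* deliberately unused: it had no effect */
    return uses_coprocs ? 10 : 19;
}

/* Overall priority as listed by htop: 20 plus the nice value. */
int htop_priority(int nice_value) {
    return 20 + nice_value;
}
```

With this model, a GPU task gives 30 for an estimate of either 0.9 or 1.0, and a CPU task gives 39, which matches the two values listed above.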

_________________________________________________________________________
