You have a quad core i5. Are you sure you really need to restrict it to just one CPU task? If you could run two you could probably clear the backlog before deadline. What do you have for work cache settings?
Good point. My work cache is set to store at least 5 days of work. I did that so I wouldn't run out of work for my 750 Ti, but it had been a long time since I looked at my Tasks pages. I was shocked to see that with respect to FGRP4-SSE2 tasks I had 3 marked Timed out - no response (earliest 23 January), and 16 marked Not started by deadline - canceled (earliest 26 January). That is unacceptable, so I am going to run 2 concurrently.
BTW I did get one Gamma-ray pulsar binary search #1 v1.00 task since my post yesterday. But until FGRPB1 replaces FGRP4-SSE2 I will change my app_config file so I don't download any more FGRPB1 tasks. That's too bad, since the much more numerous FGRP4-SSE2 tasks have an average run time of 24,793.57s and the single FGRPB1 task had a run time of 22,177.15s.
What is the difference between Timed out - no response and Not started by deadline - canceled, anyway?
"Remember, nothing that's good works by itself, just to please you. You have to make the damn thing work." Thomas A. Edison
Quote:
What is the difference between Timed out - no response and Not started by deadline - canceled, anyway?
The first (Timed out - no response) means the server had marked the task as timed out, but when your client next contacted the server the task was actually crunching, so the server couldn't cancel it. If a task is already past its deadline but hasn't even started to crunch, the server can cancel it, if it is configured to do so.
In the past, I didn't think the EAH servers canceled non-started deadline misses. They obviously do now.
With the extra information given (cache setting 5 days), I'm now of the opinion that BOINC is not smart enough to handle the app_config.xml restriction properly. If you restrict to 1 CPU task at a time, BOINC still fetches enough work to last 5 days for all 4 cores. So in effect you will have a 20-day work cache for a max_concurrent of 1. If you lowered your cache to 3 days, you would have 12 days of CPU work for a single core, or 6 days if you changed max_concurrent to 2.
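As a quick sketch of that arithmetic (just illustrating the reasoning above, not anything BOINC computes itself):

```python
# If BOINC fetches cache_days worth of work for every core, but only
# max_concurrent CPU tasks ever run, the queue stretches out:
def effective_cache_days(cache_days, cores, max_concurrent):
    return cache_days * cores / max_concurrent

print(effective_cache_days(5, 4, 1))  # 20.0 days of work for one running core
print(effective_cache_days(3, 4, 1))  # 12.0
print(effective_cache_days(3, 4, 2))  # 6.0
```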
I have a 750Ti that is running 3x on the GPU with CPU tasks on all CPU cores (Pentium dual core). It seems to handle that quite OK. There's no real improvement in GPU crunch time if I run only 1 CPU task.
Quote:
I have a 750Ti that is running 3x on the GPU with CPU tasks on all CPU cores (Pentium dual core). It seems to handle that quite OK. There's no real improvement in GPU crunch time if I run only 1 CPU task.
That's interesting that you see no real improvement in GPU crunch time whether you run 1 or 2 CPU tasks, especially with a dual core processor. Is the computer in question dedicated to E@H, or do you use it for browsing also? And are you saying you are running 3 concurrent tasks on your GPU? If so, I would expect to see longer crunch times than if you only ran 2 GPU tasks. I would love to see your app_config.xml.
"Remember, nothing that's good works by itself, just to please you. You have to make the damn thing work." Thomas A. Edison
Quote:
And are you saying you are running 3 concurrent tasks on your GPU?
Yes.
Quote:
If so, I would expect to see longer crunch times than if you only ran 2 GPU tasks.
Yes, longer crunch time for the whole batch of three, but shorter crunch time per task if you work it out. Otherwise I wouldn't be running 3x.
I started running at 2x and the tasks were taking around 13.4 ksecs. That's a per task time of 6.7 ksecs. I'm now running 3x and tasks take around 19 ksecs or 6.4 ksecs per task. Sure, that's a pretty marginal improvement but it is an improvement. I was going to put it back to 2x but I thought I'd let it run for a while to see if any issues developed. It's been a while since I last looked at it and thinking about what you were writing forced me to take a look so I was interested to see it chugging along with no apparent issues :-).
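Worked out from the batch times quoted above (a sketch, using 86.4 ksecs in a day):

```python
def per_task_ks(batch_ks, n):
    # n tasks finish together every batch_ks kiloseconds
    return batch_ks / n

def tasks_per_day(batch_ks, n):
    return n * 86.4 / batch_ks  # 86.4 ks = 24 hours

print(per_task_ks(13.4, 2), round(tasks_per_day(13.4, 2), 1))   # 2x: 6.7 ks/task, ~12.9 tasks/day
print(round(per_task_ks(19.0, 3), 2), round(tasks_per_day(19.0, 3), 1))  # 3x: ~6.33 ks/task, ~13.6 tasks/day
```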
The CPU is a G3258 running at about 3.7GHz and the two cores are producing FGRPB1 results in about 14.2 ksecs.
Quote:
I would love to see your app_config.xml.
It's precisely what you would get from a GPU utilization factor of 0.33. I am using an app_config.xml for the convenience of instant changes but the only parameters that are set are cpu_usage of 0.2 and gpu_usage of 0.33.
There is nothing in the file relating to a restriction on CPU tasks.
I sincerely apologize for not responding earlier. I actually composed my initial response long ago, but I wasn't happy with the result. I hope this one makes sense.
Quote:
Yes, longer crunch times for all the batch of three but shorter crunch times per task if you work it out. otherwise I wouldn't be running 3x. I started running at 2x and the tasks were taking around 13.4 ksecs. That's a per task time of 6.7 ksecs. I'm now running 3x and tasks take around 19 ksecs or 6.4 ksecs per task. Sure, that's a pretty marginal improvement but it is an improvement. I was going to put it back to 2x but I thought I'd let it run for a while to see if any issues developed. It's been a while since I last looked at it and thinking about what you were writing forced me to take a look so I was interested to see it chugging along with no apparent issues :-).
Very interesting result regarding 3x GPU tasks, and I'll look into it later. My BRP6 tasks running at 2x take ~12.756 ksecs, for a per task time of 6.4 ksecs.
Quote:
The CPU is a G3258 running at about 3.7GHz and the two cores are producing FGRPB1 results in about 14.2 ksecs.
Quote:
I would love to see your app_config.xml.
It's precisely what you would get from a GPU utilization factor of 0.33. I am using an app_config.xml for the convenience of instant changes but the only parameters that are set are cpu_usage of 0.2 and gpu_usage of 0.33.
There is nothing in the file relating to a restriction on CPU tasks.
My CPU is an i5-2310 running at about 2.9 GHz, so my results may vary. And do you mean that there is no max_concurrent line in your app_config.xml? If I don't use that, BOINC will run 4-5 CPU tasks concurrently, which drives the CPU usage to 100%. That has not always been the case. The only thing I remember that has changed is that until recently BRP6 was the only E@H task I ran, while my CPU was used for other projects.
Does your app_config.xml specify cpu_usage of 0.2 because you use the stock app? I'm using BRP6-Beta-cuda55, and that part of my app_config was given to me by Stan Pope, another member of The Planetary Society, who also generously gave me the 750 Ti.
I'm no longer running FGRPB1, and I've dropped FGRP4 in favor of O1AS20-100T. Along the same line as your calculation for your GPU tasks, I recently calculated the seconds per Cobblestone (s/C) for FGRP4 and O1AS20-100T tasks. I had been running one of each simultaneously since GW tasks became available again. On my machine GW tasks take 33.517 s/C, while FGRP4 tasks take 34.121 s/C. Again, it's a marginal improvement, but an improvement. So now I run 2x O1AS20-100T tasks.
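For what it's worth, the relative difference between those two s/C figures is straight arithmetic on the numbers above (lower s/C means more credit per unit time):

```python
gw = 33.517     # s/C for O1AS20-100T on this host
fgrp4 = 34.121  # s/C for FGRP4

advantage = fgrp4 / gw - 1  # fractional credit-rate advantage of GW tasks
print(f"{advantage:.1%}")   # ~1.8%
```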
"Remember, nothing that's good works by itself, just to please you. You have to make the damn thing work." Thomas A. Edison
Quote:
Very interesting result regarding 3x GPU tasks, and I'll look into it later. My BRP6 tasks running at 2x take ~12.756 ksecs, for a per task time of 6.4 ksecs.
That sounds about right. Very little difference really between 2x and 3x so no great need to run 3x.
Since you have responded again, I'll try to give some more information about my G3258 Pentium dual core with the 750 Ti GPU. I haven't looked at it since we last spoke. It has a current RAC of nearly 72K which is somewhat larger than what I would expect. It's not doing GW work, only FGRPB1 on both cores and BRP6 on the GPU @ 3x. FGRPB1 tasks still take around 14.2ksecs at the moment. Using your metric, that's around 20.5 secs/credit. BRP6 still takes just over 19ksecs for 3 tasks, so 6.4ksecs per task is a reasonable estimate - same as last time I looked. So, in theory, the host should have a RAC of around 60K from GPU work and 8.4K from CPU work. Not too shabby for a budget dual core. The fact that it's currently a little higher is probably just normal RAC fluctuation. Maybe a bunch of pendings validated recently or something like that.
EDIT: I just worked out why the RAC is a little on the high side. The above host is one of the 'lucky' ones that got a 'credit spike' as discussed in this thread. Interesting that it still hasn't fully returned to normal even more than a month after the event.
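The back-of-envelope RAC estimate works out like this. The per-task credit values here are my assumptions, not stated in the thread: ~4,400 credits for a BRP6 task, and ~693 for an FGRPB1 task (inferred from the quoted 20.5 secs/credit figure):

```python
DAY = 86400  # seconds per day

# GPU: 3 BRP6 tasks finish together every ~19,000 s
gpu_daily = 3 * DAY / 19000 * 4400  # assumed ~4400 credits per BRP6 task

# CPU: 2 cores, one FGRPB1 task each every ~14,200 s
cpu_daily = 2 * DAY / 14200 * 693   # 14200 s / 20.5 s-per-credit ~= 693

print(round(gpu_daily), round(cpu_daily))  # ~60K GPU + ~8.4K CPU
```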
Here is the app_config.xml I'm using
[pre]
<app_config>
   <app>
      <name>einsteinbinary_BRP6</name>
      <max_concurrent>3</max_concurrent>
      <gpu_versions>
         <gpu_usage>0.33</gpu_usage>
         <cpu_usage>0.2</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]
You can see it's purely for the control of the GPU. There is nothing for controlling or limiting CPU tasks in any way.
Quote:
My CPU is an i5-2310 running at about 2.9 GHz, so my results may vary.
Yes, that's a Sandy Bridge generation CPU with 4 physical cores, so each core would perform worse than a Haswell generation core. I also have a Sandy Bridge (an i3-2120 dual core with HT) that's supporting a HD7850 GPU running BRP6 4x. It runs CPU tasks on 2 virtual cores with the other two reserved for GPU support. FGRPB1 tasks complete in ~20 ksecs on both cores used.
Quote:
And do you mean that there is no max_concurrent line in your app_config.xml?
See for yourself above. The line that's there is ONLY for the GPU app and is probably quite redundant since the other two settings properly control the GPU. There is NOTHING in that file that affects crunching of CPU tasks.
Quote:
If I don't use that BOINC will run 4-5 CPU tasks concurrently, which drives the CPU usage to 100%. That has not always been the case. The only thing I remember that has changed is that until recently BRP6 was the only E@H task I ran, while my CPU was used for other projects.
You have a 4 core/4 thread processor - ie. no HT. It could never have run 5 CPU tasks - just 4 max. Without sufficient cooling, it's possible that the processor was suffering thermal throttling, which could have quite a big effect on performance. With proper (ie. aftermarket) cooling, you could get proper performance while running tasks on all 4 cores, if your other use of the machine was for 'office' type activities. I do this every day on several different machines without feeling 'restricted' in any way. YMMV.
Quote:
Does your app_config.xml specify cpu_usage of 0.2 because you use the stock app? I'm using BRP6-Beta-cuda55, and that part of my app_config was given to me by Stan Pope, another member of The Planetary Society, who also generously gave me the 750 Ti.
No. I use 0.2 for all my NVIDIA GPUs for both BRP4G and BRP6, irrespective of the app, beta or otherwise. I don't have any high end GPUs. The 750 Ti is the 'highest' and it functions just fine at the default 0.2 cpu_usage setting. I have a range of low to medium NVIDIA GPUs running at either 2x or 3x with CPU tasks on all cores. More demanding GPUs may benefit from a 'free' core or two but not the ones I run.
There is no point changing the 0.2 setting UNLESS you decide to use it for automatic restriction of CPU tasks. For example, if you set gpu_usage to 0.33 and cpu_usage to 0.67, this would allow 3 GPU tasks to run whilst simultaneously 'freeing-up' 2 CPU cores, since 3x0.67=2.01 which is greater than 2. The cpu_usage setting does NOT control the number of CPU cycles used for support. The app will use whatever it needs which is actually quite a lot less than 20% of a CPU.
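A sketch of that budgeting rule (an approximation of the behaviour described above, not the client's exact algorithm):

```python
def cores_freed(n_gpu_tasks, cpu_usage):
    # Whole cores withheld from CPU tasks once the budgeted GPU support
    # (n_gpu_tasks * cpu_usage) crosses an integer boundary.
    return int(n_gpu_tasks * cpu_usage)

print(cores_freed(3, 0.67))  # 2 -> 3 x 0.67 = 2.01 "frees up" two cores
print(cores_freed(3, 0.2))   # 0 -> 0.6 frees none; CPU tasks run on all cores
```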
Quote:
On my machine GW tasks take 33.517 s/C, while FGRP4 tasks take 34.121 s/C.
That's exactly what the Devs try to achieve and looks like they've really succeeded for your host. They put a lot of effort into trying to come up with a fair credit award across the different runs. It's quite a difficult job because of all the different platforms/architectures that contribute here. When the tuning run ends, one of the things that will be looked at is the credit award and they will tweak that if required. Hopefully, the majority will be 'about right' but there are bound to be 'winners and losers' each way. People can choose the science run that works best for them if they so desire.
I intend to have a 'foot in both camps' :-). It would be very nice to 'bag a pulsar' in a binary system. It would be even better to be part of the first detection of continuous GW. After all, we now know they should be there for the finding ;-).
Greetings.
Thanks to Gary for his comments on running more than one task per GPU.
After reading the thread, I collected data on my Zotac GTX 780Ti (3 GB, stock clocks) and found that running 5 tasks per GPU resulted in completing the most tasks per day, virtually tied with 4 tasks per GPU.
[pre]
Tasks per GPU    Avg Time per Task (hrs)    Tasks per Day
      2                   0.50                   96.0
      3                   0.68                  105.4
      4                   0.88                  108.7
      5                   1.10                  109.1
      6                   1.45                   99.3
[/pre]
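The tasks-per-day column follows directly from the averages: n tasks complete every avg hours, so daily throughput is n * 24 / avg. The small differences from the table come from rounding in the hour figures:

```python
def tasks_per_day(n, avg_hrs_per_task):
    # n concurrent tasks each taking avg_hrs_per_task hours
    return n * 24 / avg_hrs_per_task

for n, t in [(2, 0.50), (3, 0.68), (4, 0.88), (5, 1.10), (6, 1.45)]:
    print(n, round(tasks_per_day(n, t), 1))  # 96.0, 105.9, 109.1, 109.1, 99.3
```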
The CPU is a Core i7-3930K, 6 cores/12 logical, overclocked from 3.2 to 3.5 GHz (stable), with 16 GB RAM @ XMP-2134. Windows 10 x64.
The GPU core load averages in the range of about 82% to 86% with 5 tasks running (taken from Open Hardware Monitor charts). The tasks are in the BRP4G-Beta-cuda-nv301 series.
To run multiple tasks, I set the GPU Utilization Factor in my account to ((1/n) - 0.01), where n is the number of tasks you want to run. The -0.01 just guarantees that rounding errors by BOINC/Einstein will still result in running n tasks.
The time data was taken from BOINC Manager Event Log, so the time resolution for each task's time measurement is one minute.
You can copy & paste the log file into a spreadsheet, then filter column 3, which shows the task's status, for "starting" or "computation". This hides rows you won't use.
Then you can calculate the run time per task, the average time per task, and the tasks per day.
I had to set the time column format to match the date display format to get the run time in hours:minutes when subtracting the end time from the start time.
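The spreadsheet steps above could also be scripted. A minimal sketch; the log lines and task name here are made up for illustration, and real BOINC Manager event-log formatting differs by version:

```python
from datetime import datetime

# Hypothetical, simplified event-log excerpt
log = [
    "06:00:00 | Einstein@Home | Starting task example_task_0",
    "07:06:00 | Einstein@Home | Computation for task example_task_0 finished",
]

def run_time_hours(lines, task):
    # Pair up the start/finish timestamps for one task
    times = {}
    for line in lines:
        stamp, _, msg = (part.strip() for part in line.split("|"))
        if task in msg:
            key = "start" if "Starting" in msg else "end"
            times[key] = datetime.strptime(stamp, "%H:%M:%S")
    return (times["end"] - times["start"]).total_seconds() / 3600

print(run_time_hours(log, "example_task_0"))  # 1.1 hours
```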
Of note when running two or more tasks per GPU: once Einstein has one or more GPU tasks running on a given GPU, it will continue to replace completed tasks with new ones, blocking all other projects from using that GPU.
With two or more GPUs, Einstein can capture all of them for its tasks, blocking other GPU-based projects.
To prevent this from happening with my two GPUs (I also run GPUGRID and some SETI GPU tasks), I modified the cc_config.xml file in the BOINC data directory by adding the following in the <options> section.
This modification excludes the two Einstein applications from GPU 1 and excludes the two GPUGRID applications from GPU 0. I might change this rule in the future, but it means that neither Einstein nor GPUGRID can capture both GPUs.
[pre]
<exclude_gpu>
   <url>http://einstein.phys.uwm.edu</url>
   <device_num>1</device_num>
   <type>NVIDIA</type>
   <app>einsteinbinary_BRP4G</app>
</exclude_gpu>
<exclude_gpu>
   <url>http://einstein.phys.uwm.edu</url>
   <device_num>1</device_num>
   <type>NVIDIA</type>
   <app>einsteinbinary_BRP6</app>
</exclude_gpu>
<exclude_gpu>
   <url>http://www.gpugrid.net</url>
   <device_num>0</device_num>
   <type>NVIDIA</type>
   <app>acemdlong</app>
</exclude_gpu>
<exclude_gpu>
   <url>http://www.gpugrid.net</url>
   <device_num>0</device_num>
   <type>NVIDIA</type>
   <app>acemdshort</app>
</exclude_gpu>
[/pre]
The project URLs required are listed in the early part of the BOINC Manager Event Log.
This is working so far, but if anyone has other suggestions, please post them.
Thanks, Art