Overclocking GPU

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110022812199
RAC: 22584952

Quote:
I see 23 valid Arecibo tasks out of 28.


You should check again. You have 15 valid, 6 pending and 7 errors. Assuming all the pending do validate (as they should hopefully do) you will still have only 21 out of 28. You shouldn't consider a 25% error rate for those BRP4G tasks as "running like clockwork", particularly when you look at the huge variation in elapsed times. I'm not at all trying to question your commitment to various BOINC projects. I'm just suggesting there is an issue you may wish to investigate further.
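For anyone wanting to check the arithmetic, here is a minimal sketch using only the counts quoted above (15 valid, 6 pending, 7 errors). The function name is just for illustration.

```python
def error_rate(valid: int, pending: int, errors: int) -> float:
    """Fraction of all returned tasks that ended in an error."""
    total = valid + pending + errors
    return errors / total


# Counts taken from the post above.
counts = {"valid": 15, "pending": 6, "errors": 7}

rate = error_rate(**counts)
# Best case: every pending task eventually validates.
best_case_valid = counts["valid"] + counts["pending"]
total_tasks = sum(counts.values())

print(f"error rate: {rate:.0%}")
print(f"best case valid: {best_case_valid} of {total_tasks}")
```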

Quote:
All Parkes tasks failed save one.


Once again, not quite true. You have 2 successfully completed BRP6 tasks: 1 valid and 1 pending. You also have 3 compute errors and, once again, big variations in elapsed times. You need to realise that whatever is causing the instability is affecting all the tasks you run. BRP4G is not immune.

Quote:
I have reduced the load of VBox tasks.


I'm sorry, I don't know anything about the other projects you run or their running requirements so "VBox tasks" means nothing to me. All I wanted you to be aware of is the very poor outcomes for Einstein's (and I presume other projects') tasks.

Quote:
All CPU gamma-ray tasks have completed even in a longer time than on the Linux boxes, one of them 32-bit.


Yes, it's really sad to see those tasks taking over 300K seconds to complete. That's probably something close to an order of magnitude longer than they should be taking. Does your GPU get used at any other project besides Einstein?

Quote:
PS I never had a higher RAC on any BOINC project than this.


Tullio, my dear friend, a RAC of 2,500 for a machine with a GTX750 is not really a RAC to be proud of :-). It's capable of so much more if things were running correctly. I just had a quick look at your computers over at Seti. The A10-6700 APU is shown (with 4 cores and a reasonable RAC) but it has no tasks at the moment. Last contact was a day or two ago so I imagine it must have had work recently. I also see two other entries for A10-6700 APU, each with just one core and each with quite a number of tasks. It looks like you are running two separate BOINC installations inside two virtual machines, inside a real machine that has a BOINC installation of its own with several other attached projects, one of them being Einstein, which is also trying to run full bore. Even though I know nothing about VMs, I can imagine the overheads and so I can perhaps start to understand why that poor machine is struggling.

Is there any particular reason why you don't just run Seti on the real machine and do away with Seti in the VMs? You really do seem to be trying to do far too much on what is (for crunching purposes) a relatively low powered CPU.

If you would like to see what your machine could produce (if not overloaded), you could temporarily shut down the virtual machines and suspend ALL other projects bar Einstein. Locally in BOINC Manager, set BOINC to use just 50% of the 4 cores. On the website, set the GPU utilization factor for BRP apps to 0.5. Increase your work cache by just enough to trigger a work fetch at Einstein. Sit back and watch what happens for just one day. After that you can revert the changed settings and start all the other stuff again if you wish.

If you are willing to do this, you should see your CPU+GPU at their full potential (provided nothing else is misconfigured or broken). Make sure you allow BRP6 so you get some of them as well. I'd be quite surprised to see compute errors. There would have to be some sort of hardware problem if there were any. The improvement should be quite quick for you to see so you won't have to leave the other projects suspended for very long to get the necessary performance information.
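As a rough sketch of the local part of that experiment, the snippet below builds the text of a BOINC preferences override that limits the client to 50% of the cores and sets a small work cache. The element names `max_ncpus_pct` and `work_buf_min_days` are what I understand the client's `global_prefs_override.xml` to use, so please check them against the BOINC client documentation. The cache value of 0.5 days is purely an assumed example, and the helper name is made up. The GPU utilization factor is a project preference set on the Einstein website, so it is not part of this file.

```python
import xml.etree.ElementTree as ET


def build_override(cpu_pct: float, cache_days: float) -> str:
    """Return override XML text (assumed element names, see note above)."""
    root = ET.Element("global_preferences")
    # Percentage of CPU cores BOINC is allowed to use.
    ET.SubElement(root, "max_ncpus_pct").text = f"{cpu_pct:.1f}"
    # Minimum work cache in days (0.5 is a hypothetical value).
    ET.SubElement(root, "work_buf_min_days").text = f"{cache_days:.2f}"
    return ET.tostring(root, encoding="unicode")


override_xml = build_override(cpu_pct=50.0, cache_days=0.5)
print(override_xml)
```

The same two values can simply be set in BOINC Manager's computing preferences, which is the route described above. The file form is only handy if you want to switch back and forth quickly.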

Cheers,
Gary.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

I am running 4 CERN BOINC projects on that PC: vLHC@home, ATLAS@home, CMS-dev and LHCb@home, plus a project, CERN Challenge, which is not a BOINC project. They all need Oracle's Virtual Box installed, which provides guest Virtual Machines on Windows, Linux, Mac OS and Solaris hosts. I have installed two Virtual OSes on the Windows 10 PC, SuSE Leap 42.1 and SuSE 13.2, and they run SETI@home. Virtual Box is often upgraded by Oracle, which makes me suspend and even shut down all other applications. The CERN projects are my main interest, all the rest is secondary. I am not interested in credits.
Tullio
I am amazed that you don't know Virtual Box.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

I've shut down all Virtual Machines because I need to install the Extension Pack of Virtual Box 5.0.16 and I cannot do it if a single Virtual Machine is running. Let's see the results.
Tullio

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110022812199
RAC: 22584952

Quote:
... The CERN projects are my main interest, all the rest is secondary.


Then perhaps you should just run the CERN stuff. When you try to run so much else as well, it's very likely that everything is running far worse than it should be. The machine obviously can't cope with it all. Do any of the five CERN-related projects you mention use the GPU for crunching? If not, the GPU is really being wasted in a machine primarily devoted to CERN.

As you are a retired physicist, I understand your interest in CERN, Seti and Einstein. It seems you could more efficiently contribute to all of these if you were to use two separate machines, one (with a more powerful CPU and lots of RAM) devoted to CERN and the other (with the current CPU/GPU) devoted to Seti/Einstein. If having two such machines is out of the question, there is another alternative. Run CERN only for a certain period, say a month. Then shut that down and run Seti/Einstein for a similar period. At the end of 2 months you would probably find you had contributed more to all your projects than you would have done by trying to run them all together.

Quote:
I am not interested in credits.


You talked about your RAC - I just pointed out how low it seemed. If you are going to contribute to science, why not try to make your contribution as efficient as possible? Your RAC (within the one project) is a useful measure of how efficient your contribution is - that's all.

Quote:
I am amazed that you don't know Virtual Box.


Why should I need to know the intricate details of something I currently have no use for?

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110022812199
RAC: 22584952

Quote:
I've shut down all Virtual Machines ... Let's see the results.


Some results are already visible. You have completed all BRP4G and BRP6 and now have empty caches for those two runs. The most recent 6 BRP4G and 2 BRP6 tasks have completed without error in much faster times than some of the earlier examples. Take a look for yourself.

It will take a bit longer to see what happens to CPU tasks. Your previous 3 FGRPB1 tasks took ~320,000 seconds each. If there are partly completed tasks, they won't show the full improvement. You will only see that once a task has been fully crunched under the new conditions.

Did you reduce the number of cores crunching? The improvement should be substantial, even without that.

EDIT: And so it is!! I've just seen three FGRPB1 tasks returned. Instead of taking around 320 ksecs, the new ones have taken only ~90 ksecs. You can see these tasks by following this link.

It's not possible to know exactly how much of this time was under the former settings and how much was under the new settings, unless you happened to have recorded the elapsed time of each task when you did the change in settings. Maybe the tasks weren't downloaded until after the settings change. In that case all the time was under the new settings.

Notice the big difference that still exists between CPU time and elapsed time. Those tasks have the potential to crunch a lot faster again if each had full access to the needed CPU resources.
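To put numbers on that, here is a small sketch. The 320 ksec and 90 ksec elapsed times are the approximate figures quoted above. The CPU time is a made-up placeholder because the real values are only visible on the task pages, so treat the efficiency figure as an illustration, not a measurement.

```python
def speedup(old_elapsed_s: float, new_elapsed_s: float) -> float:
    """How many times faster the new run finished."""
    return old_elapsed_s / new_elapsed_s


def cpu_efficiency(cpu_time_s: float, elapsed_s: float) -> float:
    """Fraction of wall-clock time the task actually spent on a CPU core."""
    return cpu_time_s / elapsed_s


# Approximate elapsed times from the post above.
old_elapsed = 320_000
new_elapsed = 90_000

# Hypothetical CPU time, NOT taken from the task list.
assumed_cpu_time = 45_000

print(f"speedup: {speedup(old_elapsed, new_elapsed):.2f}x")
print(f"CPU efficiency: {cpu_efficiency(assumed_cpu_time, new_elapsed):.0%}")
```

A ratio well below 100% means the task spent much of its elapsed time waiting for a core, which is the remaining headroom referred to above.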

One thing you should realise: AMD CPUs of the type you are using have a compromise design where the floating point unit (FPU) is shared between pairs of cores. There are only 2 FPUs for the 4 CPU cores. Such CPUs will always have lower performance when lots of floating point calculations are being performed. If you changed the % of cores BOINC is allowed to use to 50%, so as to run just 2 CPU tasks, each should have full access to an FPU. That could make quite a difference to the crunch times. The only way to find out for sure is to try the experiment.
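The 50% figure is just the ratio of FPUs to cores. A minimal sketch, assuming one floating-point-heavy task per FPU is the goal (the function name is only illustrative):

```python
def percent_of_cores_for_fpus(cpu_cores: int, fpus: int) -> float:
    """BOINC 'use at most N% of the CPUs' value that gives one task per FPU."""
    tasks = min(fpus, cpu_cores)  # never more tasks than cores
    return 100.0 * tasks / cpu_cores


# 4 cores sharing 2 FPUs, as described above.
print(percent_of_cores_for_fpus(cpu_cores=4, fpus=2))
```

Note that this only limits the number of tasks. Whether the OS scheduler actually places the two tasks on cores that do not share an FPU is a separate matter, which is another reason to test it.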

Cheers,
Gary.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

Thanks for your suggestions. I still have another GPU board, a Sapphire HD 7770, and I am planning to install it on my oldest PC, a SUN WS with Opteron 1210 with two cores. I have downloaded the AMD Catalyst driver for it and I have seen it is possible to install it in a headless mode, without altering the X-window system on a Linux OS (SuSE Leap 42.1). It is now running only Einstein@home CPU tasks. Could it also run Einstein@home GPU tasks?
Tullio

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110022812199
RAC: 22584952

Quote:
... a Sapphire HD 7770, and I am planning to install it on my oldest PC, a SUN WS ...


If the workstation has a PCIe x16 slot you could try. Does the PSU have a 6-pin connector for supplying power to the card? I know nothing about those sorts of machines so I don't know what problems there might be.

AMD GPUs need CPU support and can be very productive when they get it. If you wanted to have good GPU performance you would probably need to run CPU tasks on only one CPU core. As for drivers/OpenCL libs, your distro should have a method for detecting the card and installing what is needed without you having to do it manually. I don't understand why you would want to "install it in a headless mode, without altering the X-window system". Wouldn't you be better to just use it as your primary graphics adapter?

Cheers,
Gary.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

The PSU is rated at 400 W and has a 6-pin connector. Thanks.
Tullio

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

Bad crash on the Windows 10 PC. No repair disk could save it. I tried to install Linux, but the installer found many bad blocks on a Seagate 2 TB disk that is only two years old. I swapped it with a 1 TB WD disk I had as a spare. I tried to install Linux on it, but SuSE Linux could not use the GeForce GTX 750 board, and the desktop was painful. OK, I downloaded the NVIDIA driver, but the driver can be installed only if no X-window is running. The default instruction is to type "init 3". Black screen. In desperation I reinstalled the HP Windows 8 from the 4 DVDs I had made as a backup and it is now running again. I downloaded the newest NVIDIA driver on it. So this is the reason I want to install the headless driver for the Sapphire HD 7770 on the SUN WS with SuSE Leap 42.1 without altering the X-Window setup. Cheers.
Tullio

Erich56
Joined: 16 Dec 15
Posts: 3
Credit: 158923838
RAC: 0

Talking about OC the GPU, I'd like to discuss a problem I've had since I started using my new PC for grid computing.
The processor is an Intel i7-4930K, which means 6 cores (+6 logical cores in HT mode). So far, it's running at its base clock of 3.4GHz.
Then there are 2 Nvidia GTX980ti GPUs.
Basically, I am crunching GPUGRID, Einstein and Rosetta (the latter is using the CPU only). The setting in BOINC is between 50% and 60% CPU usage, which means that I am using 6-7 of the 12 logical cores.
For GPUGRID, the two GPUs are operating with 3500MHz memory clock and between 1400 and 1420 MHz core clock (temperatures between 60 and 65°C). And here comes the problem: whenever a GPUGRID WU finishes and an Einstein WU follows, the system crashes either immediately or after some time. That would indicate to me that the Einstein WUs don't like the GPUs running at 1400MHz.
I could now, of course, decrease the GPU clock in increments by trial and error. On the other hand, I'd like to run the GPUGRID WUs at 1400MHz.
And, of course, I am not sitting at the PC all the time, waiting for the moment when there is a change between these two projects so that I could adapt the GPU clock by hand.
Any thoughts/suggestions on this situation from you guys?
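One way to frame the "switch clocks per project" idea is as a simple selection rule, sketched below under loud assumptions. The 1400 MHz GPUGRID figure is from the post, while the Einstein value, the default and all names are hypothetical placeholders. Actually applying a clock depends on the vendor's overclocking tool and is deliberately left out.

```python
# Core clock (MHz) at which each project's tasks are believed to be stable.
# GPUGRID's value is from the post above; Einstein's is a made-up placeholder.
STABLE_CORE_CLOCK_MHZ = {"GPUGRID": 1400, "Einstein": 1300}

# Hypothetical fallback when no known GPU project is running.
DEFAULT_CLOCK_MHZ = 1300


def target_clock(running_projects, table=STABLE_CORE_CLOCK_MHZ,
                 default=DEFAULT_CLOCK_MHZ):
    """Pick the highest clock that is safe for every running GPU project."""
    clocks = [table[p] for p in running_projects if p in table]
    # With tasks from several projects active at once, the lowest
    # per-project limit is the only clock that is safe for all of them.
    return min(clocks) if clocks else default


print(target_clock(["GPUGRID"]))
print(target_clock(["GPUGRID", "Einstein"]))
```

A small script could poll the BOINC client for the currently running tasks and call something like this, but the clock would have to be lowered before the Einstein task starts computing, so there is an unavoidable race. Finding one clock that is stable for both projects is the simpler route.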
