Observations on FGRPB1 1.15 for Windows

Der Mann mit de...

Joined: 12 Dec 05

Posts: 151

Credit: 302594178

RAC: 0

15 Dec 2016 14:14:18 UTC

Message 152849 in response to message 152848

I totally agree with you.

If the project isn't able to provide us with useful apps by tomorrow, I will abort all WUs and leave this project.

Too much trouble in the past 3 months.

Der Mann mit der Ledertasche

Greetings from the North

Christian Beer

Joined: 9 Feb 05

Posts: 595

Credit: 188192100

RAC: 329501

15 Dec 2016 14:17:44 UTC

Message 152850

Please bear in mind that the FGRPB1G GPU application for Windows is currently still marked as Beta because it doesn't utilize the GPU fully (as you noticed). A lot of the things that the Linux app does on the GPU are still done on the CPU on Windows. Bernd is working on this, and it is his top priority.

ravenigma

Joined: 20 Aug 10

Posts: 69

Credit: 80558758

RAC: 320

15 Dec 2016 14:42:35 UTC

Message 152851

My setup:

OS: Win 7 x64

CPU: i7-4790K (4.2GHz)

GPU: GTX 1080 (1708MHz core, 4513MHz memory)

I have been running two tasks concurrently since I started receiving these. The first two I ran were actually a combo of 1 FGRPB1G and 1 GPUGrid task and these only took a few minutes longer to complete than all the others which have been two of these tasks together. I am averaging a little over 2.5 hours per task.

For now I will disable the option in my account to run Beta apps. I'll keep a watch on this thread and the one in the News section regarding this app, though, and re-enable it when the next Beta version is released.

Mad_Max

Joined: 2 Jan 10

Posts: 154

Credit: 2211568088

RAC: 319610

15 Dec 2016 15:01:20 UTC

Message 152853 in response to message 152828

Gary Roberts wrote:

As an example, if you had at least a 2GB card (based on Linux experience - may be different for Windows), you could try running 2 GPU tasks supported by 1 CPU core (instead of the default 2) by using the following app_config.xml file, placed in the Einstein project directory. Note that I'm posting this from a borrowed computer in a hospital room so it's from (possibly faulty) memory. Check the above link for documentation on app_config.xml
<app_config>
    <app>
        <name>hsgamma_FGRPB1G</name>
        <gpu_versions>
            <gpu_usage>0.5</gpu_usage>
            <cpu_usage>0.5</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
If 0.5 CPUs per GPU task is not enough CPU support, you could try (with sufficient GPU memory) running 3 GPU tasks supported by 2 CPU cores. That would effectively reserve 0.67 CPUs per GPU task. To do this, just set gpu_usage to 0.33 and cpu_usage to 0.67 in the above example, followed by a 're-read config files' in BOINC Manager. Please realise that I don't have access to a Windows machine to try any of this so anyone following this will need to experiment to find the optimum conditions. It may well be that one core per GPU task really is the best option anyway.

Yes, it is different for Windows. At least for the current Windows version: I monitor GPU VRAM consumption, and it is about 300 MB per hsgamma_FGRPB1G task. So a 1GB card can safely run 2 tasks in parallel, and 2GB cards can run 4-5 tasks. I think Windows VRAM usage is much lower because only part of the computation is moved to the GPU, compared to the *nix and Mac versions.
The default of one CPU core per GPU task really is far from optimal, at least for AMD GPUs (it may be different for NVidia).
CPU load jumps around in the 30-100% range. I did a few tests, and it looks like the optimum is 0.5 or 0.67 CPU cores per task (2 WUs on 1 CPU core or 3 WUs on 2 CPU cores).
With 4 WUs per GPU running in parallel (and 2 CPU cores to support the CPU part of the computation) I was able to achieve ~90% GPU load on an HD 7870 2 GB (1280 shaders @ 1100 MHz).
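
The 4-tasks-per-GPU, 2-CPU-core setup described above could be written as an app_config.xml along the following lines. This is only a sketch: it reuses the app name and element layout from Gary Roberts' quoted example, and the values 0.25 and 0.5 are simply derived from 1 GPU / 4 tasks and 2 cores / 4 tasks. They are not tested settings.

```xml
<!-- Sketch only: values derived from "4 WUs per GPU, 2 CPU cores"; not a verified configuration -->
<app_config>
    <app>
        <name>hsgamma_FGRPB1G</name>
        <gpu_versions>
            <!-- 1 GPU / 4 concurrent tasks = 0.25 GPU per task -->
            <gpu_usage>0.25</gpu_usage>
            <!-- 2 CPU cores / 4 tasks = 0.5 CPU per task -->
            <cpu_usage>0.5</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```

As with the quoted example, the file goes in the Einstein project directory and takes effect after a 're-read config files' in BOINC Manager. At roughly 300 MB per task, four tasks would need about 1.2 GB of VRAM.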

But even with such optimizations the current app is very slow and inefficient.

cliff

Joined: 15 Feb 12

Posts: 176

Credit: 283452444

RAC: 0

16 Dec 2016 3:07:37 UTC

Message 152869 in response to message 152850

Hi Christian,

I'm happy to know it's being worked on. I don't like to abort tasks because they won't even start before the deadline.

When the app is sorted out I'll be happy to run it, provided it actually uses the GPU properly and doesn't need a full CPU core per task.

As you are probably aware, I run 3 projects here, E@H being one of them, and prior to the current Windows app everything ran pretty well here. Hopefully it will do so again when the app is fixed and updated.

I'll monitor the forums to see when it's OK to remove NNT.

Regards,

Cliff,

Been there, Done that, Still no damm T Shirt.

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1587515744

RAC: 750171

16 Dec 2016 3:15:26 UTC

Message 152870

Cliff, I was going to make that comment but you beat me to it. I'll run these through Friday, and if it's not fixed I'll find something else for my GPUs to do. I'll keep an eye on the boards to see when it will be productive for me to come back.

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

16 Dec 2016 5:06:27 UTC

Message 152872

GTX 780 (347.88) + Intel Q9550 @ 3.7GHz, running 1 task : 13600s (3h 47min)

GTX 760 (376.33), 2 cards, Intel X56xx @ 4GHz, running 3 tasks per card : 14600s (4h 03min)

AMD R9 270X (16.12.1), Intel X56xx @ 4GHz, running 1 task : 20400s (5h 40min)

I think DRAM speed and overall memory speed (motherboard + CPU) have a notable impact on completion times with these tasks.

The system with the Q9550 has the oldest and slowest memory (DDR2), but that CPU has about the same power to deliver for 1 GPU task as the others. Comparing these completion times with what I observed on BRP4G tasks, I believe that if system memory speed weren't having an effect here, the difference between the GTX 780 and GTX 760 completion times would easily be wider.

The system in the middle has the fastest DRAM of those three. I'm not sure if Nvidia and AMD OpenCL can be compared at all, but I believe the two hosts with DDR3 memory could theoretically end up with much more similar completion times if they had the same DRAM speed. That's because those two GPU models should have pretty much the same computational power (the AMD could even be the faster of the two).

cliff

Joined: 15 Feb 12

Posts: 176

Credit: 283452444

RAC: 0

16 Dec 2016 8:56:43 UTC

Message 152876 in response to message 152870

Hi Betreger

Betreger wrote:

Cliff, I was going to make that comment but you beat me to it. I'll run these through Friday, and if it's not fixed I'll find something else for my GPUs to do. I'll keep an eye on the boards to see when it will be productive for me to come back.

Well now and then I manage to get in a goodish response:-)

Trouble is, I don't think a lot of folks realise what the crunchers put into projects. It's not just time and CPU/GPU cycles; it's the cost of kit, plus the cost of the electricity used to run the kit, and the amount of money paid out for upgrades etc.

I've spent far more this quarter on electricity bills than I'd planned on [an extra £256], and also bought two second-hand Intel-based workstations for very nearly £900. I find I can't actually afford another bill of over £200, so I'm trying to keep my computers offline until after midnight, when it's a lower rate. That said, I still need to power one up to do the usual web prowling, read an e-book or watch a video. Local cable TV is a wee bit of a joke: endless repeats dating back to the 1930s, 40s etc., or if someone in the acting community dies, their films proliferate across both DTV and cable, and after having already seen almost all of them at least 3 times, the PC becomes almost irresistible :-)

Well that's got that mega whinge off my keyboard:-)

Regards

Cliff,

Been there, Done that, Still no damm T Shirt.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7219624931

RAC: 975722

16 Dec 2016 13:56:03 UTC

Message 152892

As Bernd has posted in the Technical News forum, a new Windows64 version 1.16 has superseded 1.15.

Initial reports and my own observations show completion times with 1.16 to be very greatly reduced compared to 1.15.

I suggest that users consider aborting 1.15 tasks in their cache. Ordinarily this is considered a bit anti-social, but here the loss of host productivity seems to warrant it. A bit of initial caution may be warranted, perhaps by suspending 1.15 tasks until some assurance of 1.16 success on one's own host is found.

A word of caution: the progress indication is pretty misleading, with initial progress suggesting only modest improvement. But unlike 1.15, which greatly slowed in reported rate of progress toward the end, some of us have observed 1.16 to leap from about 13% complete to finishing. On the other hand, I have some 1.16 tasks still running with current reported progress over 13%, so this may not be consistent either.

My most productive host was taking about 9400 seconds of elapsed time to complete a 1.15 task. The first two 1.16 completions took 495 and 839 seconds, and one of those has validated.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7219624931

RAC: 975722

16 Dec 2016 14:00:36 UTC

Message 152893 in response to message 152892

While GPU utilization is way up compared to 1.15, 1.16 GPU utilization is still pretty low when running 1X. If you have sufficient CPU cores to provide adequate support, I suspect running at a somewhat higher X will help more than usual. But the CPU support required, while less than the virtually 100% needed for 1.15, is still very high, so a bit of caution may be indicated.
