Observations on FGRBP1 1.17 for Windows

Betreger
Joined: 25 Feb 05
Posts: 810
Credit: 438,538,819
RAC: 141,753

My GTX 660 gets work with driver 335.23. It also got work with 376.33, but it was 8% slower.

Mad_Max
Joined: 2 Jan 10
Posts: 136
Credit: 1,351,005,670
RAC: 1,338,200

archae86 wrote:

Maybe "spin-waiting" is what is going on.  It should not be.  I occasionally see an installer that hogs a whole CPU core to itself during run time without really using it most of the time, but it is not behavior I expect from a proper application.

 

It is not about the app. It is how nVidia implements its GPU drivers and OpenCL libraries. Taking a full CPU core at ~100% load the whole time an OpenCL app is running is "normal" behavior for nVidia, even if the app does almost nothing. And app devs cannot do anything about this.

It is different for AMD's OpenCL implementation: CPU load is very variable there and proportional to the actual amount of work done, similar to how CUDA works on NV GPUs.

For example, my current average runtimes (elapsed/wall-clock time) for Gamma-ray pulsar binary search #1 on GPUs v1.17 on AMD GPUs are in the 3000-3500 sec range, with CPU time around 1000 sec (~0.25-0.3 CPU core per OpenCL task). For the previous BRP4G it was even smaller (~0.15-0.2 CPU core per GPU task, about on par with CUDA WUs).
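For anyone who wants to work out the same figure for their own hosts, here is a rough sketch of the arithmetic: CPU time divided by elapsed time gives the average share of one core a task used. The function name is just illustrative, and the inputs are the approximate numbers quoted above, so the result only roughly matches the quoted range.

```python
def cpu_core_fraction(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Return the average fraction of one CPU core used over a task's run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_seconds / elapsed_seconds


# Approximate figures quoted above for v1.17 on AMD GPUs:
# ~1000 s of CPU time over 3000-3500 s of elapsed time.
for elapsed in (3000, 3500):
    share = cpu_core_fraction(1000, elapsed)
    print(f"{elapsed} s elapsed: {share:.2f} of a core")
```

Both numbers can be read off the task list on the project web site (the run time and CPU time columns).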

archae86
Joined: 6 Dec 05
Posts: 2,666
Credit: 2,269,120,733
RAC: 2,858,951

Pretty recently two of my hosts have downloaded tasks which are also 1.17 tasks, but include the string "beta" embedded near the end of the otherwise same application name.

I assume these are associated with the reduction in precautionary required GPU memory that Christian mentioned over in the technical news thread.

In that post Christian is quite explicit in specifying that this beta application does not use any less memory at all in actual running--it is the exclusionary requirement that was lowered.  So owners of previously excluded cards around the 1 Gbyte memory size may be well advised to monitor the first few WUs for progress and success.  Owners of real 768M cards perhaps should be really cautious and report their success or failure here.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513,211,304
RAC: 0

archae86 wrote:
Owners of real 768M cards perhaps should be really cautious and report their success or failure here.

OpenCL does not release all GPU memory.  I have two 768MB cards, only 701-708 MB is released so they do not pick up tasks.  The 1GB cards should do better.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 4,914
Credit: 30,269,333,454
RAC: 36,230,254

I have a 1GB 550TI that reports as having 1021MB and was previously excluded.

It has now successfully completed 6 GPU tasks, 4 of which have already validated.

 

Cheers,
Gary.

Richie
Joined: 7 Mar 14
Posts: 457
Credit: 1,540,866,707
RAC: 768,167

When I look at task properties on my Linux hosts with Nvidia GTX 960 these v1.17 tasks (regular and Beta) report massive sizes for 'Virtual memory size'. Something like 28.46 GB (yes, twenty eight gigabytes!). 'Working set size' is about 282 MB at the same time. Is that normal? Nevertheless, completion times for those tasks are quite normal.

This host for example: https://einsteinathome.org/host/12468219

Other sort of platforms (Linux with AMD GPU or Windows with Nvidia or AMD GPU) report much more understandable sizes also for virtual memory size (a few hundred MB, not gigabytes).

Zalster
Joined: 26 Nov 13
Posts: 2,976
Credit: 3,198,503,576
RAC: 2,085,502

All this talk about virtual memory, physical memory, OpenCL adherence to a CPU core, and kernel activity reminds me a heck of a whole lot of Raistmer's OpenCL SoG app on Seti@home.  Looking at my SIV64X, it even resembles it when looking at activity.  My own observations show me that overall usage of CPU cores is slightly higher than 1 CPU core per work unit.  I have a 16 core CPU, running 3 at a time on 4 GPUs, and it's utilizing 77% of all cores, or 12.32 cores of 16.  Any idea if command lines would work here as they do on Seti?
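The "slightly higher than 1 core per work unit" figure follows from those numbers, as this quick sketch shows. It assumes all of the measured CPU load comes from the 12 concurrent GPU tasks (no CPU tasks and negligible background load), which is an assumption and not something measured here. The function name is only illustrative.

```python
def cores_per_task(total_cores: int, utilisation: float,
                   gpus: int, tasks_per_gpu: int) -> float:
    """Busy cores divided by concurrent GPU tasks.

    Assumes every busy core is serving a GPU task (an assumption).
    """
    busy_cores = total_cores * utilisation
    concurrent_tasks = gpus * tasks_per_gpu
    return busy_cores / concurrent_tasks


# 16 cores at 77% load, 4 GPUs running 3 tasks each.
print(f"{16 * 0.77:.2f} cores busy")
print(f"{cores_per_task(16, 0.77, 4, 3):.2f} cores per task")
```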

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513,211,304
RAC: 0

Richie_9 wrote:

When I look at task properties on my Linux hosts with Nvidia GTX 960 these v1.17 tasks (regular and Beta) report massive sizes for 'Virtual memory size'. Something like 28.46 GB (yes, twenty eight gigabytes!). 'Working set size' is about 282 MB at the same time. Is that normal? Nevertheless, completion times for those tasks are quite normal.

This host for example: https://einsteinathome.org/host/12468219

Other sort of platforms (Linux with AMD GPU or Windows with Nvidia or AMD GPU) report much more understandable sizes also for virtual memory size (a few hundred MB, not gigabytes).

I think this large VMsize is likely an OpenCL characteristic.  I seem to recall it during BRP6 days.

I see similar values (20.35 GB) for PrimeGrid on an Nvidia GTX-460, and I do see 16.39 GB on this app, FGRPB1G, on an AMD RX-480, both on Ubuntu 16.04 LTS.

These large VM values just mean the process has a large virtual address space; it does not mean any part of that space is used.  The values to worry about are the resident memory (of a process) and the overall free memory.
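If you want to compare the two figures directly on a Linux host, the kernel exposes both in /proc/<pid>/status as VmSize (virtual address space) and VmRSS (resident memory). Below is a minimal sketch of pulling them out. The helper name and the sample text are made up for illustration, with the sample sized to resemble the numbers reported in this thread.

```python
def parse_vm_fields(status_text: str) -> dict:
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # the number precedes "kB"
    return fields


# Made-up sample resembling the values discussed above (not real output).
sample = "Name:\texample_app\nVmSize:\t29842473 kB\nVmRSS:\t  288768 kB\n"

vm = parse_vm_fields(sample)
print(f"virtual:  {vm['VmSize'] / 1024**2:.2f} GB")
print(f"resident: {vm['VmRSS'] / 1024:.0f} MB")
```

On a real host you would pass in the contents of the status file for the task's process, e.g. `open(f"/proc/{pid}/status").read()`, where `pid` is the process ID of the running task.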

Whilst it is about Python, this talk https://www.youtube.com/watch?v=twQKAoq2OPE explains OS memory allocation, especially around the 30 minute mark.

edit: I also started a few Collatz tasks, as there are not many GPU projects left for the GTX-460 768 MB cards. Those too had a VM size of 20.4 GB and a resident size of 0.04 GB.

 

archae86
Joined: 6 Dec 05
Posts: 2,666
Credit: 2,269,120,733
RAC: 2,858,951

Regarding credit, Bernd announced in the Technical News thread a credit award per unit bump.  While his post asserts that the former 1300 was raised to 3300, what I have actually observed is that 1365 was raised to 3465.  A word of caution: Bernd specifically advised this applied to newly made WUs, not newly sent.  So expect old unit re-issues to linger for some time with the old credit award, and expect that some work sent out after his announcement was posted will have been made long enough before also to get the old award.

Separately I have an observation and a question.

Observation: currently a Windows client can receive 1.17 Beta work if the web page setting to accept test applications is enabled.  If (and only if, I think) that setting is disabled, one receives 1.17 work without the application name containing beta as a substring.  A practical difference is that if one has beta enabled, the site currently restricts you to being the quorum partner of non-beta systems.  As there is a lot of Windows system capacity with beta enabled, this restriction can cause the system to disqualify all or nearly all the available work in the current ready-to-send-to-you buffer under consideration (which has a maximum size somewhere not far from 100 possible units).  So I think a Windows user will currently find work flows to them more quickly and easily with beta disabled.
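To illustrate the effect I mean, here is a toy sketch. It is not the actual scheduler code, and every name and number in it is hypothetical. It assumes a buffer of 100 units in which 95 already have a beta-enabled host as their first quorum member, and that a non-beta host may pair with anyone.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    partner_is_beta: bool  # whether the existing quorum member runs beta


def eligible_for(host_is_beta: bool, buffer: list) -> list:
    """Tasks a host may take if beta hosts may only partner non-beta hosts.

    Assumption: non-beta hosts are not restricted at all.
    """
    if not host_is_beta:
        return list(buffer)
    return [t for t in buffer if not t.partner_is_beta]


# Hypothetical buffer: 95 of 100 units already held by a beta host.
buffer = [Task(f"wu_{i}", partner_is_beta=(i < 95)) for i in range(100)]

print(len(eligible_for(True, buffer)), "units usable by a beta host")
print(len(eligible_for(False, buffer)), "units usable by a non-beta host")
```

With those made-up proportions a beta host can use only a handful of the buffered units, while a non-beta host can use all of them.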

Does anyone here know, other than the mentioned quorum restriction, what actual difference there is at this moment between the beta and non-beta Windows 1.17 application?

 

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 4,914
Credit: 30,269,333,454
RAC: 36,230,254

archae86 wrote:
Does anyone here know, other than the mentioned quorum restriction, what actual difference there is at this moment between the beta and non-beta Windows 1.17 application?

I believe the apps themselves are identical.  The purpose for re-introducing the 'beta' version was to provide a mechanism for owners of GPUs with <1022MB GPU RAM (down to about 766MB) to get tasks to see if they could be crunched without 'lack of GPU memory' problems.

I don't know if there is still a restriction whereby a 'beta' quorum task must be partnered with a 'non-beta' task.  There's no point in doing that if the apps are computationally identical.  If you don't need to support GPUs <1022MB, the non-beta app should be used.   If you wish to perhaps support future beta apps whilst still running only the current non-beta app, you could do this with two separate 'locations' (aka venues) with the only difference being whether or not test apps are enabled in the project preferences.  It's pretty quick to change a host's location at the appropriate time.  There's an advantage to not being the 'earliest adopter' of a new test app :-).

 

Cheers,
Gary.
