Why BRPS non cuda files ?

astro-marwil
Joined: 28 May 05
Posts: 434
Credit: 150,856,133
RAC: 35,978
Topic 195563

I'm using a small laptop with a GeForce 310M GPU. Crunching a BRPS 1.05 (cuda) file takes 3.5 to 5 h; a BRPS 1.04 (non-cuda) file takes 35 to 38 h. So there is a factor of about 8 between them, and each validated one earns 500 Cobblestones. A S5GC1HF file takes 10 to 13 h to crunch and earns 250 Cobblestones. So in the time needed for one BRPS 1.04 (non-cuda) file I can crunch about 3 S5GC1HF files, which gives me 750 Cobblestones instead of only 500 for the BRPS 1.04. That makes S5GC1HF about 50% more attractive to me than BRPS 1.04 (non-cuda). Why send BRPS 1.04 (non-cuda) files to people who are obviously running cuda? The E@H preferences offer no setting for this. I helped myself by aborting the task.
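
The comparison above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the midpoints of the quoted run-time ranges are representative, and the function name is made up for illustration.

```python
def credit_per_hour(credit, hours):
    """Credit earned per hour of crunching for one task type."""
    return credit / hours


# Midpoints of the run-time ranges quoted in the post (assumed representative).
brp_cpu_hours = (35 + 38) / 2    # BRPS 1.04 (non-cuda): 35 to 38 h
gc1hf_hours = (10 + 13) / 2      # S5GC1HF: 10 to 13 h

brp_cpu_rate = credit_per_hour(500, brp_cpu_hours)
gc1hf_rate = credit_per_hour(250, gc1hf_hours)

# How many S5GC1HF tasks fit into the time of one non-cuda BRPS task.
tasks_per_brp = brp_cpu_hours / gc1hf_hours

print(f"BRPS 1.04 (CPU): {brp_cpu_rate:.1f} credits/h")
print(f"S5GC1HF:         {gc1hf_rate:.1f} credits/h")
print(f"S5GC1HF tasks per BRPS CPU task: {tasks_per_brp:.1f}")
print(f"S5GC1HF advantage: {gc1hf_rate / brp_cpu_rate - 1:.0%}")
```

With midpoints instead of the rounded "about 3 files" the advantage comes out slightly above the 50% quoted, but the conclusion is the same.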

Kind regards
Martin

Jord
Joined: 26 Jan 05
Posts: 2,949
Credit: 5,542,375
RAC: 7,450

See http://einsteinathome.org/node/195542&nowrap=true#109024 and Mike's answers after that, the main one being "Not everyone has a suitable ( NVidia based, sufficient free memory ..... ) GPU, but many still want to be in on the possible discovery and 'fame' of a new pulsar detected using their computer! :-)"

PS: When you edit the project preferences you do have a choice of running "Binary Radio Pulsar Search" or not. Just uncheck if you do not want to receive them.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,515
Credit: 446,321,776
RAC: 52,565

Hi!

Jord, I think the point of the OP is like this: It's ok to have BRP3 tasks on CPU (for the reasons you mentioned), but why should a PC that can (and does) run BRP3 CUDA tasks also receive BRP3 tasks for the CPU? Wouldn't it make more sense to leave those tasks to those who don't have a CUDA card, and instead send GC1HF tasks to the GPU crunchers?

I think this is a valid point and have forwarded the idea to the devs. But note that this would involve quite a bit of customized logic for the scheduler, and the scheduler needs to be very, very fast in order to avoid congestion when handling requests of tens of thousands of active volunteers.

If you feel very strongly about avoiding BRP3 CPU tasks, you might want to look into the possibility of using an app_info.xml file like the one discussed in this forum in the thread about running more than one CUDA task at the same time. The trick here is to not include the CPU BRP3 app, so it will not be used.

CU
HB

Jord
Joined: 26 Jan 05
Posts: 2,949
Credit: 5,542,375
RAC: 7,450

Quote:
Jord, I think the point of the OP is like this: It's ok to have BRP3 tasks on CPU (for the reasons you mentioned), but why should a PC that can (and does) run BRP3 CUDA tasks also receive BRP3 tasks for the CPU? Wouldn't it make more sense to leave those tasks to those who don't have a CUDA card, and instead send GC1HF tasks to the GPU crunchers?


Although it may be a valid point, there's the "what if" scenario behind it. What if the CUDA card starts throwing out garbage? At least while it is throwing out garbage, the CPUs would still be able to run the BRPs they were assigned, until the user manages a reboot, or replaces the drivers, or replaces the card. Depending on what causes the garbage...

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,515
Credit: 446,321,776
RAC: 52,565

True. It would be difficult to include all that in the scheduler code. Maybe the best approach would be to let users opt out of specific apps, taking the plan class into account (is that possible?). People could then assign their CUDA hosts to a certain venue and effectively instruct BOINC not to send BRP3 CPU tasks to those hosts, if they wish.

CU
HB

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,075
Credit: 116,549,197
RAC: 47,788

Well, this is what Bernd was alluding to recently. The issue of mix isn't so much one of applications as of user involvement: the 80/20 rule of features, or however you want to frame it. I do especially like Mr Anderson's idea of a higher-level interface from which one can depart into customisation. The 'what if' problem is trying to understand what users are 'really' asking for when a certain feature or ability is mentioned, i.e. what is materially different about their machine context or setup? The trend, I think, is for contributors either to want to be more involved in the detail or, if not, to expect optimistic but trouble-free use of their machines.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

astro-marwil
Joined: 28 May 05
Posts: 434
Credit: 150,856,133
RAC: 35,978

In the BOINC preferences for E@H there are 4 check boxes for selecting different projects. The check box for Global Correlations S5 search #1 is obsolete now. So why not use it to distinguish between BRPS-cuda and non-cuda files? As Bikeman pointed out, these check boxes control all computers belonging to a participant. Would it perhaps be possible to move these check boxes into the BOINC Manager dialog Extras/Adjustments? Then a simple and quick bit comparison could decide, for each computer, which files are wanted or needed. But of course, this would be a more major change to the software.

There is another aspect of these non-cuda files, but I need to give some background first.
1) With my laptop I don't have access to the internet at all times, and occasionally the servers are down. So I like to always have at least 3 days' worth of files in stock.
2) The system currently doesn't tell BRPS-cuda and non-cuda files apart. They are added together, even though they run on different processors. I concluded this from some observations.
3) The system is cautious: at the beginning it predicts much longer crunching times than are actually realized, and it adapts to the real values very, very slowly. But it jumps very quickly to much higher values for all files as soon as a long-running file such as a non-cuda one comes up.
For example: my non-cuda files are predicted to take 48 h, but they finish within 35 to 38 h. The cuda files meanwhile finish within about 5 h. So 1 downloaded non-cuda file prevents the download of about 9 cuda files, or a bit less than 2 days of work. In fact I only have 3 to 4 cuda files in stock, a reserve of only about 0.8 days of crunching. (In reality it's a bit more complicated, but it comes to the same thing.) I feel this is too low.
So from this point of view too, it seems desirable to avoid downloading unnecessary non-cuda files.
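
The cache arithmetic in the example can be written out as a short sketch. It assumes, as in point 2 above, that CPU and cuda estimates are counted against one shared work buffer; the names are made up for illustration.

```python
def displaced_tasks(long_estimate_h, short_runtime_h):
    """Number of short tasks that fit in the time claimed by one long task."""
    return long_estimate_h / short_runtime_h


cpu_estimate_h = 48    # predicted run time of one non-cuda task
cuda_runtime_h = 5     # observed run time of one cuda task (upper end)

# cuda tasks that one non-cuda task pushes out of the buffer.
lost = displaced_tasks(cpu_estimate_h, cuda_runtime_h)

# Reserve actually left when 4 cuda tasks are in stock, in days.
reserve_days = 4 * cuda_runtime_h / 24

print(f"cuda tasks displaced by one non-cuda task: {lost:.1f}")
print(f"reserve with 4 cuda tasks in stock: {reserve_days:.2f} days")
```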

Kind regards
Martin

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1,933
Credit: 171,904,067
RAC: 384,176

Martin has actually put his finger on a problem that I was meaning to investigate and report on. His thread is as good a place as any.

The predicted running time of a task is controlled by many things. From the server, the main ones are the <flops> (Floating Point Operations per Second) assessed by the server, and sent out to our computers in the <app_version> section describing each Einstein application, and the <rsc_fpops_est> which describes the size of each workunit we receive.

On our computers, these two figures are combined with the DCF (Duration Correction Factor) to generate an estimate in hh:mm:ss.
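
As a rough sketch of how those figures combine, assuming the simplified rule "estimated operations divided by speed, times DCF": the parameter names follow the BOINC fields rsc_fpops_est and flops as I understand them, and the workunit size used below is a made-up value, not one from the project.

```python
def estimated_runtime_seconds(rsc_fpops_est, flops, dcf=1.0):
    """Simplified estimate: workunit size / application speed, scaled by the DCF."""
    return rsc_fpops_est / flops * dcf


def format_hms(seconds):
    """Format a duration in seconds as hh:mm:ss."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical workunit of 1e14 floating point operations on an
# application rated at 2.36e9 operations per second, with DCF = 1.
estimate = estimated_runtime_seconds(1.0e14, 2.36e9, dcf=1.0)
print(format_hms(estimate))
```

Doubling the speed figure halves the estimate, and the DCF scales every estimate of the project by the same factor.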

Looking at one of my CUDA-equipped Q6600s (running at stock 2.4GHz), I see:

Host benchmarks:
  floating point       2353795748.577250
  integer              4820108604.217452

GPU:
  name                 GeForce 9800 GT
  clock rate (kHz)     1500000
  multiprocessors      14
  description          GeForce 9800 GT (driver version 26089, CUDA version 3020, compute capability 1.1, 512MB, 336 GFLOPS peak)


Application           Version   flops                Plan class
einstein_S5R6         301       2360142320.733454    -
einsteinbinary_ABP1   312       2360142320.733454    -
einstein_S5R6         301       3777255014.891032    S5R6sse2
einsteinbinary_ABP2   308       2362101809.315894    -
einstein_S5GC1        302       3779362894.905431    S5GCESSE2
einstein_S5GC1HF      306       3777708006.279435    S5GCESSE2
einsteinbinary_BRP3   105       4722135007.849294    BRP3cuda32
einsteinbinary_BRP3   104       2361067503.924647    -

So, it looks as if Einstein is assuming:

The speed of my Q6600 running code without SIMD optimisation is pretty much the sticker value.
Running SIMD SSE2 optimisation adds 60% to the sticker speed.
My CUDA card is twice the speed of my CPU.

The first two assumptions are so good that I hadn't noticed them before, but the third assumption (that my CUDA card is twice the speed of my CPU) is wildly out. It understates the real speed of my card - running BRP3 v1.05, at least - by a factor of nearly seven. A task estimated at 12 hours finishes in 1.8 hours.
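
The ratios behind these three statements can be recomputed from the figures posted above. A small sketch, assuming each speed value belongs to the application name listed next to it in the dump; the variable names are made up.

```python
# Speed figures as posted above (operations per second).
cpu_plain = 2361067503.924647   # einsteinbinary_BRP3 v1.04, CPU
cpu_sse2 = 3777708006.279435    # einstein_S5GC1HF, S5GCESSE2
gpu_cuda = 4722135007.849294    # einsteinbinary_BRP3 v1.05, BRP3cuda32

sse2_gain = cpu_sse2 / cpu_plain     # server's assumed SSE2 speed-up
cuda_factor = gpu_cuda / cpu_plain   # server's assumed CUDA speed-up

# Estimated 12 h versus 1.8 h actually observed.
underestimate = 12 / 1.8

# Speed-up the server would have to assume for the estimate to match.
implied_cuda_factor = cuda_factor * underestimate

print(f"SSE2 speed-up assumed by the server: {sse2_gain:.2f}")
print(f"CUDA speed-up assumed by the server: {cuda_factor:.2f}")
print(f"estimate is too long by a factor of: {underestimate:.2f}")
print(f"CUDA speed-up implied by the run time: {implied_cuda_factor:.1f}")
```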

So, how is Einstein estimating the speed of CUDA cards? Is the 2 x CPU speed standard across the board, or is it derived in some way from the card specifications? If the latter, what's the calculation - and is it as bad for everyone else as it is for me?

This is a problem because the BOINC client only has one DCF value available, and it has to be applied to both CPU and CUDA tasks. New BOINC server code - newer than that currently in operation at Einstein - can dynamically track the effective speed of each application/host/resource on the server, and apply the necessary corrections at the server to each task as it is issued. But this new code (which I thoroughly welcome) is intimately bound in with the "new credit" scoring system (which I don't, not yet at least). This may be, at least partly, why Einstein hasn't adopted the new code yet.
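
To illustrate why a single DCF shared by CPU and CUDA tasks misbehaves, here is a toy model. The update rule (jump up at once, drift down slowly) is an assumption chosen to match the behaviour described earlier in this thread; it is not the BOINC client's actual code, and the 36 h CPU task is a made-up example.

```python
def update_dcf(dcf, raw_estimate_h, actual_h):
    """Toy DCF update: jump up immediately, drift down slowly.

    This rule is an assumption made for illustration, not the BOINC client's code.
    """
    ratio = actual_h / raw_estimate_h
    if ratio > dcf:
        return ratio                  # task ran longer than predicted: jump up
    return 0.9 * dcf + 0.1 * ratio    # task ran shorter: move down by 10% of the gap


def simulate(tasks, dcf=1.0):
    """Apply update_dcf to a sequence of (raw_estimate_h, actual_h) pairs."""
    history = []
    for raw_estimate_h, actual_h in tasks:
        dcf = update_dcf(dcf, raw_estimate_h, actual_h)
        history.append(dcf)
    return history


# Ten CUDA tasks (raw estimate 12 h, actual 1.8 h) followed by one CPU task
# whose raw estimate is assumed to be accurate (36 h estimated, 36 h actual).
tasks = [(12.0, 1.8)] * 10 + [(36.0, 36.0)]
history = simulate(tasks)

for i, dcf in enumerate(history, start=1):
    # With one shared DCF, the displayed CUDA estimate is 12 h times the DCF.
    print(f"after task {i:2d}: DCF = {dcf:.3f}, CUDA estimate = {12.0 * dcf:.1f} h")
```

In this model the CUDA estimates creep towards the real 1.8 h over many tasks, and a single CPU task resets them to the original overestimate.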

So, what can be done to help Martin achieve equally-accurate estimates for both BRP3/cpu and BRP3/cuda tasks, at least while there is no fine control over the work chosen and accepted for crunching?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,906
Credit: 191,160,812
RAC: 50,365

Quote:
So, how is Einstein estimating the speed of CUDA cards? Is the 2 x CPU speed standard across the board, or is it derived in some way from the card specifications?

Thanks for reminding me of this. The 2.0 factor was the result of measurements with the 1.04 CUDA App and incorporates the slowdown (wrt 1.05) arising from the priority problem. With 1.05 we should definitely raise this estimation again.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,906
Credit: 191,160,812
RAC: 50,365

Quote:
In the BOINC preferences for E@H there are 4 check boxes for selecting different projects. The check box for Global Correlations S5 search #1 is obsolete now. So why not use it to distinguish between BRPS-cuda and non-cuda files?

Changing the names or number of checkboxes is easy enough. The problem with using the existing "application opt-out" mechanism would be that BRP3 CUDA would need to become an application separate from BRP3 CPU, which would mean that BRP3 CUDA results would only be validated against other BRP3 CUDA results. Considering the numerical reliability of CUDA this is not what we want.

But I think this is a pretty valid request and we're working on finding a way to implement the requested feature.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,906
Credit: 191,160,812
RAC: 50,365

See here.

BM
