EM searches, BRP Radiopulsar and FGRP Gamma-Ray Pulsar

Mr P Hucker
Joined: 12 Aug 06
Posts: 838
Credit: 519351379
RAC: 14913

"In addition, about 350 high-performance, specialized graphics cards (GPUs) have been added in parallel with about 2,000 existing cards for specialized applications. These additions increase Atlas' theoretical peak computing performance to more than 2 PFLOP/s."

That doesn't sound much.  I have a £150 GPU that does a theoretical 8 Tflops.  They only have 250 times the power of one of my GPUs, yet they say they have 2350 GPUs.
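The comparison above can be checked with a quick calculation. This is only a rough sketch: it takes the quoted figures at face value, and the quote does not state which floating-point precision either figure refers to, so the two may not be like-for-like. The variable names are illustrative.

```python
# Figures taken from the post above; all are theoretical peaks.
ATLAS_PEAK_FLOPS = 2e15        # "more than 2 PFLOP/s"
HOME_GPU_FLOPS = 8e12          # "a theoretical 8 Tflops"
ATLAS_GPU_COUNT = 350 + 2000   # new cards plus existing cards

# How many home GPUs would match the quoted Atlas peak.
ratio = ATLAS_PEAK_FLOPS / HOME_GPU_FLOPS

# Average per card if the whole quoted peak were attributed to the GPUs
# (an assumption; the CPUs also contribute to the total).
per_gpu_tflops = ATLAS_PEAK_FLOPS / ATLAS_GPU_COUNT / 1e12

print(f"Atlas peak / one home GPU: {ratio:.0f}x")
print(f"Implied average per Atlas GPU: {per_gpu_tflops:.2f} TFLOPS")
```

On those numbers the implied average is under 1 TFLOPS per card, which is why the quoted total looks low next to a consumer card's headline figure.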

The UPS beats mine though, I only have 1.5kW.  But with deep cycle leisure batteries it can last for an eternity.  None of the sealed lead acid crap that comes with it.  My neighbour once asked me why all my lights were on during a powercut :-)

Pah!  "Each cable is rated for 10 Gb/s."  I have a 40Gb/s cable between my house and garage.   I can't find switches and network cards that go that fast though :-(

They seem to have neater wiring than Summit though:

 

If this page takes an hour to load, reduce posts per page to 20 in your settings, then the tinpot 486 Einstein uses can handle it.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6590
Credit: 319280089
RAC: 420305

Yup, the technology curve for GPUs especially is such that by the time it's installed it is well out of date.

Our server status page puts E@H at 13179.3 TFLOPS ~ 13 PFLOPS ( estimated from collective RAC ).

Cheers, Mike. 

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 407818694
RAC: 358906

Peter Hucker of the Scottish Boinc Team wrote:

"In addition, about 350 high-performance, specialized graphics cards (GPUs) have been added in parallel with about 2,000 existing cards for specialized applications. These additions increase Atlas' theoretical peak computing performance to more than 2 PFLOP/s."

That doesn't sound much.  I have a £150 GPU that does a theoretical 8 Tflops.  They only have 250 times the power of one of my GPUs, yet they say they have 2350 GPUs.

The UPS beats mine though, I only have 1.5kW.  But with deep cycle leisure batteries it can last for an eternity.  None of the sealed lead acid crap that comes with it.  My neighbour once asked me why all my lights were on during a powercut :-)

Pah!  "Each cable is rated for 10 Gb/s."  I have a 40Gb/s cable between my house and garage.   I can't find switches and network cards that go that fast though :-(

They seem to have neater wiring than Summit though:

 

All our GPU compute power is not enough?

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3963
Credit: 47164452642
RAC: 65461870

Mike Hewson wrote:

( estimated from collective RAC ).

And this is the problem. This is not a valid way to measure FLOPS at all. The increase to ~13 PFLOPS is largely from systems shifting away from O3AS gravitational wave tasks (which awarded much less credit) when GW work ran out, over to the only available GPU work, FGRPB1G, which awards ~10x more credit per unit time. So it "looks" like FLOPS increased just because people started earning much more credit with the same devices.
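To illustrate why a credit-derived figure moves when only the credit rate changes, here is a minimal sketch. It assumes the standard BOINC credit definition (200 credits per day for a host sustaining 1 GFLOPS) is what converts RAC into FLOPS; whether the server status page uses exactly this conversion is an assumption, and the host RAC values are made up for illustration.

```python
# BOINC cobblestone definition: 200 credits/day for a 1 GFLOPS host.
# Assumed here to be the conversion behind the status-page figure.
CREDITS_PER_GFLOPS_DAY = 200.0

def flops_from_rac(rac: float) -> float:
    """Estimate the sustained FLOPS implied by a recent average credit (credits/day)."""
    return rac / CREDITS_PER_GFLOPS_DAY * 1e9

# Hypothetical host: same hardware, same hours, two different applications.
rac_low_credit_app = 100_000.0                  # illustrative value, not from the thread
rac_high_credit_app = rac_low_credit_app * 10   # "~10x more credit per unit time"

before = flops_from_rac(rac_low_credit_app)
after = flops_from_rac(rac_high_credit_app)
print(f"apparent speed-up: {after / before:.0f}x with identical hardware")
```

The hardware does the same amount of arithmetic in both cases; only the credit awarded per unit time differs, so the "FLOPS" number changes with it.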


Ian&Steve C.
Joined: 19 Jan 20
Posts: 3963
Credit: 47164452642
RAC: 65461870

Filipe wrote:

All our GPU compute power is not enough?

Nope. According to the server status page, there are ~13,000 hosts with either an Nvidia or AMD GPU (last 7 days). The vast majority of those are probably slower, low-end devices, and some percentage won't even be crunching, or will be crunching other projects. There are ~1.6 million WUs that still need to be processed, which equates to ~3.2 million tasks that need to be completed, not even accounting for errors and invalids resulting in resends.
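The WU-to-task arithmetic above can be sketched as follows. The factor of two is inferred from the figures in the post (two tasks per workunit); the function name and the resend fraction are hypothetical and only show how errors and invalids would add to the total.

```python
import math

def tasks_needed(workunits: int, quorum: int = 2, resend_fraction: float = 0.0) -> int:
    """Tasks required to complete `workunits`, each needing `quorum` results.

    resend_fraction is a hypothetical share of tasks that error out or are
    invalid and must be sent again (each resend is assumed to succeed).
    """
    base = workunits * quorum
    return math.ceil(base * (1.0 + resend_fraction))

print(tasks_needed(1_600_000))                        # 3200000
print(tasks_needed(1_600_000, resend_fraction=0.05))  # 5% resend rate is illustrative only
```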


Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6590
Credit: 319280089
RAC: 420305

In a real sense we can always exceed the computing power of E@H. Or Atlas for that matter, or <*insert your favourite supercomputer here*>. The multidimensional parameter spaces for these searches can be explored in different ways to look for new signals & regularities. Nowadays there is such a mass of information available from the various detection devices that there is always a wealth of data. What is discarded as noise for one search template may constitute a detection for another. That's because all manner of radiation traverses the universe and our local space. Suppose for a given investigation the search sensitivity goes like the square root of the signal integration time; then to double the sensitivity you need to quadruple the time. This is quite typical: you can always 'listen' for longer to access the 'quieter' sources. There will always be something to do here at E@H. ;-)
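The square-root scaling mentioned above can be made concrete with a small sketch. It assumes sensitivity is exactly proportional to the square root of the integration time, which is the supposition in the post rather than a property of any particular E@H search; the function name is illustrative.

```python
def time_factor_for_sensitivity_gain(gain: float) -> float:
    """If sensitivity scales as sqrt(T), return the factor by which the
    integration time T must grow to improve sensitivity by `gain`."""
    return gain ** 2

for gain in (2, 3, 10):
    factor = time_factor_for_sensitivity_gain(gain)
    print(f"{gain}x sensitivity -> {factor:.0f}x integration time")
```

Under that assumption the cost grows quadratically, so each further gain in sensitivity needs disproportionately more listening time and computing.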

Cheers, Mike.


Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117758188745
RAC: 34819309

Ian&Steve C. wrote:
The days remaining estimate is derived from the 5-day average WU completed per day metric. That 5-day average has been dropping since it looks like the “Atlas Condor Jobs” super computer ...

I haven't looked at the stats pages for quite a while so the mention of "Atlas Condor Jobs" surprised me since (in years past) the only Atlas entry was for "Atlas AEI Hannover".

So I took a look at the top participants list and can see the Condor entry but none for AEI Hannover.  The current value of total credit for Condor is way too low for everything that Atlas had accumulated so it can't just be a rename.  From memory, Atlas AEI Hannover had a total credit of many tens of billions.

Realising that the table is constructed in RAC order rather than Total Credit, I clicked on the Total heading to get a reordering.  It actually starts from the bottom up so I had to click twice :-).  That caused Atlas AEI Hannover to appear as #2, so it's still in the list but no longer usually visible because its RAC is less than 7M these days.

That was quite a blast from the past as several other 'high producers' appeared as well.  In particular I remember Gavin who had some high producing machines and was active in the Forums some years ago.  His current RAC is only a shadow of what it was but he must still be around since it's still significant.

It's quite a reminder that there are people who have contributed a lot in the past but whose efforts are no longer normally seen in the default view.  Perhaps the default view should be based on total credit to direct attention to past significant contributions.

Condor doesn't get a look-in (yet) if the ordering is based on Total :-).

Cheers,
Gary.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6590
Credit: 319280089
RAC: 420305

Gary Roberts wrote:

From memory, Atlas AEI Hannover had a total credit of many tens of billions.

Now that you mention that: I wonder if this total can possibly all be simply due to 'burn-ins' of new nodes? There aren't that many nodes. If correct, this suggests to me that, in addition to its role in pre- and post-processing E@H work, maybe it is scheduled to do some of the actual work units when it has nothing else/better to do (using nodes of any age). I'm pretty sure that once built these supercomputers are kept busy close to 24/7 @ 100%.

Just a thought.

{ Of course, I know who #1 is. But there is a computer at the bottom of the total credit list owned by 'ballen'. I guess that 2005 era computer, probably the very first enrolled, could not now hack the pace. ;-) }

Cheers, Mike.


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250646917
RAC: 34343

Atlas runs BOINC in two different ways. There is a single, low (CPU) priority BOINC client running on every node of Atlas, announcing as many CPU cores as the node has. These clients get and run only CPU jobs. The associated account on E@H is "Atlas AEI Hannover", and it has been there basically since 2008.

To make use of the growing number of GPUs on Atlas we recently (~May 2022) developed another scheme of submitting E@H (GPU) tasks as low-priority Condor jobs (minimal priority so as not to interfere with 'real' people using the GPUs on Atlas). The associated E@H account is "Atlas Condor Jobs". There is basically one host(id) for every GPU on Atlas (~2100). The main reason for setting this up was to help finish the "O3AS1" GW analysis. As this has ended now, the "automatic submission" has been turned off, and the RAC of this account should drop noticeably again.

BM

Conan
Joined: 19 Jun 05
Posts: 172
Credit: 8338672
RAC: 9310

Will any of the new work types be for CPUs as well (or will old GPU types be converted to use the CPU)? Can BRP7 be done on a CPU, and if not, why not? Time to run will, I suppose, be a limiting factor; memory won't be.

The current Arecibo large work units can take up to 13 hours if a number are run together but that is not a problem (getting less credit than gamma ray #5 is).

I would just like to run some new CPU work.

 

Conan
