Can Einstein@Home pass the 1 Petaflop (1000 Teraflop) barrier?

The computing power of Einstein@Home has exceeded 950 Teraflops for the first time since the project began in 2005. Based on the rate at which our computing power has been growing, I am hopeful that Einstein@Home will pass the 1 Petaflop barrier before the end of 2012. Einstein@Home volunteers: please keep your computers running over the holiday season, and please sign up any new ones that you might receive as a gift!

Bruce Allen
Director, Einstein@Home

Comments

pvismara@inwind.it
Joined: 6 Jan 10
Posts: 1
Credit: 1161094
RAC: 0

I hope that my computer, running 24 hours a day, can help you.
Kind Regards
Paolo

AI4FR
Joined: 30 Sep 11
Posts: 2
Credit: 646415
RAC: 0

It's at 1023.1. Congrats to all.

Mad_Max
Joined: 2 Jan 10
Posts: 154
Credit: 2213461391
RAC: 389131

Congratulations to all!

If I have not forgotten anything, Einstein@Home will be the second (after Folding@Home) scientific* distributed computing project in history to pass the 1 petaflop mark in average (sustained) computing speed! If it can keep it after the FGRP2 credit correction, of course.
Many projects were very close at some point in time (SETI, GPUGrid, POEM, Milkyway) but failed to sustain the peak speed they reached.

* I do not count projects that crack ciphers/factor numbers as scientific DC.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957779637
RAC: 715316

At least for the time being, we seem to have reached a plateau at 1026.9

SETI is running this weekend after all (the electrical work was postponed), and the new FGRP2s have been corrected to 70 credits each, so I imagine we'll be here for a little while.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 507068263
RAC: 78843

FGRP2 TFLOPS are decreasing now (190) and overall TFLOPS are still increasing (1034), so I would say it is a true, honest record.

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Congratulations on passing 1 PFLOPS!

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250509603
RAC: 34419

Quote:
At least for the time being, we seem to have reached a plateau at 1026.9

FTR: 1030.4 TFLOPS for almost 2h now.

BM

Dan
Joined: 28 Sep 12
Posts: 2
Credit: 331016
RAC: 0

I signed up for this project on the 24th and haven't stopped working on it. Yay me, I helped do something!

geonerd
Joined: 8 Nov 04
Posts: 10
Credit: 370024
RAC: 0

Oh, no! ;)
Throughput has dropped to 0.9959 PF, presumably(?) as a result of the recent FGRP2 credit reduction. How large is the averaging window? Any estimates of what the new equilibrium will be, and how long it will take for E@H to re-pass 1.0 PF?

ggesmundo
Joined: 3 Jun 12
Posts: 31
Credit: 18699116
RAC: 0

I suspect the drop is more due to the BRP4 validators being offline until tomorrow.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957779637
RAC: 715316

Quote:
I suspect the drop is more due to the BRP4 validators being offline until tomorrow.


Indeed so. We're back above 1000 TFLOPS (1000.8, to be exact), and there are still 31,448 BRP4 tasks in the validation queue.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117620452949
RAC: 35233793

Quote:
Quote:
I suspect the drop is more due to the BRP4 validators being offline until tomorrow.

Indeed so. We're back above 1000 TFLOPS (1000.8, to be exact), and there are still 31,448 BRP4 tasks in the validation queue.


There seems to be something quite peculiar going on. We are now back to 1022 and the BRP4 backlog is 9082, so a *lot* of BRP4 tasks have now been cleared. That's fine, you say - the spurt is coming from the burst of BRP4 validations. If that is true, then why has the BRP4 individual TFLOPS figure continued to go down? It's now 511 and I'm sure it was about 550-560 (or even more) a day ago.

I'm only going on memory - I didn't copy anything down - but I've been watching the decline in the FGRP2 value as a result of the credit adjustment and have been expecting the total figure to really drop under the combined effect of both pulsar searches having (for the moment) significantly lower numbers.

Someone can correct me if my memory is faulty, but just after new year when 1 PF was reached, we had the following very approximate figures.

  * BRP4 -- 600+ (possibly around 630)
  * FGRP2 - 200+ (possibly around 210)
  * S6LV1 - 200- (possibly around 160)

Today the figures are (Total of 1022)

  * BRP4 -- 511
  * FGRP2 - 158
  * S6LV1 - 353

So, for me, the conundrum is: why has the S6LV1 score approximately doubled in the last week? It's rising so fast that it, alone, has caused the TFLOPS to rebound from the 960s to 1022 in the last day or so - or so it seems?

Maybe someone has been recording the actual figures and can correct me.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250509603
RAC: 34419

Quote:
If that is true, then why has the BRP4 individual TFLOPS continued to go down.

The "individual FLOPS" on the server status page are highly misleading, especially for applications that feature GPU versions. Shown there are really CPU flops based on CPU time of reported results and finally scaled such that the sum equals the total project FLOPS.

So if the individual FLOPS of one application is declining, it could (and in this case probably does) just mean that we get more tasks (i.e. credit) from the GPUs of that search.
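
A minimal sketch of that scaling in Python; every number here (the CPU seconds, the CPU-speed constant, the project total) is invented for illustration, and this is not the actual server code:

    # Toy model of the per-app FLOPS on the server status page: figures are
    # derived from CPU time only, then rescaled so they sum to the project
    # total (which includes GPU work). All numbers are invented.
    cpu_seconds = {"BRP4": 4.0e9, "FGRP2": 1.5e9, "S6LV1": 3.0e9}
    flops_per_cpu_second = 2.0e9      # assumed average CPU speed
    total_project_flops = 1.022e15    # project total derived from credit

    raw = {app: t * flops_per_cpu_second for app, t in cpu_seconds.items()}
    scale = total_project_flops / sum(raw.values())

    # A GPU-heavy app like BRP4 earns much credit but logs little CPU time,
    # so the per-app figure shown for it understates its real throughput.
    for app, value in raw.items():
        print(f"{app}: {value * scale / 1e12:.0f} TFLOPS (shown)")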

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250509603
RAC: 34419

Quote:
This is what I'm planning to do: Link the application with zlib, so it can easily handle both gzipped and plain data, then add gzip as the last step of pre-processing. Shouldn't require much of a change.

This feature is currently being tested on the BRP4 applications over on Albert@Home (the current OpenCL app versions aren't working for different reasons).

If you're interested in that feature, you may want to help us test it.
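
For anyone curious what "handle both gzipped and plain data" can look like in practice, here is a minimal sketch in Python (the real apps link against zlib in C, whose gzopen() handles both cases transparently); the file names and contents are made up:

    import gzip

    GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

    def read_maybe_gzipped(path):
        """Return file contents, decompressing transparently if gzipped."""
        with open(path, "rb") as f:
            magic = f.read(2)
        opener = gzip.open if magic == GZIP_MAGIC else open
        with opener(path, "rb") as f:
            return f.read()

    # Demonstrate with two temporary files, one plain and one gzipped.
    with open("plain.dat", "wb") as f:
        f.write(b"pre-processed time series")
    with gzip.open("packed.dat.gz", "wb") as f:
        f.write(b"pre-processed time series")

    assert read_maybe_gzipped("plain.dat") == read_maybe_gzipped("packed.dat.gz")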

BM

Mad_Max
Joined: 2 Jan 10
Posts: 154
Credit: 2213461391
RAC: 389131

To Gary Roberts:
The CPU TFLOPS figures for individual sub-projects are not accurate because BOINC still does not handle the speed of GPU apps correctly.
If you want to estimate the distribution between sub-projects, it is better to look at credit. At the moment:
BRP4 - 3,831,740 ~ 71%
S6LV1 - 1,049,780 ~ 19%
FGRP2 - 543,301 ~ 10%

Almost all of the BRP4 speed now comes from GPU calculations, and GPUs cannot crunch the other sub-projects, so that seems to be the main cause of such an uneven distribution...
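
As a quick check, the arithmetic behind those percentages (credit figures as quoted above):

    # Share of recent credit per sub-project, using the figures quoted above
    credit = {"BRP4": 3_831_740, "S6LV1": 1_049_780, "FGRP2": 543_301}
    total = sum(credit.values())
    for app, c in credit.items():
        print(f"{app}: {100 * c / total:.0f}%")   # ~71%, ~19%, ~10%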

P.S.
BTW, it seems the S6LV1 and S6BucketLVE statistics are mixed up? I think S6LV1 has already finished and been replaced by S6BucketLVE?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117620452949
RAC: 35233793

Quote:
BTW, it seems the S6LV1 and S6BucketLVE statistics are mixed up? I think S6LV1 has already finished and been replaced by S6BucketLVE?


S6BucketLVE has not started yet - take a look in the 'Workunit and Tasks' block - all the values are zero. In contrast, S6LV1 suddenly has over 200K tasks 'ready to send' and that is the clue as to what is going on.

A project is listed as '100%' when there are no more tasks to generate and not when all tasks have actually been completed. It is common practice at E@H to generate and dump into the database all remaining tasks for GW runs when the run is getting quite close to completion. That has now happened. It will actually take quite a few days yet for the 200K remaining 'primary' tasks to be distributed and then it will take many weeks to months for all the 'secondary' and 'tertiary', etc, tasks (required because of failure of primary tasks) to be sent out and returned. It will be quite a long time before S6LV1 is 'finished'.

The new run will be started shortly and this will tend to divert resources away from 'finishing' the previous run. This can create a problem for volunteers who have stringent monthly bandwidth caps. Some of that bandwidth will need to go to new large data files required for the new run. An increasing amount of bandwidth will be required for blocks of large data files (for the old run) that will need to be sent if a host is allocated a 'resend' task (for a failed primary task) for a frequency bin other than that for which it has the appropriate large data files already on board. When all 200K primary tasks are gone, the LV1 'diet' will be resends only (for however long it takes for the run to 'finish').

If bandwidth is not a problem, it would really be appreciated if hosts are left 'available' to accept resends. If bandwidth is a problem, check out your E@H preferences and you will find there are now separate listings for both GW runs.

Cheers,
Gary.

Neil Newell
Joined: 20 Nov 12
Posts: 176
Credit: 169699457
RAC: 0

Quote:
...it would really be appreciated if hosts are left 'available' to accept resends.

As a relative newbie, does that mean doing something special? (or just not fiddling :) ).

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6588
Credit: 317154831
RAC: 367389

Quote:
Quote:
...it would really be appreciated if hosts are left 'available' to accept resends.

As a relative newbie, does that mean doing something special? (or just not fiddling :) ).


When we are at the end of one run and ramping up a new one, then there is a lot of tidying up to do with respect to 'loose ends' or work units not satisfactorily completed for the retiring run. These work units often come from rather different places in our search parameter space with the effect of sometimes requiring large downloads for users. This is because here at E@H we use 'locality scheduling' which has the marvellous benefit of examining what data a given host already has and, if possible, making good use of that circumstance by allocating work units relevant to that already held data. But if we jump around parameter space then that benefit is lost as new data sets relevant to disparate parameter choices need downloading. Some users prefer not to be involved in that scenario and thus come back to E@H later on, after such work units are dealt with and the new run is well on the go.
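
A toy sketch of the core idea of locality scheduling as described above; the data structures and file names are invented for illustration:

    # Toy locality scheduler: prefer the work unit whose data files the host
    # already holds, so that large downloads are avoided. Invented names.
    def pick_task(host_files, pending_tasks):
        """Choose the task needing the fewest new data-file downloads.
        pending_tasks: list of (task_id, set_of_required_files) pairs."""
        return min(pending_tasks, key=lambda task: len(task[1] - host_files))

    host_files = {"h1_0400.05", "l1_0400.05"}
    pending = [
        ("wu_0712.30_03", {"h1_0712.30", "l1_0712.30"}),  # a stray resend
        ("wu_0400.05_12", {"h1_0400.05", "l1_0400.05"}),  # data already held
    ]
    task_id, required = pick_task(host_files, pending)
    print(task_id, "needs to download:", required - host_files or "nothing")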

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117620452949
RAC: 35233793

Quote:
Quote:
...it would really be appreciated if hosts are left 'available' to accept resends.

As a relative newbie, does that mean doing something special? (or just not fiddling :) ).


It means, "Don't disable S6LV1 in your prefs just because you think it's finished" ;-).

The nice thing about 'appropriate' defaults for new preferences is that most people will be unaware of the change, so most will end up doing the 'right' thing :-). Since I may have alerted people to a new pref setting they could play with (this is a popular thread), I thought I'd better put in a 'plug' for the important job of cleaning up the resends - by not taking advantage of the new pref and bailing out.

Cheers,
Gary.

Mad_Max
Joined: 2 Jan 10
Posts: 154
Credit: 2213461391
RAC: 389131

To Gary Roberts:
OK, thanks for the explanation.
I just thought the "S6LV1 search progress" block of stats counted a WU as "already done" only after the WU passed validation (i.e. at least 2 matching results in the quorum).
And it was strange to see 100% completed on one side, and >200k tasks ready to send on the other.
If it marks a task as "done" once it passes from the WU generator to the DB, that would explain the "strange" statistics.

But then how do you explain this message popping up (all through the last day) in the logs of my BOINC client?
................
14/01/2013 01:32:31 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search (extended)
14/01/2013 01:32:31 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search
...............

The server has offered me only BRP4 and FGRP2 WUs over the last few days. When I try to opt out of them in prefs, the server responds with:
14/01/2013 03:42:16 | Einstein@Home | Requesting new tasks for CPU
14/01/2013 03:42:19 | Einstein@Home | Scheduler request completed: got 0 new tasks
14/01/2013 03:42:19 | Einstein@Home | No work sent
14/01/2013 03:42:19 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search (extended)
14/01/2013 03:42:19 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search
14/01/2013 03:42:19 | Einstein@Home | see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/6180/6180386
14/01/2013 03:42:19 | Einstein@Home | No work available for the applications you have selected. Please check your preferences on the web site.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117620452949
RAC: 35233793

Quote:
If it marks a task as "done" once it passes from the WU generator to the DB, that would explain the "strange" statistics.


This would be a bit clearer if the 'search progress' block had different headings which said, "Total Needed", "Already Generated" and "Still to Generate" or something like that. It still leaves the (pretty much unsolvable) problem of how best to indicate that there will be a large (but unknown) number of resends to be sent out later. You may not know a resend is needed until a 14 day deadline passes and this cycle can be repeated quite a few times. If people have all moved on to new and greener pastures, it will slow down the cleanup. In the end, to shorten the agony, the Devs may well send out extra copies of resends just to make sure at least one is returned in a timely fashion. Another possible choice is to do the final cleanup 'in-house'.

Quote:
But then how do you explain this message popping up (all through the last day) in the logs of my BOINC client?
................
14/01/2013 01:32:31 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search (extended)
14/01/2013 01:32:31 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search
...............


Well, the first one is a 'no-brainer' - there genuinely was no work to send. The second one needs a bit more explanation.

I've seen this quite regularly on certain hosts over the last year or three :-). It doesn't happen on all my hosts but I use all available 'venues' and I do have a mixture of server-side prefs on most and local over-ride prefs on some and I haven't sat down systematically to diagnose exactly what is going on. I did mention it in a message a long time ago and Bernd did make an explanation which I don't think I understood at the time and which I've certainly long forgotten now anyway.

I didn't pursue the matter at the time because the client was always able to (eventually) get what it wanted by being a bit persistent with the server. Maybe it would ask a few times for a certain type of work and be refused with a 'no work available' answer but then, after yet another request to the server, the request would suddenly be granted.

I don't know for sure, but I think the explanation goes something like this. There are (I think) server-side settings that control what the scheduler 'prefers' to give you when you request work. To keep it simple, let's assume that for 40% of requests, the scheduler wants to send you FGRP work and for 60% of requests the scheduler wants to send GW work. If your prefs allow any type of CPU tasks or if you have said 'yes' to the pref setting about 'other apps' if work for your preferred app is not available, I think you will always get what the scheduler has chosen to send. If you have excluded one of the apps and said 'no' to 'other apps' I think you may very well get the 'no work available' message, even though there actually is work available. It's possible you may need to go through this cycle a few times until the scheduler's choice of what it prefers to send actually coincides with the search that your prefs allow and so you get the work. I have seen this sort of cycle so many times (with the ultimate happy ending) that I don't even worry about it anymore.

The full explanation must be more complicated than this because it doesn't happen on every host that always asks for just one particular type of work. It seems to happen on hosts that have complicated preference settings, particularly when there are some local over-ride settings in play. I just haven't felt the urgency to attempt to document it better, particularly as the server code is rather old and will eventually be updated anyway.
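
To make the guess concrete, here is a toy simulation of the behaviour described above; the 40/60 split and the retry loop are illustrative assumptions, not actual scheduler logic:

    import random

    random.seed(1)

    def request_work(allowed_apps):
        """The scheduler 'prefers' one app per request; the request fails if
        the host's prefs exclude that app and 'other apps' is switched off,
        even though work for the allowed app exists."""
        preferred = random.choices(["FGRP", "GW"], weights=[40, 60])[0]
        return preferred if preferred in allowed_apps else None

    attempts, task = 0, None
    while task is None:
        attempts += 1
        task = request_work({"GW"})   # this host only accepts GW work

    print(f"got {task} work after {attempts} request(s)")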

Quote:
The server has offered me only BRP4 and FGRP2 WUs over the last few days. When I try to opt out of them in prefs, the server responds with:
14/01/2013 03:42:16 | Einstein@Home | Requesting new tasks for CPU
14/01/2013 03:42:19 | Einstein@Home | Scheduler request completed: got 0 new tasks
14/01/2013 03:42:19 | Einstein@Home | No work sent
14/01/2013 03:42:19 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search (extended)
14/01/2013 03:42:19 | Einstein@Home | No work is available for Gravitational Wave S6 LineVeto search
14/01/2013 03:42:19 | Einstein@Home | see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/6180/6180386
14/01/2013 03:42:19 | Einstein@Home | No work available for the applications you have selected. Please check your preferences on the web site.


Be persistent with extra work requests. The scheduler will eventually give you what you want. At least it does for me.

Cheers,
Gary.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Quote:

...
I did mention it in a message a long time ago and Bernd did make an explanation which I don't think I understood at the time and which I've certainly long forgotten now anyway.

I didn't pursue the matter at the time because the client was always able to (eventually) get what it wanted by being a bit persistent with the server. Maybe it would ask a few times for a certain type of work and be refused with a 'no work available' answer but then, after yet another request to the server, the request would suddenly be granted.

I don't know for sure, but I think the explanation goes something like this. There are (I think) server-side settings that control what the scheduler 'prefers' to give you when you request work. To keep it simple, let's assume that for 40% of requests, the scheduler wants to send you FGRP work and for 60% of requests the scheduler wants to send GW work. If your prefs allow any type of CPU tasks or if you have said 'yes' to the pref setting about 'other apps' if work for your preferred app is not available, I think you will always get what the scheduler has chosen to send. If you have excluded one of the apps and said 'no' to 'other apps' I think you may very well get the 'no work available' message, even though there actually is work available. It's possible you may need to go through this cycle a few times until the scheduler's choice of what it prefers to send actually coincides with the search that your prefs allow and so you get the work. I have seen this sort of cycle so many times (with the ultimate happy ending) that I don't even worry about it anymore.

The full explanation must be more complicated than this because it doesn't happen on every host that always asks for just one particular type of work. It seems to happen on hosts that have complicated preference settings, particularly when there are some local over-ride settings in play. I just haven't felt the urgency to attempt to document it better, particularly as the server code is rather old and will eventually be updated anyway.

*Disclaimer: This could very well be totally wrong or outdated info.

I have a faint memory of an explanation that the server is a bit reluctant to switch you over to a new full set of large data files if that can be avoided; it waits for a host that already has the right files. The time the server is willing to wait is limited, and that is why the request eventually succeeds, probably giving you a larger download. Or maybe some resends have been generated while you wait...

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6588
Credit: 317154831
RAC: 367389

Quote:
* June 21st 2013 at 05:04 UTC

Sigh. That's blown ....

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 406330184
RAC: 357186

Wow. We are down to 999 TFLOPS. What happened?

Tom*
Joined: 9 Oct 11
Posts: 54
Credit: 366729484
RAC: 0

FGRP3 happened

Maximilian Mieth
Joined: 4 Oct 12
Posts: 130
Credit: 10296863
RAC: 4572

We're back to >1000 Teraflops!

AMONRA
Joined: 11 Sep 10
Posts: 4
Credit: 1069352
RAC: 0

That's very positive! The more teraflops donated, the better for science. Let's hope Einstein@Home will pass the 1 exaflop barrier in the near future :) Right now this sounds like a joke, but soon - who knows... http://www.industrytap.com/exaflop-computing-will-save-world-can-afford/15485

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 727730457
RAC: 1228899

Quote:
Wow. We are down to 999 TFLOPS. What happened?

There are also usually seasonal fluctuations, e.g. summer in the northern hemisphere.

As for exaflops: I'm a bit skeptical when (as in the article you linked) people draw trend lines on a log scale into the future, essentially predicting that exponential growth will just go on and on and on... it won't; exponential growth is never sustainable in a real world of limited resources. But at some point there might be "quantum leaps" in computing power, of course... pun intended.

Cheers
HB

Aurel
Joined: 21 Sep 12
Posts: 29
Credit: 2502988
RAC: 0

Floating point speed (from recent average credit of all users): 1666.5 TFLOPS

:)
The biggest supercomputer in Germany has 5.9 petaflops: JUQUEEN at the Institute for Advanced Simulation (IAS), Juelich Supercomputing Centre (JSC).

Can Einstein@Home pass the 7 Petaflops barrier?

Stef
Joined: 8 Mar 05
Posts: 206
Credit: 110568193
RAC: 0

Can it pass 7 PFLOPS? Yes.
Will it? Probably not within the next few years...

But I'm a bit confused about a number on the server status page.
In the "Workunits and tasks" part it says 1519 CPU TFLOPS, but the overall TFLOPS figure in the "Computing" part shows the same value, so where have the GPU TFLOPS gone?

astro-marwil
Joined: 28 May 05
Posts: 532
Credit: 646936543
RAC: 1114868

Hello!
Last night we passed 2 PFLOPS for the first time, just 2 years, 5 months and 9 days after we reached 1 PFLOPS on Jan 3rd, 2013.
Congratulations to all of us.

So this is roughly in line with the rule of doubling crunching power every 2 years.
Hopefully we will have 4 PFLOPS in 2017?
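
For the record, the implied doubling time can be checked quickly; the extrapolation at the end is only a straight-line guess at the same growth rate:

    from datetime import date, timedelta
    from math import log2

    # 1 PFLOPS on Jan 3rd 2013; 2 PFLOPS 2 years, 5 months and 9 days later,
    # i.e. around Jun 12th 2015 (dates taken from the posts above).
    t0, t1 = date(2013, 1, 3), date(2015, 6, 12)
    years = (t1 - t0).days / 365.25
    doubling_time = years / log2(2.0)   # ratio is exactly 2x: ~2.44 years
    print(f"doubling time: {doubling_time:.2f} years")

    # Extrapolating the same growth rate forward to 4 PFLOPS
    t2 = t1 + timedelta(days=doubling_time * 365.25)
    print(f"4 PFLOPS around {t2}")      # roughly November 2017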

Kind regards and happy crunching
Martin

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

Quote:
So this is roughly in line with the rule of doubling crunching power every 2 years.
Hopefully we will have 4 PFLOPS in 2017?


I wonder how much of the increase is from crunchers using faster hardware and how much is from the new faster v1.52 BRP app.
With Moore's law and Bikeman's mad CUDA/OpenCL skillz we're sure to break 4 PFLOPS :)