FGRPB1G GPU work for both NVIDIA and AMD GPUs not available

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1645892017

RAC: 659092

2 Mar 2019 20:31:57 UTC

Topic 218299

Any thoughts on why GPU work is not available?

Keith Myers

Joined: 11 Feb 11

Posts: 5061

Credit: 19278443751

RAC: 7262916

2 Mar 2019 21:30:29 UTC

Message 169876

Maybe there is no staff around this weekend to load more work into the FGRPB1G splitters? I believe I read in another forum thread, one worrying about the project status page, that the percent done does not reflect the actual amount of work left. And someone (Bernd?) posted that the data coming off the sky surveys is continuously being added to the splitters.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5888

Credit: 119682987265

RAC: 25324263

2 Mar 2019 22:09:03 UTC

Message 169878

Betreger wrote:

Any thoughts on why GPU work is not available?

There is a finite number of tasks associated with a given data file. The current file was LATeah1046L.dat, which was first issued late on 27th Feb. These files tend to 'last' around 4 days, which makes a new file pretty much 'due' right around now.

My guess is that all primary tasks for LATeah1046L.dat have been issued and whatever procedure is in place to transition to the next data file simply hasn't happened - for whatever reason.
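The "due right around now" estimate is simple date arithmetic. A minimal sketch, assuming the year is 2019 (taken from the post dates) and treating the 4-day lifetime as a rough typical value rather than a fixed rule; the function name is hypothetical:

```python
from datetime import date, timedelta


def estimated_exhaustion(first_issued: date, typical_days: int = 4) -> date:
    """Return the date on which a data file is roughly expected to run out of primary tasks."""
    return first_issued + timedelta(days=typical_days)


# LATeah1046L.dat was first issued late on 27 Feb (year assumed to be 2019).
first_issued = date(2019, 2, 27)
due = estimated_exhaustion(first_issued)
print(due.isoformat())
```

Since the file went out late on the 27th and the lifetime is only approximate, an estimate within a day of the current date is consistent with the file simply being exhausted.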

I'd be extremely surprised if it's anything to do with completely running out of data. As Keith mentions, new data from the Large Area Telescope (LAT) on board the Fermi satellite continues to be available, so the stats on the server status page about availability of work for this particular search can never reflect the true state of affairs. As they say, "It ain't over till the fat lady sings", and I haven't heard a peep out of her yet :-).

Worst case scenario - we might have to wait until Monday for this to be fixed. Best case scenario - new work might be stuck in some pipeline and some kind soul might intervene to clear the blockage. There hasn't been any sign of abnormal usage that I know of and the Staff do try to make sure that available work will outlast the weekend so maybe this could get fixed relatively quickly.

Cheers,
Gary.

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

2 Mar 2019 22:24:57 UTC

Message 169881

I think a more interesting question, which I have not seen addressed recently, is what happens to O1OD1 in 22.9 days?

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1645892017

RAC: 659092

2 Mar 2019 23:34:49 UTC

Message 169882 in response to message 169878

Gary, I hope you are correct. With SETI broken and now this, I had to attach to GPUGRID in order to keep busy.

mmonnin

Joined: 29 May 16

Posts: 292

Credit: 3444726540

RAC: 2099

3 Mar 2019 1:22:04 UTC

Message 169885

MilkyWay is also down hard. Crazy weekend if 3 major GPU projects are down at once.

Cherokee150

Joined: 13 May 11

Posts: 24

Credit: 909060996

RAC: 299005

3 Mar 2019 7:07:06 UTC

Message 169890

With multiple BOINC projects down at the same time, the probability that they all have a common cause is extremely high. Since the only thing all projects have in common is BOINC, the problem very likely comes from the BOINC system itself. I believe the best clue is the message I am getting from SETI. At the end of each attempt to contact the server I get the following message:

3/3/2019 0:32:18 | SETI@home | [error] No scheduler URLs found in master file

I have seen this message before. It was a number of years ago, so I don't remember all the details, but I do believe it refers to a master list of send and receive addresses that BOINC uses to route all communications to and from each project. As I recall, the BOINC client software refreshes this from the BOINC host periodically. I believe it is once every 24 hours. If the BOINC host does not respond, the BOINC client, under its current logic, does not proceed until it can get the new list. It then sets a timer before trying again.

This would explain why, one by one, multiple projects are going down. It also seems plausible because the BOINC project, servers and software are basically run by the SETI people at Berkeley. SETI was the first to go down. It is most likely, therefore, that a problem at Berkeley, whether hardware or software, is affecting both the BOINC and SETI servers.

Unless someone like Kittyman or one of the other people with close connections to the Berkeley staff is lucky enough to get a quick answer from them during this major disaster, I guess we will have to sit back and wait until things get back up before we learn the whole story.

If anyone has more input into this theory, please let us know as soon as possible.

Thanks!

MarkJ

Joined: 28 Feb 08

Posts: 437

Credit: 139002861

RAC: 0

3 Mar 2019 9:40:45 UTC

Message 169893

I think you’ll find the master URL is per project. That way each project can manage its own. There is also a URL used to check connectivity when comms fail; it defaults to google.com. You can disable it or use a different URL via cc_config.

There is a separate project list (all projects) which is maintained at boinc.berkeley.edu, but not being able to download it would just mean it can’t update until next time. That is the BOINC server, not the SETI servers. The BOINC message boards are still working, suggesting there are no issues with their server.
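To illustrate why the "No scheduler URLs found in master file" error is tied to a single project's own master page, here is a rough sketch of the kind of lookup involved. It assumes the master page advertises schedulers through `<scheduler>` tags or a `<link rel="boinc_scheduler">` element; this is a simplified stand-in, not the BOINC client's actual parser, and the sample page and example.org URL are hypothetical:

```python
import re
from typing import List

# Assumed formats in which a project's master page advertises its scheduler.
SCHEDULER_TAG = re.compile(r"<scheduler>\s*(.*?)\s*</scheduler>", re.IGNORECASE | re.DOTALL)
SCHEDULER_LINK = re.compile(
    r"""<link\s+rel=["']boinc_scheduler["']\s+href=["']([^"']+)["']""",
    re.IGNORECASE,
)


def find_scheduler_urls(master_page: str) -> List[str]:
    """Collect scheduler URLs from a master page, keeping first-seen order without duplicates."""
    found = SCHEDULER_TAG.findall(master_page) + SCHEDULER_LINK.findall(master_page)
    unique: List[str] = []
    for url in found:
        if url not in unique:
            unique.append(url)
    return unique


# Hypothetical master page for a healthy project.
SAMPLE_MASTER_PAGE = """
<html><head>
<link rel="boinc_scheduler" href="https://example.org/demo_cgi/cgi">
<!-- <scheduler>https://example.org/demo_cgi/cgi</scheduler> -->
</head><body>Demo project</body></html>
"""

# Hypothetical page served while a project's web server is in trouble.
ERROR_PAGE = "<html><body>Service temporarily unavailable</body></html>"

print(find_scheduler_urls(SAMPLE_MASTER_PAGE))
print(find_scheduler_urls(ERROR_PAGE))
```

If a project's site serves an error or maintenance page in place of its normal home page, a lookup like this comes back empty for that project only, which fits the point that the failure is per project and not a shared BOINC fault.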

BOINC blog

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

3 Mar 2019 9:48:50 UTC

Message 169895 in response to message 169893

Also, MilkyWay has been falling apart for weeks. I think they have hit rock bottom. Most BOINC projects (including GPUGrid) are still up.

mikey

Joined: 22 Jan 05

Posts: 12941

Credit: 1884483140

RAC: 27898

3 Mar 2019 12:49:37 UTC

Message 169899 in response to message 169890

Cherokee150 wrote:

With multiple BOINC projects down at the same time, the probability that they all have a common cause is extremely high. Since the only thing all projects have in common is BOINC, the problem very likely comes from the BOINC system itself. I believe the best clue is the message I am getting from SETI. At the end of each attempt to contact the server I get the following message:

3/3/2019 0:32:18 | SETI@home | [error] No scheduler URLs found in master file

I have seen this message before. It was a number of years ago, so I don't remember all the details, but I do believe it refers to a master list of send and receive addresses that BOINC uses to route all communications to and from each project. As I recall, the BOINC client software refreshes this from the BOINC host periodically. I believe it is once every 24 hours. If the BOINC host does not respond, the BOINC client, under its current logic, does not proceed until it can get the new list. It then sets a timer before trying again.

This would explain why, one by one, multiple projects are going down. It also seems plausible because the BOINC project, servers and software are basically run by the SETI people at Berkeley. SETI was the first to go down. It is most likely, therefore, that a problem at Berkeley, whether hardware or software, is affecting both the BOINC and SETI servers.

Unless someone like Kittyman or one of the other people with close connections to the Berkeley staff is lucky enough to get a quick answer from them during this major disaster, I guess we will have to sit back and wait until things get back up before we learn the whole story.

If anyone has more input into this theory, please let us know as soon as possible.

Thanks!

PrimeGrid has tons of GPU work available, as do Collatz, Amicable Numbers, GpuGrid and Moo Wrapper.

Primegrid: http://www.primegrid.com/

Collatz: https://boinc.thesonntags.com/collatz/

Collatz units pay the most credits of any GPU project if you use the optimization codes listed in the Number Crunching forum.

Amicable Numbers: https://sech.me/boinc/Amicable/

GpuGrid: http://www.gpugrid.net/

If you choose GpuGrid, keep the cache very low, as they give bonus credits if you return the units within a pretty short deadline (it's explained on the website). They ONLY accept higher-end GPUs.

Moo Wrapper: https://moowrap.net/

mmonnin

Joined: 29 May 16

Posts: 292

Credit: 3444726540

RAC: 2099

3 Mar 2019 13:57:13 UTC

Message 169900

MW: The server is down hard and the site is not responding. The site was always available before when the db was not accessible.

SETI: The 2nd download server hasn't been working right since Tuesday's maintenance. The site is back up today with downloading improvements.

E@H: No GPU work.

Not related at all.
