Jim1348 wrote:
After starting up and running for 8 minutes, the estimated time remaining jumps up to a very high value, typically 7 to 9 days and briefly even more. Then, after running for maybe half an hour or so, the time estimate returns to a reasonable value.

floyd_7 wrote:
Do you use app_config? My guess is that fraction_done_exact could cause this.
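
For anyone following along, a minimal app_config.xml of the kind being discussed might look like the sketch below. The app name is a placeholder (the real name comes from the <app> entries in the project's client_state.xml); the only tag relevant to this thread is <fraction_done_exact/>.

    <app_config>
        <app>
            <!-- placeholder name; substitute the project's actual app name -->
            <name>einstein_O2AS20</name>
            <!-- tells the client to trust the fraction done reported by the
                 app when estimating remaining time; deleting this line falls
                 back to the client's usual runtime-based estimate -->
            <fraction_done_exact/>
        </app>
    </app_config>

The file lives in the project's folder under the BOINC data directory, and the client should pick up an edit via Options -> Read config files in BOINC Manager, with no restart needed as far as I know.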

Zalster wrote:
That is weird. I use an app_config too, but I'm not seeing that. Looking at some tasks right now, anywhere from 3 to 11 minutes in, there is no change in the time to complete. I'm also not seeing a shift in the priority of work. I have Einstein on the CPU, and GPUGrid and SETI on the GPUs, with no alteration in how they are processing.
Could it be due to the OS? I'm running Windows 7 on an i7-5960X.

Jim1348 wrote:
Yes, I do indeed. I will eliminate that and post back if it does not fix it. Thanks. I have never seen this before.

Betreger wrote:
What causes some of these to go into pending status?

(replying to Betreger)
I, too, am curious as to how these are handled. My first return is still pending, while all but the most recent of my other returns have validated.
Mine have arrived as single replication. Is this only done for trusted hosts, or are all WUs initially going to a single host?
Some of mine have sat for over a day, returned but not validated. Is the validator running in big batches, or in batches so small that it is effectively running continuously at the moment?
My two tasks which are returned but pending both list their status as "Complete, waiting for validation", while the corresponding WU page states simply "Tasks are pending for this workunit", without listing either my returned task or (if one exists) a second try sent to someone else. The task page lists the validation state as "Initial". What outcomes can a validator pass have? I assume pass and fail are options, but is there a "maybe"? In which cases is an additional copy of a task sent to another host? What status would we see if that has happened and is pending? Is it intentional that we can't see a checked-out task in this case? If the "maybe" option exists, what happens if the second task returns an effectively identical result?
All this is just my curiosity. The answers may be well known to those who have been running gravitational-wave CPU tasks recently, but I've only been running pulsar GPU tasks, so the behavior is new to me.

Jim1348 wrote (replying to Zalster):
Possibly. I was running on an Ubuntu 16.04 machine, but taking out fraction_done_exact cured the problem immediately.

Gary Roberts wrote:
I believe this may have something to do with the use (or not) of some sort of simulated progress until the first checkpoint is written. I'm not sure, but maybe the Einstein app shows essentially zero progress for a while, perhaps until the first checkpoint is written. If you are using fraction_done_exact and there is no simulated %done (i.e. near-zero progress) for some initial period, that could translate into a huge estimate for the full crunch time - a divide-by-zero type of scenario.
Maybe this situation changes with different BOINC versions, which might explain why some people have the problem and others don't. I suspect it's not directly to do with the OS.
Cheers,
Gary.

Jim1348 wrote:
That is certainly it. Einstein shows only a very small amount of progress (about 1%) for many minutes, and of course fraction_done_exact then extrapolates that to a very large remaining-time value. It all makes perfect sense; I had just never put it together before.
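
To put rough numbers on that: with fraction_done_exact set, the remaining-time estimate is essentially a straight extrapolation from the fraction done f that the app reports, roughly remaining = elapsed x (1 - f) / f (my reading of the behavior described above, not a formula quoted from the BOINC source). A task that has run for 10 minutes while still reporting f = 0.001 works out to 10 x 999, about 9990 minutes remaining, i.e. roughly 7 days - the same order as the estimates reported at the top of the thread. Once the app starts reporting realistic progress, f jumps and the estimate collapses back to a sensible value.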

Guiri-1_Andalucia_ wrote:
Hi,
Will we have some more info about these units (O2AS20-500)?
They are not shown on server_status; it would be nice to see how many workunits we have, how many are done, etc. (like we have for the rest of the units).
Thanks!
Javi

(replying to Guiri-1_Andalucia_)
It's on the right side of the server status page: if you look at the headings, there is a column for O2AS20-500 under Workunits and Tasks.
BOINC blog