Yep, we're aware of that but got distracted by the FGRP issue. Bernd is now back on the upload issue and we're going to automate things so that uploads no longer get stuck while we're out of office.
Cheers
Einstein@Home Project
The new data file (LATeah2004L.dat) has been available for well over a day now. As mentioned by Oliver, tasks based on it do take the 'normal' amount of time. I did promote a task to confirm that.
By now, people with reasonably small work caches will have run out of 2003L tasks and started 2004L. They will have seen a big increase in the estimated crunch times for everything on board as soon as the first 2004L task finished.
For anyone still crunching 2003L tasks, there could be a nasty surprise when you start 2004L tasks if you haven't already taken precautions. I've tried to explain this in this message.
Cheers,
Gary.
Just an update on data files and expected crunching performance of tasks based on them.
Apart from the special case of the LATeah2003L.dat file (commented on in previous messages), all the other data files in the '200?L' series have produced tasks of pretty standard performance - i.e. tasks that are of the 'slow crunching' variety and are therefore unsuited to Turing series GPUs.
There is now a new member of this series, 2008L, which started arriving overnight local time. I haven't specifically tested crunching performance but I expect (based on the data file size in bytes) crunching performance very similar to that of the other data files in this group, including failure on Turing GPUs. For reference, here is the list to date of these data files, including size in bytes and the date first received, in local time (UTC+10). The columns represent file permissions, # of hard links, owner, group, size in bytes, date received and full filename.
-rw-r--r-- 5 gary gary 1935482 Dec  8 09:19 LATeah2001L.dat
-rw-r--r-- 5 gary gary 1935482 Dec 13 06:22 LATeah2002L.dat
-rw-r--r-- 5 gary gary 2725678 Dec  5  2016 LATeah2003L.dat
-rw-r--r-- 5 gary gary 1935482 Dec 22 09:19 LATeah2004L.dat
-rw-r--r-- 5 gary gary 1935482 Dec 26 07:15 LATeah2005L.dat
-rw-r--r-- 5 gary gary 1935482 Dec 31 05:57 LATeah2006L.dat
-rw-r--r-- 5 gary gary 1935482 Jan  5 09:54 LATeah2007L.dat
-rw-r--r-- 5 gary gary 1935482 Jan 10 07:04 LATeah2008L.dat
Clearly 2003L is the odd one in the group. You can see these files 'last' about 5 days on average before tasks run out and the file is replaced.
Cheers,
Gary.
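For anyone who would rather check the 'odd one out' and the 'about 5 days' figure than eyeball them, here is a minimal Python sketch working from the listing in the post above. The sizes and dates are copied from that listing. The years are not shown there, so FIRST_YEAR is purely an assumption (December is taken to fall in that year and January in the next). 2003L is left out of the date arithmetic because its timestamp is shown as 'Dec 5 2016'. The function names are made up for illustration.

```python
from collections import Counter
from datetime import date

# Assumption: the listing shows no years; December is placed in FIRST_YEAR
# and January in FIRST_YEAR + 1.
FIRST_YEAR = 2018

# (file name, size in bytes, (month, day) first received or None)
FILES = [
    ("LATeah2001L.dat", 1935482, (12, 8)),
    ("LATeah2002L.dat", 1935482, (12, 13)),
    ("LATeah2003L.dat", 2725678, None),  # listed as "Dec 5 2016", so skipped for dates
    ("LATeah2004L.dat", 1935482, (12, 22)),
    ("LATeah2005L.dat", 1935482, (12, 26)),
    ("LATeah2006L.dat", 1935482, (12, 31)),
    ("LATeah2007L.dat", 1935482, (1, 5)),
    ("LATeah2008L.dat", 1935482, (1, 10)),
]


def odd_sized(files):
    """Return the names of files whose size differs from the most common size."""
    sizes = Counter(size for _, size, _ in files)
    common_size, _ = sizes.most_common(1)[0]
    return [name for name, size, _ in files if size != common_size]


def series_number(name):
    """Extract the four-digit series number, e.g. 2004 from LATeah2004L.dat."""
    start = len("LATeah")
    return int(name[start:start + 4])


def received_on(month_day, first_year=FIRST_YEAR):
    """Turn a (month, day) pair into a date under the year assumption above."""
    month, day = month_day
    year = first_year if month == 12 else first_year + 1
    return date(year, month, day)


def mean_days_per_file(files, first_year=FIRST_YEAR):
    """Average days between successive files, from the first to the last dated entry."""
    dated = [
        (series_number(name), received_on(md, first_year))
        for name, _, md in files
        if md is not None
    ]
    (first_no, first_day), (last_no, last_day) = dated[0], dated[-1]
    return (last_day - first_day).days / (last_no - first_no)


print(odd_sized(FILES))
print(round(mean_days_per_file(FILES), 1))
```

On these figures the span from 2001L to 2008L is 33 days over 7 file changes, so roughly 4.7 days per file, which fits the 'about 5 days' estimate.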
Yep, 2008L completion times are similar to 2007L, 2006L ...
I received 2009L this morning at 8:48 EST and started working on them this evening with a 0.5 day queue. Tasks went from ~20 min to ~16 min compared to 2008L.
CPU % also dropped from 19-20% to 14-15%.
File size is the same (1891 KB) as 2006L and 2008L.
I noticed because the number of tasks in the queue was slightly higher.
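mmonnin wrote: I received 2009L this morning at 8:48 EST and started working on them this evening with a 0.5 day queue. Tasks went from ~20 min to ~16 min compared to 2008L.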
I looked at tasks on my host. Current 2009L_740.0_ tasks (pending) have run times identical to 2008L_740.0_ tasks (validated).
* I chose a frequency this time because I was kindly reminded in another thread a few days ago about the importance of comparing tasks from the same frequency era. Great info and graphs in this thread started by archae86: https://einsteinathome.org/content/fgrpb1g-wu-distribution-data
My prior tasks for 2008L were 1100-1200 and 2009L tasks have been 164-692. The new file started the new frequencies and thus are faster.
mmonnin wrote: My prior tasks ...
This is bound to be quite confusing to casual readers. As a standalone statement and without already fully knowing the relation between pulsar spin frequency and crunch time, it doesn't make much sense ;-). For the benefit of those 'not in the know' (and for the purpose of explaining why your original comment needed to be adjusted), I think it would have been better to explain the situation properly, something like:-
"It can be a trap to judge crunching performance for a new data file by just looking at the very first results. The effect of pulsar spin frequency on the elapsed time for crunching needs to be considered. As shown in the graphs in the thread Ritchie linked to, those tasks which look for signals in the data file at low spin frequencies (eg 164 to 692Hz, but particularly the very low values) always crunch faster than tasks looking for much higher frequencies (like 1100 to 1200Hz). When a new data file is deployed, the first tasks are always looking for the lowest spin frequencies. Those don't last long and as tasks for higher frequencies start being issued, the crunch time will always be noticeably longer."
Cheers,
Gary.
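To make the 'compare like with like' point concrete, here is a rough Python sketch that groups run times by data file and spin frequency and then only compares frequencies that both files have in common. It assumes task names contain a fragment like 2009L_740.0_ (the only part of the naming quoted in this thread). The sample names and run times are invented purely for illustration, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumption: a task name contains "<4 digits>L_<frequency>_", e.g. "2009L_740.0_".
TASK_RE = re.compile(r"(?P<data>\d{4}L)_(?P<freq>\d+(?:\.\d+)?)_")


def split_task_name(name):
    """Return (data file tag, spin frequency in Hz) parsed from a task name."""
    match = TASK_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognised task name: {name!r}")
    return match.group("data"), float(match.group("freq"))


def mean_time_by_file_and_freq(results):
    """Average the run times (seconds) for each (data file, frequency) pair."""
    buckets = defaultdict(list)
    for name, seconds in results:
        buckets[split_task_name(name)].append(seconds)
    return {key: mean(values) for key, values in buckets.items()}


def comparable_frequencies(means, file_a, file_b):
    """Frequencies for which both data files have at least one result."""
    freqs_a = {freq for data, freq in means if data == file_a}
    freqs_b = {freq for data, freq in means if data == file_b}
    return sorted(freqs_a & freqs_b)


# Invented sample data: (task name, elapsed seconds).
SAMPLE = [
    ("LATeah2008L_1148.0_a", 1200),
    ("LATeah2008L_740.0_b", 1010),
    ("LATeah2009L_740.0_c", 1010),
    ("LATeah2009L_164.0_d", 960),
]

means = mean_time_by_file_and_freq(SAMPLE)
for freq in comparable_frequencies(means, "2008L", "2009L"):
    print(freq, means[("2008L", freq)], means[("2009L", freq)])
```

With the invented numbers above, the only fair comparison is at 740.0 Hz. Setting the 1148.0 Hz tasks from 2008L against the 164.0 Hz tasks from 2009L would mostly reflect the frequency difference rather than anything about the data file.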
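Gary Roberts wrote: Apart from the special case of the LATeah2003L.dat file (commented on in previous messages), all the other data files in the '200?L' series have produced tasks of pretty standard performance - i.e. tasks that are of the 'slow crunching' variety and are therefore unsuited to Turing series GPUs.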
Just a short note to advise that the 'corrected' version of LATeah2003L.dat has now been issued. The name is LATeah2103L.dat (as mentioned at the time 2003L was withdrawn). The size of the new file is different from that of 2003L and is now exactly the same size as the others in the 200?L series. The tasks based on it appear to crunch similarly as well. So these tasks are likely to still give problems if crunched using Turing series GPUs.
Cheers,
Gary.
Gary Roberts wrote: So these ...
FYI, we're going to look into this problem as soon as we possibly can.
Cheers,
Oliver
Einstein@Home Project