The advantage that the PIII has is that it's only got a 10-stage pipeline vs something in the 20s for the Core 2 Duo architecture.
I think the pipeline length of Core 2 family CPUs is more like 13 or 14; it was the Pentium 4 that started with a 20-stage pipeline.
I should have made this clearer: the reason for the poor performance of the Core 2 CPU here compared to the P-III is that a stock, non-SSE version is used on it, as compared to a more optimized app under Linux on the P-III. This was to highlight how optimization can make older hardware competitive again :-)
CU
Bikeman
First 4.35 WU crunched and reported. It certainly shows a speed increase, I'd say roughly 20% compared to 4.27... since the WUs were from the same frequency, they should be about comparable, shouldn't they? Of course, if someone with a deeper understanding of the maths involved wants to have a look, I'd appreciate it...
I also noticed a somewhat higher average number of sky positions between checkpoints, btw, which I normally find a reliable sign of a speed increase.
I had one "no heartbeat from core client" error, but that occasionally happened with the predecessor apps as well on this box and there don't seem to be any significant consequences unless you count having to restart from the last checkpoint.
If this WU is indeed just as "long" as my others, I must say this app really rocks performance-wise :-D
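A side note on what "roughly 20%" can mean: a minimal sketch, assuming two CPU times for comparable WUs. The run times below are made-up placeholders, not measurements from this host, and the function names are just for illustration. It shows that "20% faster" (throughput) and "20% less time" are not the same number.

```python
def throughput_gain_percent(old_seconds, new_seconds):
    """How much more work per unit time the new app does, in percent."""
    return (old_seconds / new_seconds - 1.0) * 100.0


def time_saved_percent(old_seconds, new_seconds):
    """How much shorter the new run time is, in percent of the old one."""
    return (1.0 - new_seconds / old_seconds) * 100.0


if __name__ == "__main__":
    # Hypothetical CPU times for two WUs of the same frequency (placeholders).
    old_app_seconds = 36000.0  # e.g. a 4.27 result
    new_app_seconds = 30000.0  # e.g. a 4.35 result

    gain = throughput_gain_percent(old_app_seconds, new_app_seconds)
    saved = time_saved_percent(old_app_seconds, new_app_seconds)
    print(f"throughput gain: {gain:.1f}%  run time saved: {saved:.1f}%")
```

The same ratio works for the "sky positions between checkpoints" figure, as long as the checkpoint interval is the same for both apps.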
Should be the same kind of CPU Bikeman mentioned earlier so it's no wonder it reacts about the same...
It seems that the signal 11 problem is back. This is my second one since switching to the 4.35 app. (I had none at all with the 4.27 app.)
2nd Signal 11
Strange, I haven't experienced any of those yet and this box was very prone to this kind of error in the past. I did have a "no heartbeat from core client", which would probably have been a signal 11 with the pre-4.24 apps, but nothing worse... maybe it's not the same problem after all? Afaik, a signal 11 can be more or less anything.
Pooh - looks like the same issue (after loss of heartbeat).
We should look into this again - Bikeman, do you have time for another debugger session?
BM
> I have experienced a lot of the "no heartbeat" messages recently. Seems to come in phases and the Linux machine does not lose the result but gets a heartbeat sooner or later and then completes.
A Windows AMD host I have gets heaps more and sometimes trashes a WU as a result.
For me this happens on any project at any time and with different BOINC clients, including 5.10.15, 5.10.21 and 5.10.38.
Sometimes on Hydrogen@home I can get a failure rate as high as 50% of WUs, and half of those failures have been "no heartbeat" messages.
But it does not happen all the time, so I can't track it down very easily.
I was under the impression it was the BOINC client losing contact with the process for a while and not knowing what it is doing; it later catches up with itself and everything goes back to normal.
I have now on 3 occasions had my data reset in BOINC Manager on some CPDN work units for no apparent reason, with 2 different BOINC clients. This is why I suspect the BOINC client, not the project application.
Actually it isn't quite the same. A loss of heartbeat (i.e. stopping the client) leads to a "normal" exit of the App and a restart after the client is back at my machine (Pentium-M (1 core), BOINC 5.10.21). No segfault. Not that easy...
BM
If it helps, here are the error messages from when this happened.
Wed 27 Feb 2008 05:45:44 AM EST|Einstein@Home|Sending scheduler request: To fetch work. Requesting 81 seconds of work, reporting 0 completed tasks
Wed 27 Feb 2008 05:46:24 AM EST|Einstein@Home|Task h1_0847.50_S5R3__434_S5R3b_0 exited with zero status but no 'finished' file
Wed 27 Feb 2008 05:46:24 AM EST|Einstein@Home|If this happens repeatedly you may need to reset the project.
Wed 27 Feb 2008 05:46:24 AM EST||Project communication failed: attempting access to reference site
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Restarting task h1_0847.50_S5R3__434_S5R3b_0 using einstein_S5R3 version 435
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Scheduler request failed: Couldn't resolve host name
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Computation for task h1_0847.50_S5R3__430_S5R3b_1 finished
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Output file h1_0847.50_S5R3__430_S5R3b_1_0 for task h1_0847.50_S5R3__430_S5R3b_1 absent
Wed 27 Feb 2008 05:46:45 AM EST||Access to reference site succeeded - project servers may be temporarily down.
Wed 27 Feb 2008 05:47:29 AM EST|Einstein@Home|Sending scheduler request: To fetch work. Requesting 30284 seconds of work, reporting 1 completed tasks
Wed 27 Feb 2008 05:47:34 AM EST|Einstein@Home|Scheduler request succeeded: got 1 new tasks
Wed 27 Feb 2008 05:47:36 AM EST|Einstein@Home|Starting h1_0847.50_S5R3__417_S5R3b_1
Wed 27 Feb 2008 05:47:37 AM EST|Einstein@Home|Starting task h1_0847.50_S5R3__417_S5R3b_1 using einstein_S5R3 version 435
Wed 27 Feb 2008 05:48:35 AM EST|Einstein@Home|Sending scheduler request: To fetch work. Requesting 28 seconds of work, reporting 0 completed tasks
Wed 27 Feb 2008 05:48:40 AM EST|Einstein@Home|Scheduler request succeeded: got 1 new tasks
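For anyone who wants to check their own message log for the same pattern, here is a minimal sketch, assuming the `timestamp|project|message` layout shown above with each message on a single line. The function names and the 10-second window are my own arbitrary choices, not anything from BOINC itself, and a hit only shows that the two events happened close together, not that one caused the other.

```python
from datetime import datetime

# A few lines copied from the log above (wrapped lines already joined).
SAMPLE_LOG = """\
Wed 27 Feb 2008 05:46:24 AM EST|Einstein@Home|Task h1_0847.50_S5R3__434_S5R3b_0 exited with zero status but no 'finished' file
Wed 27 Feb 2008 05:46:24 AM EST||Project communication failed: attempting access to reference site
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Restarting task h1_0847.50_S5R3__434_S5R3b_0 using einstein_S5R3 version 435
Wed 27 Feb 2008 05:46:28 AM EST|Einstein@Home|Scheduler request failed: Couldn't resolve host name
"""

EXIT_MARKER = "exited with zero status but no 'finished' file"
NETWORK_MARKERS = ("Project communication failed", "Scheduler request failed")


def parse_line(line):
    """Split one log line into (datetime, project, message)."""
    stamp, project, message = line.split("|", 2)
    # Drop the trailing timezone abbreviation before parsing the timestamp.
    when = datetime.strptime(stamp.rsplit(" ", 1)[0], "%a %d %b %Y %I:%M:%S %p")
    return when, project, message


def exits_near_network_errors(log_text, window_seconds=10):
    """Return (task, nearby_network_messages) for each unexpected task exit."""
    events = [parse_line(l) for l in log_text.splitlines() if l.count("|") >= 2]
    network = [(w, m) for w, _, m in events if m.startswith(NETWORK_MARKERS)]
    hits = []
    for when, _, message in events:
        if EXIT_MARKER in message:
            task = message.split()[1]  # message reads "Task <name> exited ..."
            nearby = [m for w, m in network
                      if abs((w - when).total_seconds()) <= window_seconds]
            hits.append((task, nearby))
    return hits


if __name__ == "__main__":
    for task, nearby in exits_near_network_errors(SAMPLE_LOG):
        print(task, "->", nearby)
```

Run against a longer log, this would make it easier to see whether the "no 'finished' file" exits always coincide with a failed connection or also show up on their own.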
Looks like internet trouble... maybe that still plays a role?
Bernd, that's just what happens on my Core machine...
Maybe it's more likely to happen on multi-cores... I'll do some tests.
CU
Bikeman