I've tried about everything and I can't get anything to work. One machine has zero errors, the other 6%. Is there a way to tell from the available data which device, 0 or 1 or both, is the culprit on that machine? I could sit there and stare at the screen all day, but that is hard to do with old eyes like mine.
I haven't tried removing the virus protection; that kind of puts the fear of God in me.
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
Is there a way to tell from the data available which device, 0 or 1 or both is the culprit on that machine.
Read the stderr.txt:
Quote:
7.2.42
Activated exception handling...
[20:45:49][3264][INFO ] Starting data processing...
[20:45:49][3264][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[20:45:49][3264][INFO ] Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.
[20:45:50][3264][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
Or:
Quote:
7.2.42
Activated exception handling...
[16:37:06][3080][INFO ] Starting data processing...
[16:37:07][3080][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[16:37:07][3080][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[16:37:07][3080][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
Claggy
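The check described above can also be done without reading every stderr.txt by eye. This is only a minimal sketch: it assumes the device line always has the form shown in the quotes above, and the function name `opencl_devices` is made up for illustration, not part of BOINC or the Einstein@Home application.

```python
import re

# Matches the device line the application writes to stderr, e.g.:
# [20:45:49][3264][INFO ] Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.
DEVICE_LINE = re.compile(r'Using OpenCL device "([^"]+)"')


def opencl_devices(stderr_text):
    """Return the OpenCL device names mentioned in a task's stderr, in order."""
    return DEVICE_LINE.findall(stderr_text)


# Lines copied from the first quote above.
sample = """[20:45:49][3264][INFO ] Starting data processing...
[20:45:49][3264][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[20:45:49][3264][INFO ] Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.
"""

print(opencl_devices(sample))
```

Pasting the stderr of each invalid task into `sample` shows which chip name the task ran on, as long as the two cards report different names.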
Yes, but both of mine are Tahiti: a 270X and a 280X. :-(
Wait, I just looked it up and the 270X says it's a Curacao!
I need to check. Thanks, Claggy.
--edit
It seems that both of them are being reported to Einstein as Tahiti.
I guess I have to watch as they clear thru the system.
Thanks though.
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
Check more of your valid tasks. One GPU is reported as Pitcairn: http://einsteinathome.org/task/464681263
I think that is your 270.
See if there are any invalids on that one, or just on the Tahiti/280.
The server only shows the most capable GPU (for each vendor), so it'll only display that your host has Tahiti GPUs, even though the 2nd AMD/ATI GPU is a Pitcairn.
Claggy
I don't know why they would report a Pitcairn on that computer. I have Pitcairns on my other computer??
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
— Salman Rushdie
Thanks, Claggy, that's my answer.
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
— Salman Rushdie
I gave up on the watching game after seeing 8 valid results in a row. I analyzed the times of the invalid ones and found that they all have to be from my fastest card, the 280X. I increased the fan speed on the 280X alone to 75% (it was at 56%) to see if this helps or not. The fan speed is the only thing I can control. I can increase the core clock and the memory clock, but it won't let me decrease them. It's a Sapphire.
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
— Salman Rushdie
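The timing comparison described above could be tallied roughly like this. It is only a sketch: the cutoff and the sample numbers are made up for illustration and are not taken from this thread, and guessing the card from run time alone only works if the two cards have clearly different run times.

```python
from collections import Counter


def tally_by_speed(tasks, cutoff_seconds):
    """Count outcomes per card, guessing the card from run time alone.

    tasks: iterable of (elapsed_seconds, outcome) pairs.
    Tasks that finished faster than cutoff_seconds are attributed to the
    faster card, all others to the slower card.
    """
    counts = {"fast card": Counter(), "slow card": Counter()}
    for elapsed, outcome in tasks:
        card = "fast card" if elapsed < cutoff_seconds else "slow card"
        counts[card][outcome] += 1
    return counts


# Hypothetical values, for illustration only.
example = [(5800, "valid"), (5790, "invalid"), (8900, "valid"), (8950, "valid")]
result = tally_by_speed(example, cutoff_seconds=7000)

for card, outcomes in result.items():
    print(card, dict(outcomes))
```

If every invalid lands in the same bucket, that points at one card; if they are spread over both, the cause is more likely something the cards share, such as the driver.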
This is the stderr from an inconclusive that just passed thru. Can you tell if it's probably bad or not? Or can you tell me any info about it that might lead me to a reason for it being invalid? I know it's not yet invalid but it's from my 280x and I suspect it's invalid since I've had 8 in a row come thru as valid.
Stderr output
7.2.42
Activated exception handling...
[07:54:00][3912][INFO ] Starting data processing...
[07:54:00][3912][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[07:54:00][3912][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[07:54:01][3912][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[07:54:01][3912][INFO ] Header contents:
------> Original WAPP file: ./PB0057_010B1_DM252.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 53843.193587979025
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 63740.9799004
------> DEC (J2000): -51902.684
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4383473
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 252 cm^-3 pc
------> Scale factor: 1.82206
[07:54:02][3912][INFO ] Seed for random number generator is 1082949649.
[07:54:03][3912][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-008
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[07:59:28][3912][INFO ] Checkpoint committed!
[08:04:55][3912][INFO ] Checkpoint committed!
[08:10:23][3912][INFO ] Checkpoint committed!
[08:15:51][3912][INFO ] Checkpoint committed!
[08:21:18][3912][INFO ] Checkpoint committed!
[08:26:46][3912][INFO ] Checkpoint committed!
[08:32:13][3912][INFO ] Checkpoint committed!
[08:37:41][3912][INFO ] Checkpoint committed!
[08:42:17][3912][INFO ] OpenCL shutdown complete!
[08:42:18][3912][INFO ] Statistics: count dirty SumSpec pages 42974 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1100505
[08:42:18][3912][INFO ] Data processing finished successfully!
[08:42:18][3912][INFO ] Starting data processing...
[08:42:18][3912][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[08:42:18][3912][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[08:42:18][3912][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[08:42:18][3912][INFO ] Header contents:
------> Original WAPP file: ./PB0057_010B1_DM254.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 53843.193587937269
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 63740.9799004
------> DEC (J2000): -51902.684
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4383473
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 254 cm^-3 pc
------> Scale factor: 1.819
[08:42:19][3912][INFO ] Seed for random number generator is 1084118061.
[08:42:20][3912][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-008
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[08:43:09][3912][INFO ] Checkpoint committed!
[08:48:36][3912][INFO ] Checkpoint committed!
[08:54:04][3912][INFO ] Checkpoint committed!
[08:59:32][3912][INFO ] Checkpoint committed!
[09:04:59][3912][INFO ] Checkpoint committed!
[09:10:27][3912][INFO ] Checkpoint committed!
[09:15:54][3912][INFO ] Checkpoint committed!
[09:21:22][3912][INFO ] Checkpoint committed!
[09:26:49][3912][INFO ] Checkpoint committed!
[09:30:35][3912][INFO ] OpenCL shutdown complete!
[09:30:35][3912][INFO ] Statistics: count dirty SumSpec pages 47746 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1100505
[09:30:35][3912][INFO ] Data processing finished successfully!
09:30:35 (3912): called boinc_finish
merle
What is freedom of expression? Without the freedom to offend, it ceases to exist.
— Salman Rushdie
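For comparing run times between tasks, the timestamps in a stderr like the one above can be turned into a duration for each of the two data files in the task. A rough sketch, assuming the "Starting data processing..." and "Data processing finished successfully!" lines always appear in pairs as they do here; the function name `workunit_durations` is invented for this example. A task that was suspended or restarted in between would give a misleading figure.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as: [07:54:00][3912][INFO ] Starting data processing...
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[\d+\]\[INFO \] (.*)$")


def workunit_durations(stderr_text):
    """Return one timedelta per start/finish pair found in the stderr text."""
    durations = []
    start = None
    for line in stderr_text.splitlines():
        match = STAMP.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2)
        if message.startswith("Starting data processing"):
            start = stamp
        elif message.startswith("Data processing finished successfully") and start is not None:
            delta = stamp - start
            if delta < timedelta(0):
                delta += timedelta(days=1)  # the run crossed midnight
            durations.append(delta)
            start = None
    return durations


# Start and finish lines copied from the stderr above.
sample = """[07:54:00][3912][INFO ] Starting data processing...
[08:42:18][3912][INFO ] Data processing finished successfully!
[08:42:18][3912][INFO ] Starting data processing...
[09:30:35][3912][INFO ] Data processing finished successfully!
"""

print([str(d) for d in workunit_durations(sample)])
```

For this task both halves took about 48 minutes, and nothing in the log itself reports an error, so the stderr alone does not show why the result was inconclusive.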
Hallo Merle!
If I remember correctly, this looks very much the same as the error remarks I got about a month ago, when I had, averaged over 3 weeks and 200 tasks, about a 10% error rate, before I changed the GPU driver. See my thread. Since then I have crunched more than 300 tasks without any invalids or inconclusives. But I had to use the beta version of the driver!
For applications running on CPU only I get very, very few errors, far fewer than 1 in 1000 over the last years. I believe all these applications are best optimized for Windows, as this is by far the most used OS. Other OSes like Mac and Linux have higher error rates, and the rate you can see on, or derive from, the Server Status Page is the average over all of them, and so is higher than that for Windows. But BM could tell you more about all this.
Kind regards and happy crunching
Martin