Bernd Machenschalk commented on CUDA application for the O3ASHF search
What makes this part slow is the rather random memory access,
which means that indeed running multiple instances in...
13th March 2024
|
Bernd Machenschalk commented on CUDA application for the O3ASHF search
The core operation is single precision (derived from the SSE version). I'm not entirely sure there aren't any few double...
13th March 2024
|
Bernd Machenschalk commented on CUDA application for the O3ASHF search
Ian&Steve C. wrote:
Bernd, is the recalc on GPU step for the midpoint and end of 1.14 an FP64 load?
Actually,...
13th March 2024
|
Bernd Machenschalk commented on CUDA application for the O3ASHF search
Could it be that 1.14 is slower than 1.08 (only) if you run multiple instances/tasks in parallel?
13th March 2024
|
Bernd Machenschalk commented on CUDA application for the O3ASHF search
Yep - new version underway.
7th March 2024
|
Copyright © 2024 Einstein@Home. All rights reserved.