Maths & Observations : einsteinathome.org Work units GPU+SiMD QE 2021

QuantumHelos
Joined: 5 Nov 17
Posts: 187
Credit: 61,854,810
RAC: 0
Topic 225255

Speed testing for an SSE > AVX FstatMethod? (like the Linux RAID driver, which benchmarks its SIMD routines at startup and picks the fastest)
How can we make this coherent? FFT?
FFT Examples : https://is.gd/ProcessorLasso in the SiMD Folder...
Advanced FFT & 3D Audio functions for CPU & GPU https://gpuopen.com/true-audio-next/
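
A rough sketch of the "speed test, then pick" idea from the first line above: time each candidate routine on a small sample and keep the fastest one. This is only an illustration in plain Python. The two `sum_squares_*` functions are made-up stand-ins for an SSE and an AVX code path, and `pick_fastest` is a hypothetical helper, not anything from the Einstein@Home app.

```python
import time

# Hypothetical stand-ins for two implementations of the same kernel
# (think "SSE path" and "AVX path"); both must return the same result.
def sum_squares_loop(values):
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_squares_builtin(values):
    return sum(v * v for v in values)

def pick_fastest(candidates, sample, repeats=5):
    """Time every candidate on the sample; return the winner's name and all timings."""
    timings = {}
    for name, func in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(sample)
            best = min(best, time.perf_counter() - start)
        timings[name] = best  # best-of-N reduces noise from other processes
    winner = min(timings, key=timings.get)
    return winner, timings

candidates = {"loop": sum_squares_loop, "builtin": sum_squares_builtin}
sample = [float(i) for i in range(10_000)]
winner, timings = pick_fastest(candidates, sample)
print("selected:", winner)
```

The winner depends on the machine, so the selection would need to run once per host at startup rather than being fixed at build time.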

LATeah, 1 row: suggest threading to take advantage of SiMD (FP16, x2FP16 : Precision)
Work unit size: 160 FPU threads, 4MB per group; align the work unit before compute.
Dataset size 400MB..? Optimise for 512MB, 1GB or 2GB Chunks
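
To make the chunk-size suggestion concrete, here is a small sketch of the arithmetic: round the dataset up to a whole number of 4MB groups, then pick the smallest of the 512MB / 1GB / 2GB chunks that holds it. The function names are hypothetical, and reading "4MB per group" as the alignment unit is my assumption.

```python
MB = 1024 * 1024
CHUNK_SIZES_MB = (512, 1024, 2048)  # 512MB, 1GB, 2GB as suggested above

def align_up(size_bytes, alignment_bytes):
    """Round size_bytes up to the next multiple of alignment_bytes."""
    return -(-size_bytes // alignment_bytes) * alignment_bytes

def pick_chunk(dataset_mb, chunk_sizes_mb=CHUNK_SIZES_MB):
    """Return (chunk_mb, padding_mb) for the smallest chunk that holds the dataset."""
    for chunk in sorted(chunk_sizes_mb):
        if dataset_mb <= chunk:
            return chunk, chunk - dataset_mb
    raise ValueError("dataset is larger than the largest chunk size")

aligned = align_up(400 * MB, 4 * MB)  # 400MB is already a multiple of 4MB
groups = aligned // (4 * MB)          # number of 4MB groups in the dataset
print(pick_chunk(400))                # → (512, 112)
```

With a 400MB dataset this lands on the 512MB chunk with 112MB of headroom.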

<core_client_version>7.16.11</core_client_version>
12:50:57 (2636): [normal]: This Einstein@home App was built at: May  8 2019 13:29:27
OpenCL device has FP64 support
12:50:57 (2636): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
12:50:57 (2636): [debug]: Set up communication with graphics process.

Win:
Peak working set size (MB):588.14
Peak swap size (MB):1023.74

H1 Spotlight, 8 rows. Work unit size: 160 FPU threads, 4MB per group; align the work unit before compute.
Dataset size 400MB..? Optimise for 512MB, 1GB or 2GB Chunks
How can we make this coherent? FFT?
FFT Examples : https://is.gd/ProcessorLasso in the SiMD Folder...
Advanced FFT & 3D Audio functions for CPU & GPU https://gpuopen.com/true-audio-next/

2021-04-12 21:55:14.2649 (16972) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2021-04-12 22:00:07.1870 (16972) [normal]: Search FstatMethod used: 'ResampOpenCL'
2021-04-12 22:00:07.1890 (16972) [normal]: Recalc FstatMethod used: 'DemodSSE'
2021-04-12 22:00:07.1960 (16972) [normal]: OpenCL version is used for the semi-coherent step!

Fail:
Peak working set size (MB):492.02
Peak swap size (MB):2898.58 (RAM overload possible)
Win:
Peak working set size (MB):400.71
Peak swap size (MB):1827.34
Peak disk usage (MB):4.46
 

QuantumHelos
Joined: 5 Nov 17
Posts: 187
Credit: 61,854,810
RAC: 0

Some visual samples of the test dataset https://is.gd/EinsteinE_MC_2
