Gravitational Wave search O1 all-sky tuning (O1AS20-100T)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,247
Credit: 44,975,640,584
RAC: 36,053,851

RE: First X64 V1.04 task on

Quote:
First X64 V1.04 task on my Westmere took 44,544.48 ET, 42,204.43 CPU time. While this is a single sample, it is unambiguously improved from the observation of a mix of 1.02, 1.03, and 1.04 SSE2 executable tasks returned from the same host while running in the same configuration, for which ET ran a little under 52,000, and CPU time somewhat over 50,000. I'll do statistics when more of the returns are in. But these are several sigma out of the previous distribution--so "better" is quite clear.


Thanks for the update on your host's performance with the 64bit app. That's a really pleasing boost.

I just had a look at the GW tasks list for the Daniels_Parents host that sparked this in the first place. Lots of X64 tasks but still there aren't any that have been returned yet. I trust there will be a similar improvement there too.

Quote:
Maybe the Hewson machine of similar architecture will give us some insight on the heavily loaded end of the scale.


I hadn't realised Mike had a machine in this category. I just looked at the GW tasks there and see he has a couple returned with an obvious speed improvement as well. From the slow times and the big difference between CPU and elapsed, it must be a pretty heavily loaded workhorse. There are a number of inconclusive results, seemingly due to 1.03-1.04 comparisons which fail validation.

Cheers,
Gary.
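
For readers who want to reproduce the comparison in the post above, here is a minimal sketch of the arithmetic. The new-app figures are the ones quoted; the old-app baselines are only the rounded "a little under 52,000" and "somewhat over 50,000" from the quote, so the reductions are rough estimates. The function names are illustrative, and the times are assumed to be in seconds.

```python
def cpu_to_elapsed_ratio(cpu_s, elapsed_s):
    """Fraction of wall-clock time the task actually spent on the CPU."""
    return cpu_s / elapsed_s


def fractional_reduction(before_s, after_s):
    """Relative drop in run time going from before_s to after_s."""
    return (before_s - after_s) / before_s


# Figures quoted for the X64 V1.04 task (assumed to be seconds).
ET_NEW, CPU_NEW = 44_544.48, 42_204.43
# Rounded baselines from the quote; approximate by construction.
ET_OLD_APPROX, CPU_OLD_APPROX = 52_000.0, 50_000.0

ratio = cpu_to_elapsed_ratio(CPU_NEW, ET_NEW)
print(f"CPU/elapsed ratio:  {ratio:.3f}")
print(f"elapsed reduction:  {fractional_reduction(ET_OLD_APPROX, ET_NEW):.1%}")
print(f"CPU time reduction: {fractional_reduction(CPU_OLD_APPROX, CPU_NEW):.1%}")
```

A ratio close to 1 suggests the task had a core largely to itself. A much lower ratio is the "big difference between CPU and elapsed" that points to a loaded or frequently interrupted host.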

Robert
Joined: 5 Nov 05
Posts: 42
Credit: 296,781,268
RAC: 13,298

RE: So yes, cache size

Quote:

So yes, cache size (per running task) seems to matter a lot.

Your post on cache size and its influence on runtimes got me wondering how fast a single GW v1.04 AVX work unit would run.

So I adjusted my machine with hyperthreading turned off and currently running 4 GW v1.04 AVX work units to run just a single task. This machine has 8 MB of L3 cache, runs at 4.2 GHz and is currently executing 4 tasks together at 6 hours/task.

The 2 reported tasks as of this morning, individually run, average 5.5 hours. So an 8% speedup from exclusive use of the 8 MB L3 cache.
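
As a rough illustration of the numbers in this post, the sketch below works out the per-task time saving and, as a side calculation not made in the post itself, the daily throughput of each setup. It assumes the 6 h and 5.5 h figures are representative averages; the function names are illustrative.

```python
def fractional_reduction(before_h, after_h):
    """Relative drop in per-task run time."""
    return (before_h - after_h) / before_h


def tasks_per_day(concurrent_tasks, hours_per_task):
    """Tasks completed per 24 h when running concurrent_tasks at once."""
    return concurrent_tasks * 24.0 / hours_per_task


# Figures from the post: 4 tasks together at 6 h each, 1 task alone at 5.5 h.
saving = fractional_reduction(6.0, 5.5)
four_at_once = tasks_per_day(4, 6.0)
one_at_a_time = tasks_per_day(1, 5.5)

print(f"per-task time saving: {saving:.1%}")
print(f"4 concurrent: {four_at_once:.2f} tasks/day")
print(f"1 at a time:  {one_at_a_time:.2f} tasks/day")
```

On these figures each task finishes sooner when run alone, but running four at once still completes far more work per day.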

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 458,444,412
RAC: 60,734

RE: So I adjusted my

Quote:

So I adjusted my machine with hyperthreading turned off and currently running 4 GW v1.04 AVX work units to run just a single task.

Just so I understand this correctly: is that 4 GW tasks one after another, or 4 simultaneously? You later write "exclusive use of the 8 MB L3 cache", so this would mean running just 1 task at a time.

Quote:


This machine has 8 MB of L3 cache, runs at 4.2 GHz and is currently executing 4 tasks together at 6 hours/task.

The 2 reported tasks as of this morning, individually run, average 5.5 hours. So an 8% speedup from exclusive use of the 8 MB L3 cache.

So 4 tasks running in parallel would each have 8MB/4 = 2MB cache available, which seems to be already kind of OK (compared to the 1MB per task tests you did earlier in full hyperthreading that showed a rather poor performance). 8MB cache per task is then probably already overkill. Very interesting, thanks for the data points.

Cheers
HB
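
The cache-per-task reasoning in the post above can be written as a one-line calculation. This is only a sketch: it assumes the L3 cache is shared evenly between running tasks, which real hardware does not guarantee, and the 8-task case is inferred from the "1MB per task" figure for full hyperthreading. The function name is illustrative.

```python
def l3_per_task_mb(l3_total_mb, concurrent_tasks):
    """Even-split estimate of L3 cache available to each running task."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return l3_total_mb / concurrent_tasks


# 8 MB L3 as stated in the thread; 1, 4 and (inferred) 8 concurrent tasks.
for n in (1, 4, 8):
    print(f"{n} task(s): {l3_per_task_mb(8, n):.0f} MB of L3 each")
```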

robl
Joined: 2 Jan 13
Posts: 1,642
Credit: 1,127,428,138
RAC: 723,177

Noticed I completed and validated my first O1 (version 1.04 AVX) task on an AMD cruncher - time: ~29k

Jim1348
Joined: 19 Jan 06
Posts: 381
Credit: 201,962,179
RAC: 3,947

Isn't it the i7-4770 CPU?

robl
Joined: 2 Jan 13
Posts: 1,642
Credit: 1,127,428,138
RAC: 723,177

RE: Isn't it the i7-4770

Quote:
Isn't it the i7-4770 CPU?

Yes.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,126
Credit: 127,356,222
RAC: 8,091

RE: RE: Maybe the Hewson

Quote:
Quote:
Maybe the Hewson machine of similar architecture will give us some insight on the heavily loaded end of the scale.

I hadn't realised Mike had a machine in this category. I just looked at the GW tasks there and see he has a couple returned with an obvious speed improvement as well. From the slow times and the big difference between CPU and elapsed, it must be a pretty heavily loaded workhorse. There are a number of inconclusive results, seemingly due to 1.03-1.04 comparisons which fail validation.


Ah! I've looked into that. The inconclusive results specifically straddled a power outage last weekend, which also naturally explains the CPU/wall-clock disparity. We get a lot of thunderstorms and strong wind this time of year. That machine isn't so much heavily loaded as too frequently interrupted, so alas it's not such a good comparator. The Linux box has a different - and evidently the better - UPS. :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

tullio
Joined: 22 Jan 05
Posts: 2,044
Credit: 40,065,588
RAC: 11,284

I've started two AVX tasks on my A10 6700 CPU at 3.7 GHz on the Windows 10 PC together with a Gamma-ray task and a CMS-dev Virtual Machine, plus the GPU task on Arecibo data which seems to progress on the GeForce GTX 750 graphics board. Checking the BOINC manager after about 10 hours, I found the percentage of work done on the AVX tasks reduced to a mere 0.10%, while it is normal on both the Gamma-ray task and the CMS-dev. The Linux box with its SuSE Leap 42.1 running on the Opteron 1210 at 1.8 GHz, with only 8 GB DDR2 RAM in contrast to the 24 GB DDR3 RAM of the Windows 10 PC, crunches X64 tasks in about 47 hours with no pain.
Tullio

Jim1348
Joined: 19 Jan 06
Posts: 381
Credit: 201,962,179
RAC: 3,947

Without going into all the details, I have found that VirtualBox sometimes interferes with the running of non-VBox projects. They were GPU projects in my case, but it might apply to the Einstein AVX tasks also. I would suggest the use of separate machines for VBox and non-VBox work, if that is possible.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,247
Credit: 44,975,640,584
RAC: 36,053,851

RE: The inconclusive

Quote:
The inconclusive results specifically straddled a power outage last weekend, which also naturally explains the CPU/wall-clock disparity.


I think you'll find that all three of those will ultimately validate against the 3rd task that has been sent out in each case because yours will match the new 1.04 app version being used. It will be the original quorum partner that will miss out because they are all V1.03s.

Power outages aren't supposed to cause erroneous results - the task can restart from the last saved checkpoint once power is restored. However, it's probably a bit hard to get the same answers if there's a change in the science code between the two different versions ;-).

Cheers,
Gary.
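
To illustrate the checkpoint point made above, here is a generic sketch of checkpoint-and-resume. It is not the BOINC or Einstein@Home implementation; the function name, the JSON state file and the toy summation workload are all assumptions made for the example. The idea is that an interrupted run reloads the last saved state and, provided the code has not changed, reaches the same answer as an uninterrupted one.

```python
import json
import os
import tempfile


def run_with_checkpoints(total_steps, path, every=100, stop_after=None):
    """Sum 0..total_steps-1, saving state to `path` every `every` steps.

    `stop_after` simulates an outage after that many steps in this run.
    Returns the result, or None if the run was interrupted.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(path):
        # Resume from the last saved checkpoint instead of starting over.
        with open(path) as f:
            state = json.load(f)
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            return None  # simulated power cut; unsaved progress is lost
        state["acc"] += state["step"]
        state["step"] += 1
        done += 1
        if state["step"] % every == 0:
            # Write to a temp file, then rename, so a crash mid-write
            # cannot leave a half-written checkpoint behind.
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)
    return state["acc"]


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "task.ckpt")
    interrupted = run_with_checkpoints(1000, ckpt, stop_after=250)
    resumed = run_with_checkpoints(1000, ckpt)
    print("interrupted run returned:", interrupted)
    print("resumed result matches uninterrupted sum:", resumed == sum(range(1000)))
```

In this toy case the work done after the last checkpoint is simply repeated on restart, which costs wall-clock time without changing the result.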
