Crunching in a Virtual Machine

mikey

Joined: 22 Jan 05

Posts: 12718

Credit: 1839121911

RAC: 3569

RE: RE: Finally got Boinc

30 Jan 2014 13:50:16 UTC

Message 119904 in response to message 119902


Quote:

Quote:
Finally got Boinc to use a GPU in a Virtual Machine.

Congrats!!!

Quote:
It looks like my AMD 7970 loses 30% performance on the BRP4G tasks and 40% on the BRP5 tasks (only one data point on virtual BRP5 so far).

BRP4G
Virtual 4114 4055 4041 4112 3930. Ave 4050
Native 3024 2972 3218 3217 3210. Ave 3128
Delta 1.3

In the above example, what does the Native data point 3024 refer to? Where is this info located? [EDIT] Is the 3024 a posted "run time"?
Quote:

BRP5
Virtual 15952
Native 11759 11148 11112. Ave 11340
Delta 1.4
Setup during this test was a virtual Windows Server 2008 machine with 4 vCPU running 3 simultaneous GPU tasks. At the same time another Debian VM was running a fairly CPU intense job.
Compared to native Windows 7 machine with a 4 core CPU with hyper threading running 3 simultaneous GPU tasks and 3 simultaneous iGPU tasks.
I'll let the effect of the Debian VM cancel out the iGPU jobs for simplicity :)

40% stings a bit. But at the moment I think I'll keep the GPU in the VM and play around with it.

Big thanks to Robl and the rest who helped me through this ordeal :)

:>)

The performance of my Radeon card is most disappointing. 30 hours into 3 jobs and only half way if I am to believe the "status/performance" bars. What I really need to do and the thought scares the crap out of me is to replace the radeon with a NVIDIA 650 TI. I have two. This way I could do a comparison across a Win7 VM and an Ubuntu box (not a VM). Courage. If a 650TI on a VM drops down significantly then I would have to rethink virtualization from a crunching perspective.

Just drive around and look for a pc on the curb on trash day and build yourself a second pc to test with; it would be easier and cheaper in the long run. Hard drives are cheap, and since you are thinking Ubuntu so is the OS. The old pc should come with a cd/dvd drive in it, and a dual core is plenty fast enough for testing and crunching. If you lived near DC I would GIVE you one as I have several on the shelf unused, shipping a pc is cost prohibitive though!!

Anonymous

Not good. My 1st 3

30 Jan 2014 16:29:37 UTC

Message 119905 in response to message 119904


Not good. My 1st 3 BRP5-opencl-ati tasks finished in error: Maximum elapsed time exceeded.

I will now have to abort the remaining downloads. I will give BRP4G-opencl-ati tasks a run and see what happens to them. I also scaled back the GPU Utilization factor to 1 from .33. We will see what this does for the cause.

It appears that your choice of GPU will determine your success at crunching in a virtual environment. And even then you might see a significant performance drop. This would tell me that virtualization might not be the ideal environment for crunching. Waiting on Logforme to report back on his performance with his Radeon card which is a top end GPU.

Anonymous

Not good. My 1st 3

30 Jan 2014 16:29:38 UTC

Message 119906 in response to message 119904


Not good. My 1st 3 BRP5-opencl-ati tasks finished in error: Maximum elapsed time exceeded.

It appears that your choice of GPU will determine your success at crunching in a virtual environment. And you might see a significant performance drop. This would tell me that virtualization might not be the ideal environment for crunching. Waiting on Logforme to report back on his performance with his Radeon card which is a top end GPU.

Logforme

Joined: 13 Aug 10

Posts: 332

Credit: 1714373961

RAC: 0

Well .. color me confused. I

30 Jan 2014 16:59:39 UTC

Message 119907 in response to message 119906


Well .. color me confused.
I had a look at the workunits crunched today and got totally different and better runtimes. Today the BRP4G tasks averaged 3150 seconds and the BRP5 tasks averaged 12303 seconds. That's 0.7% and 8% "slower" than native respectively.
What's going on? Do the workunits vary so much in length, and did I just happen to compare "short" ones on the native with "long" ones on virtual yesterday? If that's the case this kind of comparison is useless and the only way to tell is to see if my RAC changes over a long time.
Or .. the VM / XenServer somehow "learned" and optimized itself? Not read about anything like that so I'm skeptical.

No matter what, today the "virtual" GPU performed at native speed.
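
For anyone who wants to redo the arithmetic, here is a minimal sketch that recomputes the slowdown figures from the runtimes posted in this thread. The function name `percent_slower` is made up for illustration; only the runtimes come from the posts.

```python
from statistics import mean

# Native runtimes in seconds, as posted earlier in the thread.
native_brp4g = [3024, 2972, 3218, 3217, 3210]
native_brp5 = [11759, 11148, 11112]

def percent_slower(virtual_avg, native_runs):
    """Return how much slower (in percent) a virtual average is than the native average.

    A negative result means the virtual runs were faster.
    """
    native_avg = mean(native_runs)
    return (virtual_avg / native_avg - 1) * 100

# First evening in the VM (BRP4G average from the five posted runs, BRP5 single run).
day1_brp4g = mean([4114, 4055, 4041, 4112, 3930])
print(f"BRP4G day 1: {percent_slower(day1_brp4g, native_brp4g):+.1f}%")
print(f"BRP5  day 1: {percent_slower(15952, native_brp5):+.1f}%")

# Second day averages as reported above.
print(f"BRP4G day 2: {percent_slower(3150, native_brp4g):+.1f}%")
print(f"BRP5  day 2: {percent_slower(12303, native_brp5):+.1f}%")
```

With only a handful of runs per case, these percentages are rough indicators rather than a benchmark.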

ExtraTerrestria...

Joined: 10 Nov 04

Posts: 770

Credit: 579986853

RAC: 199401

Thanks for your numbers so

30 Jan 2014 21:36:23 UTC

Message 119908 in response to message 119907


Thanks for your numbers so far guys!

Logforme, did you change the CPU load outside the VM? I could imagine some problem with task priorities and starving the GPU if e.g. tasks in the hypervisor run at normal priority and the VM completely at low.

Robl, that AMD is a very small GPU, far slower than your GTX650Ti. I'd at most run 2 concurrent WUs on it, but could well imagine that 1 is enough. It is about twice as fast as a HD5450, which used to need 36500s for a single WU with BRP4 1.28.

MrS

Scanning for our furry friends since Jan 2002

Anonymous

RE: Thanks for your numbers

30 Jan 2014 23:14:01 UTC

Message 119909 in response to message 119908


Quote:

Thanks for your numbers so far guys!

Logforme, did you change the CPU load outside the VM? I could imagine some problem with task priorities and starving the GPU if e.g. tasks in the hypervisor run at normal priority and the VM completely at low.

Robl, that AMD is a very small GPU, far slower than your GTX650Ti. I'd at most run 2 concurrent WUs on it, but could well imagine that 1 is enough. It is about twice as fast as a HD5450, which used to need 36500s for a single WU with BRP4 1.28.

I have downloaded BRP4G WUs and I am currently crunching a single GPU WU. I am now 6 hours into that one job and it is 60% complete. I feel that this job will complete. I will then set the utilization factor to process 2 concurrent tasks and see how that goes.

I knew when I bought this card it was not a top end card and that the effort was to get pass through to work first so I am not really that disappointed. I did not want to invest big $s in a GPU if pass-through would not work. I am still waiting to see how Logforme's GPU performs. I still have the option of moving a 650TI and might give it a go tomorrow.
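
As a rough illustration of how the utilization factor maps to concurrent tasks: the sketch below assumes a single GPU and that the client runs as many tasks as fit while their GPU fractions sum to at most 1. The helper name `concurrent_tasks` is hypothetical, not a BOINC function.

```python
import math

def concurrent_tasks(gpu_utilization_factor):
    """Number of tasks that fit on one GPU when each task claims this fraction of it.

    Assumption: tasks are packed until their fractions would exceed 1.0.
    A tiny epsilon guards against floating-point results just below a whole number.
    """
    if not 0 < gpu_utilization_factor <= 1:
        raise ValueError("factor must be in the range (0, 1]")
    return math.floor(1.0 / gpu_utilization_factor + 1e-9)

for factor in (1, 0.5, 0.33):
    print(f"factor {factor}: {concurrent_tasks(factor)} task(s) at once")
```

So a factor of .33 corresponds to 3 tasks at once, 0.5 to 2, and 1 to a single task.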


Anonymous

Finished the 1st BRP4G WU in

31 Jan 2014 14:44:56 UTC

Message 119910 in response to message 119909


Finished the 1st BRP4G WU in 35348 secs (run time) on a RADEON HD6450. No other WUs being processed.
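
To judge whether running 2 tasks at once is worth it, per-task runtime has to be turned into throughput. A minimal sketch, using the 35348 s figure from this post; the function names are made up for illustration, and the 2-task runtime is not yet known.

```python
SECONDS_PER_DAY = 86_400

def tasks_per_day(runtime_s, concurrent=1):
    """Throughput when `concurrent` tasks each take `runtime_s` seconds."""
    return concurrent * SECONDS_PER_DAY / runtime_s

def break_even_runtime(single_runtime_s, concurrent):
    """Per-task runtime at which `concurrent` tasks give the same throughput as one."""
    return single_runtime_s * concurrent

single = 35348  # seconds, one BRP4G WU running alone
print(f"1 task at a time: {tasks_per_day(single):.2f} WUs/day")
print(f"2 at once only helps if each finishes in under "
      f"{break_even_runtime(single, 2)} s")
```

If two concurrent tasks each take longer than the break-even runtime, the card is better off running one at a time.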

Logforme

Joined: 13 Aug 10

Posts: 332

Credit: 1714373961

RAC: 0

RE: Logforme, did you

31 Jan 2014 15:57:56 UTC

Message 119911 in response to message 119908


Quote:

Logforme, did you change the CPU load outside the VM?

I've deliberately kept my hands off the machine to get some consistent data.
The first evening should if anything have had less CPU load outside the VM, since the only other VM I run at the moment was still ramping up its CPU usage after the restart.
The second day showed runtimes very close to what they were when the GPU was in my Win7 workstation.
Today the runtimes for BRP5 (don't get any BRP4G anymore) are even faster. 10643 seconds on average. 6% faster than native.

CPU usage for the GPU VM is 23% (4 vCPU), other VM is 28% (2 vCPU) and the XenServer as a whole is only at 22% (8 CPU). Nowhere near being CPU starved.

My conclusion is that the GPU is running at (or very close to) native speed in the VM. The different runtimes must be due to the workunits, cosmic radiation or acts of the FSM.

ExtraTerrestria...

Joined: 10 Nov 04

Posts: 770

Credit: 579986853

RAC: 199401

Yeah, inconsistent

31 Jan 2014 21:09:05 UTC

Message 119912 in response to message 119911


Yeah, inconsistent single-digit timing differences can be due to running different WUs at Einstein. It's unlikely to get very different ones within a short time frame, but there's nothing forbidding it either.

Your first results with +30 - 40% time were surely not due to WU differences. But if these don't repeat we might just write them off as outliers.
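
One rough way to put numbers on that, using the native BRP4G runtimes posted earlier: compare how far each virtual average sits from the native mean, measured in native standard deviations. This is only a sanity check on five data points, not a proper statistical test, and `spread_units` is a made-up helper name.

```python
from statistics import mean, stdev

# Native BRP4G runtimes in seconds, as posted earlier in the thread.
native = [3024, 2972, 3218, 3217, 3210]

def spread_units(observed_avg, baseline):
    """Distance of an observed average from the baseline mean, in baseline standard deviations."""
    return (observed_avg - mean(baseline)) / stdev(baseline)

print(f"native mean {mean(native):.0f} s, stdev {stdev(native):.0f} s")
print(f"first evening (avg 4050 s): {spread_units(4050, native):.1f} stdevs away")
print(f"second day    (avg 3150 s): {spread_units(3150, native):.1f} stdevs away")
```

The first evening lands many standard deviations out, while the second day sits well inside the normal spread of the native runs.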

MrS

Scanning for our furry friends since Jan 2002

Anonymous

Spent the better part of

1 Feb 2014 13:15:43 UTC

Message 119913 in response to message 119912


Spent the better part of yesterday trying to get the Win 7 VM to play with a NVIDIA 650TI. No go.

I deleted and rebuilt the VM using the same Win 7 distro I had originally used. Ran Windows Update. After the installation completed, Windows Device Manager showed two VGA graphics devices. Same as when installing the ATI card. Dropped in the driver disk for the 650 card and it refused to install because it did not see/recognize the NVIDIA card. Looking at the DVD contents I clicked on "setup" under the drivers folder. This time the drivers along with other "pieces" installed. The "Standard VGA" entry changed to NVIDIA GTX 650 in Device Manager BUT had the dreaded yellow bang. Searched for new drivers. Windows found some and installed them. No change. Yellow bang. Unlike the ATI install, in which Win 7 placed a yellow bang by the "Standard VGA" entry and NOT by the ATI device, Win 7 now "bangs" the NVIDIA entry. The error code indicates that the drivers were not properly installed.

Deleted the VM, rebuilt another for the ATI device and am back to the original ATI card.

Would certainly like for someone to tell me what I did wrong because I have no idea other than in my reading NVIDIA cards seem to be a bit of a ....
