Adapting a GPU to a server platform

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0
Topic 219284

Greetings all,

I was recently able to acquire an HP DL360 G6 server. (Actually two; one is being used as a parts pig.) I have equipped the unit with two identical Xeon X5650 CPUs running at 2.67 GHz. This gives me 12 physical cores (24 virtual) with a total of 36 GB of ECC memory. The unit had VMware and two instances of Server 2008 R2, which I promptly blew away, and I loaded a single instance of Windows 7. The RAID array was left as configured: four 300 GB drives, giving a total of 831 GB. The BBWC battery is bad, so I'll need to replace that.
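For anyone checking the capacity math: if the array is RAID 5 (I haven't confirmed the level the previous owner used), the numbers work out roughly right once you account for parity and how Windows reports sizes. A quick sketch:

    # RAID 5 usable capacity sanity check.
    # ASSUMPTION: the array is RAID 5; I haven't confirmed the level.
    n_drives, drive_gb = 4, 300
    usable_gb_decimal = (n_drives - 1) * drive_gb    # parity costs one drive: 900 GB
    usable_gib = usable_gb_decimal * 1e9 / 2**30     # ~838 in Windows-style "GB" (GiB)
    print(usable_gb_decimal, round(usable_gib))

That lands close to the 831 GB reported, less some reserved space.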

I was able to load Win 7 and BOINC on the platform with no issues. I also run TThrottle and BoincTasks, as I run my boxes headless in the basement, and doing a Remote Desktop session into a box will kill the GPU. The first thing I had to do was set the number of usable processors to 50%, allowing one physical core for each WU and keeping the box from going into launch mode. With TThrottle, I set the max core temp to 60C, reducing fan noise and power consumption. The new box seems to run fine with this configuration.
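For anyone else running headless: as I understand it, the same processor cap can be set without the GUI by dropping a global_prefs_override.xml into the BOINC data directory. A rough sketch (the path and tag names are the standard ones as far as I know; verify against the BOINC docs for your version):

    # Minimal sketch: write a BOINC global_prefs_override.xml that caps
    # usable processors at 50% and leaves CPU time throttling at 100%.
    # ASSUMPTION: this path and the tag names match your BOINC install.
    from pathlib import Path

    override = """<global_preferences>
       <max_ncpus_pct>50</max_ncpus_pct>
       <cpu_usage_limit>100</cpu_usage_limit>
    </global_preferences>
    """

    # The default data directory on Windows 7 is usually C:\ProgramData\BOINC.
    Path(r"C:\ProgramData\BOINC\global_prefs_override.xml").write_text(override)

Then run boinccmd --read_global_prefs_override (or just restart BOINC) so the client picks it up.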

Now, if I were smart, I'd leave well enough alone and let GPU tasks run on the workstation, but where is the fun in that? :) I would really like to get the GPU moved over to the new box. Doing some research, I found that the PCIe slot in the server is limited to a card drawing no more than 75 watts, unless you add the optional HP power cable. Since the GT1030 draws 30 watts max, power should not be an issue. So yesterday afternoon, I shut down both the workstation and the new server, pulled the GPU out of the workstation, and installed it in the server. I set the BIOS to run from the optional video, disabled the integrated video, and booted the machine. The server came up and all was well under the standard VGA driver. Great! All I have to do is download and install the NVIDIA driver and I'm off to the races! NOT!

The driver downloaded and installed, but when I rebooted, Windows would hang as soon as it began to load drivers. In fact, it would not even come up in Safe Mode! The only way I was able to recover was using System Restore. (Note to self: always set a restore point before changing configuration.) In the process, after several lengthy reboots, I tried disabling the virtualization settings in the BIOS, to no avail. Windows would hang as soon as it loaded drivers. Obviously, I have a driver issue. Fortunately, System Restore got me back to the starting line and I didn't orphan any work.

I'm quite sure I'm not the only one to go down this rabbit hole. :) Last night, while searching for a solution, I read an interesting thread here on the forum concerning virtualization and GPUs. So, can anyone shed some light on this issue? Did I miss a step? Would switching the platform over to Linux allow me to run the GPU on the server hardware?

Any thoughts are welcome.

Thanks, Matt

Clear skies,
Matt
mikey
Joined: 22 Jan 05
Posts: 11851
Credit: 1826549757
RAC: 268404

Try an older driver for your card; the biggest problem I've seen with older machines is trying to load the latest drivers on them. I have some older machines too, and using drivers that are more than a year old allows my GPUs to crunch just fine.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

Thanks for the tip, Mikey; I did try that today, to no avail. I did get another card to work: a GT710 which was in another computer. Unfortunately, it runs at half the speed of the GT1030, but a half-speed GPU is better than no GPU. :) I did read somewhere that the GT710 is one of the cards that will work on a DL360 G6 server. I'll have to track down that list to see if I can find a better card that won't break the bank. Some of the high-end dedicated GPU processors, like the Tesla line, are not cheap!

I am considering a move to Linux, which might make it easier to cobble together older hardware. Now that I have a box freed up, I can play a little.

Clear skies,
Matt
mikey
Joined: 22 Jan 05
Posts: 11851
Credit: 1826549757
RAC: 268404

ka1bqp wrote:

Thanks for the tip, Mikey; I did try that today, to no avail. [...]

I found some old forum threads that had a lot of useful stuff in them, but it took a while to read through all the irrelevant material to find what I needed. My old boxes are HPs and Dells with dual quad-core Xeon CPUs, so the info was available; it just took me a LOT of research time to find it. The machines ran non-crunching GPUs until I found what I needed.

Some of my older machines will only take a 750 Ti GPU at most without a BIOS update, and I'm unwilling to do that since the machines crunch fine as they are. I did play around with some AMD GPUs, though, and some machines took them that wouldn't take newer Nvidia GPUs. I would not recommend buying any 750 Ti GPUs from China, though. A couple of years ago I bought 5 or 6 of them sold as 'brand new, still in the box,' and they were refurbs instead; most have quit on me. I also found one company from China selling them as 'brand new' that was assembling them from parts out of unsold stock bins; they were NOT Nvidia-quality GPUs and also died on me. I have one still running and three that kinda-sorta work sometimes, but mostly don't.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

The NVIDIA GT1030, running on my XW4600, ran surprisingly well for a low-end card. Last night, I pored through the Windows logs to see if I could confirm a driver issue, which I was able to do. It looks like the Vulkan driver is causing the issue. I don't think I'm dealing with a straight hardware issue, as the box would have punted on boot-up even before the drivers were loaded.
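If anyone wants to script that log search instead of clicking through Event Viewer, here's a rough sketch using pywin32; the source names I match on are my guess at how the display-driver entries are labeled:

    # Sketch: scan the Windows System event log for display-driver entries.
    # Requires pywin32 (pip install pywin32). ASSUMPTION: the relevant
    # events come from sources like "nvlddmkm" or mention "display".
    import win32evtlog

    log = win32evtlog.OpenEventLog(None, "System")  # None = local machine
    flags = (win32evtlog.EVENTLOG_BACKWARDS_READ |
             win32evtlog.EVENTLOG_SEQUENTIAL_READ)

    while True:
        events = win32evtlog.ReadEventLog(log, flags, 0)
        if not events:          # no more records
            break
        for ev in events:
            src = (ev.SourceName or "").lower()
            if "nvlddmkm" in src or "display" in src:
                print(ev.TimeGenerated, ev.SourceName, ev.EventID)

    win32evtlog.CloseEventLog(log)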

I'm thinking that a move to Linux is going to be the path forward. Reading some information on the Ubuntu website, it looks like the only difference between the server distro and the desktop version is the set of packages included. The OS core and kernel are the same, according to what I read. So, I should be able to load up a Linux test bed on my currently unused XW4600 and the GT1030. If I can get that configuration up and running, reliably crunching GPU tasks, I can consider reformatting the server and converting to Linux. While I'm at it, I'll reconfigure the RAID array; I don't think it is configured for the fastest R/W performance, and I don't need 899 GB for a dedicated BOINC box. In a previous life, this box was a VMware host for a pair of virtual machines, one of them a domain controller.

My DL360 G6 is currently running with 55% of the CPUs being utilized for BOINC. In addition, I reset my TThrottle settings to cap at 70C instead of 60C. The GT710 card is running reliably in this setup, with no computation errors, but compared to the GT1030 this card is painfully slow, providing, at best, 1/3 of the performance of the newer card. The CPU tasks seem to take slightly more time than they did on the XW4600 (it's hard to compare unless the tasks are identical), but since the server clock is 2.67 GHz, compared to 3 GHz on the workstation, I expected a small performance hit. The ability to run 12 CPU tasks concurrently more than makes up the difference. :)
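Back of the envelope, the clock difference costs about 12% per task, which the extra cores more than cover. A sketch, assuming task time scales inversely with clock speed (only roughly true across different CPU generations):

    # Rough throughput comparison: DL360 G6 vs. my XW4600 workstation.
    # ASSUMPTION: per-task time scales inversely with clock speed, which
    # ignores IPC differences between CPU generations.
    server_ghz, workstation_ghz = 2.67, 3.0
    slowdown = workstation_ghz / server_ghz          # ~1.12x longer per task
    concurrent = 12                                  # one task per physical core
    equivalent_cores = concurrent * server_ghz / workstation_ghz
    print(f"{(slowdown - 1) * 100:.0f}% slower per task, "
          f"but worth ~{equivalent_cores:.1f} workstation cores")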

Clear skies,
Matt
mikey
Joined: 22 Jan 05
Posts: 11851
Credit: 1826549757
RAC: 268404

ka1bqp wrote:

The NVIDIA GT1030, running on my XW4600, ran surprisingly well for a low-end card. [...]

I do similar things to my boxes too... install Linux, I mean. I like Ubuntu or Linux Mint, as both are more Windows-like than most, and I have no need for a server; I just like the 16 CPU cores (with HT) that bang out workunits. Mine run between 2.27 GHz and 2.93 GHz across the 5 PCs, but as I said, they just bang out workunits; like you, speed isn't what I'm after, it's quantity. Having a GPU in them that can crunch is a bonus for me too. I got mine off of eBay for under $200 US each, and all I had to do was add a hard drive and a GPU to each one. They all have at least 16 GB of RAM and booted right up when I got them.

One of the boxes is running Linux, but the others are running Windows 10 because it was a free update at the time. The problem is that as Windows has evolved, it's going beyond the capabilities of these boxes, so they will all go to Linux eventually. I started with a 160 GB hard drive in each one, but most now have a 250 GB SSD; one doesn't, but the new drive is on the shelf waiting for the PC to have some problems so I can take it down for a bit to make the changeover.

QuantumHelos
Joined: 5 Nov 17
Posts: 190
Credit: 64239858
RAC: 0

ka1bqp wrote:

Greetings all,

I was recently able to acquire an HP DL360 G6 server. [...]

Not to criticize, but get both servers running, even if only one core each! An alternative proposal: use both power packs, or get Corsair ATX Gold power supplies (about $100) and run both boxes. The new digital Corsair power units are efficient and, I have to say, wonderful!

QE

mikey
Joined: 22 Jan 05
Posts: 11851
Credit: 1826549757
RAC: 268404

ka1bqp wrote:

The driver downloaded and installed, but when I rebooted, Windows would hang as soon as it began to load drivers. [...] Would switching the platform over to Linux allow me to run the GPU on the server hardware?

Sorry I missed this part.

You cannot use the GPU for BOINC crunching in a 'virtual' environment in Windows; it just doesn't allow that. Windows also won't let you have multiple logins and switch between them with the GPU continuing to crunch the same file under each login.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

SUCCESS!!!

My new (to me) DL360 is up and crunching with the GT1030 on Windows 7! I'm quite embarrassed to admit my issue was a self-inflicted wound.

I have a habit of disabling any device in Device Manager which shows it doesn't have a driver. Apparently, the two such devices in this box did not like that. What tipped me off was that I encountered the same issue with the GT710 after I tried to put things back together at the end of my last Windows attempt; this was a card I knew had been working prior. Lesson learned: don't mess with other settings when working on a specific issue. I probably did this even knowing how long this box takes to boot after a configuration change.

So, after getting things running, I settled on the following settings: 55% of CPUs, running at 100%. This allows 12 concurrent CPU tasks, plus one additional GPU task. The box reports 24 processors, but there are only 12 physical cores. The box is set to favor background services over programs. TThrottle is set to a max CPU core temp of 70C and a max GPU temp of 80C. It's quite cool downstairs, even in the summer, and the temp graphs stay safely below the thresholds.
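The 55% isn't arbitrary: on this box it's the smallest setting that frees a thirteenth logical processor to feed the GPU task. The arithmetic, assuming BOINC rounds the percentage down (which matches what I see here):

    # Why 55%: on 24 logical CPUs it yields 13 usable processors --
    # 12 for CPU tasks (one per physical core) plus 1 to feed the GPU.
    # ASSUMPTION: BOINC rounds the percentage down, matching what I see.
    import math

    logical_cpus = 24
    for pct in (50, 55):
        usable = math.floor(logical_cpus * pct / 100)
        print(f"{pct}% of {logical_cpus} -> {usable} usable processors")
    # 50% -> 12 (no headroom for the GPU task); 55% -> 13 (12 + 1)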

I'm using TeamViewer to do maintenance tasks and BoincTasks to administer the box from upstairs.

Thanks again to everyone who helped out.

Matt

Clear skies,
Matt
cecht
Joined: 7 Mar 18
Posts: 1407
Credit: 2431522106
RAC: 1518892

KA1BQP, how many watts does that DL360 pull while crunching?

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

As configured, the power consumption reported in iLO is about 315 watts, and that's not running full tilt. I would imagine it could suck down every bit of juice the dual 460-watt supplies can produce. I've configured it for a single power supply. As stated above, I've limited the number of CPUs used in BOINC to 55%: one task per physical core, plus overhead for the GPU.
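For the curious, that load sits comfortably within a single supply; a quick check using the numbers above:

    # PSU headroom check with a single 460 W supply.
    draw_w, supply_w = 315, 460          # iLO reading; one PSU's rating
    print(f"load is {draw_w / supply_w:.0%} of capacity, "
          f"{supply_w - draw_w} W of headroom")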

The cooling fans are running at a respectable pace, but they are not in launch mode. :)

Matt

Clear skies,
Matt
