I am about to do something that no one has ever done before!

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18736907007
RAC: 7009446

Mikey, your spreadsheet needs a bit of updating to reflect current project states: who runs what, who is still active, etc.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3956
Credit: 46905392642
RAC: 64502885

Dido wrote:

I was wondering, do you own the computers with server CPUs on your profile here? 

Yes, I do. The EPYC systems make great multi-GPU systems.

_________________________________________________________________________

ASROCK
Joined: 12 Jan 15
Posts: 15
Credit: 1327826581
RAC: 0

Does anybody know why I can't merge some of my VMs from the "Computers" tab on the Web?

I have manually assigned different external IP addresses to some of them upon creation. Could that be the reason?

EDIT: I'm also seeing some weird GPU load dips from 98% to 55% on multiple GPUs in the load profiling software that the data center runs on the entire mainframe. I can confirm the CPU cores are at low utilization, so that's not the problem. Any ideas what could be causing that? Is there a debugging feature available for BOINC apps?

The app in question is "Gamma-ray pulsar binary search #1 on GPUs v1.24 () x86_64-pc-linux-gnu" 

mikey
Joined: 22 Jan 05
Posts: 12687
Credit: 1839092724
RAC: 3800

Keith Myers wrote:

Mikey, your spreadsheet needs a bit of updating to reflect current project states: who runs what, who is still active, etc.

It's not mine; it's the official Gridcoin whitelist page.

But yes, I agree: WCG is no longer run by IBM, for one, and Minecraft is no longer active, for another.

Rodrigo
Joined: 5 Aug 17
Posts: 22
Credit: 249784206
RAC: 4312

EDIT: I'm also seeing some weird GPU load dips from 98% to 55% on multiple GPUs in the load profiling software that the data center runs on the entire mainframe. I can confirm the CPU cores are at low utilization, so that's not the problem. Any ideas what could be causing that? Is there a debugging feature available for BOINC apps?

I'm sure someone can explain this better than me; if I'm wrong, please correct me. You mentioned in another post that the GPUs are FP32 capable. As far as I know, when that's the case with FGRPB1G tasks (the Gamma-ray app you mentioned), the FP64 portion of the task is done on the CPU, so the GPU load drops. But since you also mentioned the CPU cores are at low utilization, I don't really know what it could be.

Until last year I had a notebook with an Nvidia MX-150 dedicated GPU crunching FGRPB1G tasks. The last 10% of each task took longer, and the GPU load dropped while the CPU load increased. But the MX-150 is FP64 capable, so I'm not sure what's happening.

EDIT: The Tesla T4 is also FP64 capable, like the MX-150. 
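
One way to check whether the dips line up with the end of each task would be to log utilization per GPU and look at when the drops happen. This is only a rough sketch, not a BOINC feature: it assumes the log was captured with `nvidia-smi --query-gpu=timestamp,index,utilization.gpu --format=csv,noheader,nounits -l 1`, and the function name `find_load_dips`, the 90/60 thresholds, and the sample rows are made up for illustration.

```python
import csv
import io


def find_load_dips(csv_text, high=90, low=60):
    """Return (timestamp, gpu_index, utilization) for every sample where a GPU
    that was at or above `high` percent on its previous sample is now at or
    below `low` percent. Thresholds are illustrative, not tuned values."""
    last_util = {}  # previous utilization seen for each GPU index
    dips = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        timestamp, index, util = (field.strip() for field in row)
        try:
            util = int(util)
        except ValueError:
            continue  # skip non-numeric readings
        if last_util.get(index, 0) >= high and util <= low:
            dips.append((timestamp, index, util))
        last_util[index] = util
    return dips


# Hypothetical sample in the CSV layout assumed above (not real measurements).
SAMPLE = """\
2023/01/01 12:00:00.000, 0, 98
2023/01/01 12:00:01.000, 0, 55
2023/01/01 12:00:01.000, 1, 97
2023/01/01 12:00:02.000, 0, 98
2023/01/01 12:00:02.000, 1, 96
"""

if __name__ == "__main__":
    for dip in find_load_dips(SAMPLE):
        print(dip)
```

Comparing the dip timestamps with the task start and finish times in the BOINC event log should show whether the drops come from the tail end of each task or from something else on the host.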

Tom M
Joined: 2 Feb 06
Posts: 6451
Credit: 9577623186
RAC: 7748669

Dido,

Glad you got it running. On the regular leaderboard you would need all those systems running under a single computer ID to compete for the top individual system.

You should be able to compete for the top user, however.

Your nodes will use considerably more power under load than when idling. Who is paying for it?

I note you are showing 46 distinct systems. Not 1,000 :)

May this activity not cause you harm.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

ASROCK
Joined: 12 Jan 15
Posts: 15
Credit: 1327826581
RAC: 0

I thought I got it running, but no. If I deploy around 20 VMs concurrently, everything seems to work just fine, but as I increase the number of active VMs, everything becomes unstable. On some VMs the BOINC client randomly crashes; on others I see inexplicable performance degradation and chaotic fluctuations in GPU load, while CPU utilization never exceeds 40% per thread. I've looked extensively at the performance and load profiling data from the mainframe and I can't explain why these issues occur. This is a typical case of demonic possession. Perhaps I should hire an exorcist. WTF.

To answer your question about who is paying for the electricity: nobody. The entire data center has an independent power grid. The source of that power is renewable energy. When you see big companies brag about being "carbon neutral", this is what it means.

mikey
Joined: 22 Jan 05
Posts: 12687
Credit: 1839092724
RAC: 3800

Dido wrote:

I thought I got it running, but no. If I deploy around 20 VMs concurrently, everything seems to work just fine, but as I increase the number of active VMs, everything becomes unstable. On some VMs the BOINC client randomly crashes; on others I see inexplicable performance degradation and chaotic fluctuations in GPU load, while CPU utilization never exceeds 40% per thread. I've looked extensively at the performance and load profiling data from the mainframe and I can't explain why these issues occur. This is a typical case of demonic possession. Perhaps I should hire an exorcist. WTF.

I have no clue, but your process of figuring it out should help the programmers once you have to give it back to them again.

Quote:
To answer your question about who is paying for the electricity - nobody. The entire data center has an independent power grid. The source of that power is renewable energy. When you see big companies brag about being "carbon neutral", this is what it means.  

WOO HOO!!

Tom M
Joined: 2 Feb 06
Posts: 6451
Credit: 9577623186
RAC: 7748669

Did he ever demonstrate the ability to place a system in the top 5 or 10 at e@h?


Ian&Steve C.
Joined: 19 Jan 20
Posts: 3956
Credit: 46905392642
RAC: 64502885

Tom M wrote:

Did he ever demonstrate the ability to place a system in the top 5 or 10 at e@h?

Each host was set up as a VM with a single GPU. Nvidia T4s aren't that powerful and probably wouldn't even make the top 50.

But I'm not sure if he ever got it working. He said before that he had a lot of weird issues when all systems were under load.

_________________________________________________________________________
