NVIDIA/AMD temp monitoring script

Anonymous
Topic 197660

As requested in another thread.

This script runs under Ubuntu 12.04. Should not be an issue on other distros.

The script name is: sys_temp.pl
Place this in your $HOME/scripts/temp directory. You can put it anywhere
you want but changes within the script will have to be made to reflect this
change.

I have modified this script for posting to exclude my actual node names, but
will be using the following for examples: lunar1_raspi, lunar2, lunar3, lunar4.
In the scripts change these to your node names/hostnames.

There are embedded debug statements in the form of "print". These are
wrapped in if ($dbg) { } blocks. Setting my $dbg=1 will cause them to
print out. Make sure debug is not "on" when running as a cron job.
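
A minimal sketch of that debug pattern, for anyone who has not seen it before. The helper name debug_line and the sample reading are made up for illustration and are not taken from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Set to 1 while testing by hand; leave at 0 for cron runs,
# since cron typically mails any output the job prints.
my $dbg = 0;

# Hypothetical helper: returns the debug text, or '' when debugging is off.
sub debug_line {
    my ($flag, $label, $value) = @_;
    return $flag ? "DEBUG: $label = $value\n" : '';
}

my $current_celsius_temp = 61.0;   # sample value for illustration only

# Same shape as the guards in the script: print only inside if ($dbg) { }
if ($dbg) {
    print debug_line($dbg, 'current_celsius_temp', $current_celsius_temp);
}
```

With $dbg left at 0 the script prints nothing, which is what you want under cron.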

This script is written in perl and provides audible alarms and email when one
of the following events occurs:
1. an NVIDIA GPU exceeds an arbitrary threshold temperature
2. an AMD GPU exceeds an arbitrary threshold temperature
3. a RaspberryPi CPU exceeds an arbitrary threshold temperature

Support for CPU temps does not exist except for the Pi.

Some items to note:

1. These arbitrary temperatures are defined in the script using these
variables:
my $violation_temp_pi = 38.0;
my $violation_temp_nvidia = 64.0;
my $violation_temp_amd = 60.0;

These values can be changed to whatever values suit your location.
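
A rough sketch of how such a threshold check and the Celsius/Fahrenheit figures in the mail body could be computed. The helper names c_to_f and is_violation are illustrative, and the strict "greater than" comparison is an assumption, since the comparison line is not visible in the posted script.

```perl
use strict;
use warnings;

my $violation_temp_pi     = 38.0;
my $violation_temp_nvidia = 64.0;
my $violation_temp_amd    = 60.0;

# Celsius to Fahrenheit, for the second figure shown in the mail body.
sub c_to_f {
    my ($celsius) = @_;
    return $celsius * 9 / 5 + 32;
}

# Returns 1 when the reading is strictly above the limit, else 0.
# (Assumption: the script may use a different comparison.)
sub is_violation {
    my ($current, $limit) = @_;
    return $current > $limit ? 1 : 0;
}

my $current_celsius_temp = 66.0;   # sample reading for illustration
if (is_violation($current_celsius_temp, $violation_temp_nvidia)) {
    printf "NVIDIA GPU at %.1f C (%.1f F), limit %.1f C\n",
        $current_celsius_temp, c_to_f($current_celsius_temp),
        $violation_temp_nvidia;
}
```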

2. You will have to find some "free" sound files on the web for your
audible alarms. Each machine should have a unique sound file.
The ones I use are defined in the script as follows:

if ($node_name eq 'lunar1'){
    $alarm_sound = "$HOME/scripts/temp/sonar.mp3";
    $send_to = 'your_user@mail.lunar1';
}
elsif ($node_name eq 'lunar2'){
    $alarm_sound = "$HOME/scripts/temp/romulan_alarm.mp3";
    $send_to = 'your_user@mail.lunar2';
}

I would suggest using the full path in the above rather than "$HOME".

NOTE: your /etc/hosts will have to have "mail.lunar2", etc. in the alias
portion of your hosts entries. True for all other hosts.
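
If the if/elsif chain grows with every new host, a lookup table is one alternative. This is only a sketch: the %node_config layout, the config_for helper and the /home/your_user directory are placeholders, not part of the script.

```perl
use strict;
use warnings;
use Sys::Hostname qw(hostname);

# Full paths, as suggested above. Inside double quotes Perl reads "$HOME"
# as a Perl variable; the shell's value lives in $ENV{HOME}.
# /home/your_user is a placeholder directory.
my %node_config = (
    lunar1 => {
        alarm_sound => '/home/your_user/scripts/temp/sonar.mp3',
        send_to     => 'your_user@mail.lunar1',
    },
    lunar2 => {
        alarm_sound => '/home/your_user/scripts/temp/romulan_alarm.mp3',
        send_to     => 'your_user@mail.lunar2',
    },
);

# Returns the settings hash for a node, or undef if the node is unknown.
sub config_for {
    my ($node_name) = @_;
    return $node_config{$node_name};
}

my $node_name = hostname();
my $cfg = config_for($node_name);
if ($cfg) {
    print "alarm: $cfg->{alarm_sound}, mail to: $cfg->{send_to}\n";
} else {
    print "no entry for host '$node_name', add one to the table\n";
}
```

Adding a machine then means adding one table entry rather than another elsif branch.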

3. Please read the comments at the top of the script for prerequisites.

4. In the script you will notice a reference to "/usr/sbin/sendmail". This
is because when you install "postfix" it will generate "/usr/sbin/sendmail"
as a softlink to postfix. This might be an issue if your distro handles
this differently.

5. Would be nice to also monitor CPU temps. This requires accessing the
bios and I have not yet been successful using the "sensors" command or
the "dmidecode" command. This might not be as "easy" as getting the
GPU temps - BIOSes vary and so do distribution tools.
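
One avenue that may be worth trying for CPU temps: many Linux kernels expose thermal readings as plain text files under /sys/class/thermal, in millidegrees Celsius. The sketch below assumes thermal_zone0 exists and corresponds to the CPU, which is not guaranteed on every machine; the helper name millideg_to_celsius is made up for illustration.

```perl
use strict;
use warnings;

# Convert the raw sysfs text (millidegrees Celsius, e.g. "45500\n")
# to degrees Celsius. Returns undef if the text is not a whole number.
sub millideg_to_celsius {
    my ($raw) = @_;
    return undef unless defined $raw && $raw =~ /^\s*(-?\d+)\s*$/;
    return $1 / 1000;
}

# Assumed path; zone numbering and availability vary by machine and kernel.
my $zone_file = '/sys/class/thermal/thermal_zone0/temp';

if (open(my $fh, '<', $zone_file)) {
    my $raw = <$fh>;
    close $fh;
    my $temp = millideg_to_celsius($raw);
    print(defined($temp) ? "zone0: $temp C\n" : "could not parse $zone_file\n");
} else {
    print "no readable $zone_file on this machine\n";
}
```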

SCRIPT follows:

#!/usr/bin/perl -w

#
# sys_temp.pl - this perl script
#
# script function: a. send email when GPU temp exceeds a set value on a non-raspberry pi.
# In case of NVIDIA and AMD GPUs fan speed as a percentage will
# also be sent.
# b. send email when CPU temp exceeds a set value on a raspberry pi
#
# will be run as a cronjob every X minutes - you decide - see below for format
# initially the violation temps that trigger alarms/email will be an arbitrary value
#
# history:
# 080513 - initial write of script
# 080713 - made into a generic module which supports Linux Pi and non Pi Linux machines
# Temps on Pi are CPU temps
# Temps on non Pi machines are GPU temps. CPU temps are not provided.
# 081013 - Added alarm logic for Non Pi Linux. Pi Linux seems to have issues with sound
# 022214 - changed to postfix and away from ssmtp
# 070714 - rewrite to include AMD GPU
# supported GPUs: AMD and NVIDIA
# supported CPUs: PI only
# the sys_temp.pl script is expected to reside in $HOME/scripts/temp/
#
# prerequisites: sudo apt-get install perl $violation_temp){

#
# Pi mail starts here
#

if ($is_vcgencmd_present){ #must be Pi
#send email

open(MAIL, "|$email_program -t") || die "Can't open mail program $!\n";

# Mail Header

print MAIL "To: $send_to\n";
print MAIL "From: $mail_from\n";
print MAIL "Subject: $node_name $email_subject\n\n";

print MAIL <<EOF;

current CPU temp is: $current_celsius_temp$celsius_symbol ($current_fahrenheit_temp$fahrenheit_symbol) - which violates the max operating temp of: $violation_temp$celsius_symbol.

EOF
close MAIL;

}else{ # must be GPU machine, i.e., all other machines employ either NVIDIA or AMD GPUs

#
# Non Pi mail starts here
#
open(MAIL, "|$email_program -t") || die "Can't open mail program $!\n";

# Mail Header

print MAIL "To: $send_to\n";
print MAIL "From: $mail_from\n";
print MAIL "Subject: $node_name $email_subject\n\n";

print MAIL <<EOF;

current GPU temp is: $current_celsius_temp$celsius_symbol ($current_fahrenheit_temp$fahrenheit_symbol) - which violates the max operating temp of: $violation_temp$celsius_symbol.

current fan speed is: $fan_value$percent_symbol

EOF
close MAIL;

}

#
# sound alarm
# based upon hostname
# each host has a unique sound
#

system("/usr/bin/mpg123", "-q", "--loop", "5", $alarm_sound);
}
END SCRIPT
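
The part of the script that reads the temperatures did not survive in the listing above, so here is a rough sketch of how tool output could be parsed. It assumes the Pi reading comes from "vcgencmd measure_temp" (which prints something like temp=45.6'C) and the NVIDIA reading from "nvidia-smi --query-gpu=temperature.gpu,fan.speed --format=csv,noheader,nounits" (which prints something like 64, 45). Those commands and the two helper names are assumptions, not necessarily what the script uses.

```perl
use strict;
use warnings;

# Parse a line such as "temp=45.6'C" into a number, or undef on no match.
sub parse_vcgencmd {
    my ($line) = @_;
    return $line =~ /temp=(\d+(?:\.\d+)?)/ ? $1 + 0 : undef;
}

# Parse a line such as "64, 45" (temperature in C, fan speed in percent).
# Returns an empty list if either field is not numeric, e.g. "[N/A]".
sub parse_nvidia_smi {
    my ($line) = @_;
    return $line =~ /^\s*(\d+)\s*,\s*(\d+)/ ? ($1 + 0, $2 + 0) : ();
}

# Canned sample lines, so the sketch runs without either tool installed.
my $pi_temp = parse_vcgencmd("temp=45.6'C\n");
my ($gpu_temp, $fan_value) = parse_nvidia_smi("64, 45\n");
print "Pi CPU: $pi_temp C\n";
print "GPU: $gpu_temp C, fan: $fan_value%\n";
```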

Hope the cut and paste of the file is clean. I checked it and it seems fine but ....

You will need to install postfix NOT sendmail and postfix will need to be configured for email to work.

Once you make changes to the script for your hostnames you can use some mp3 file you can find on your distro for initial checkout.

Hope this helps.

Phil
Joined: 8 Jun 14
Posts: 579
Credit: 228493502
RAC: 0

NVIDIA/AMD temp monitoring script

Nice script, much larger than I thought it would be. Funny you are using Postfix. I mentioned several weeks ago that my Linux experience was about 10 years old. Back then I ran a couple of servers out of the house and was using Fedora for the web server and Postfix for the email server.

Postfix was a serious pain to set up but once working was a dream. Screaming fast, stable, and all but impossible to break into. It is actually written by a cryptographer as I recall.

I'm going to play around with your scripts a bit. I'll need to go get a good book on perl. Been a very long time, but I'm attempting to run this from a "master" station. My crunchers have no sound or monitors. Might see if I can get them to work somehow from my control station. If I can get them to work I'll post the how-to in this thread. Might be awhile since I should be returning to work in a few days.

Thanks for the post and the scripts.

Phil

Anonymous

RE: Postfix was a serious

Quote:

Postfix was a serious pain to set up but once working was a dream. Screaming fast, stable, and all but impossible to break into. It is actually written by a cryptographer as I recall.


For me Postfix was easier to configure than sendmail. And you're right. Much more secure. If you have a problem with postfix I can provide a quick config file that I used for this setup. Every cruncher box will have to have postfix installed for email to work. I did not consider configuring a domain "mail server". That becomes another level of effort.

Quote:

I'm going to play around with your scripts a bit. I'll need to go get a good book on perl.


The book I strongly recommend for Perl is: "Perl by Example", written by Ellie Quigley. I have the 3rd Edition but there is a later one. It is in my opinion one of the best technical books written.

Quote:

Thanks for the post and the scripts.


Happy to contribute.
robl

Quote:

Phil
Phil

My intent is to have a bare minimum of anything extra on my crunchers. I'm thinking having the "master" hit the crunchers with a request for temps and such, then have the master sound the alert so to speak.
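
One rough way to sketch that "master polls the crunchers" idea in Perl. It assumes key-based ssh login from the master to each cruncher and that nvidia-smi is available on them; the helper names remote_first_line and hosts_over_limit are made up for illustration, and the demonstration uses canned readings instead of real ssh calls.

```perl
use strict;
use warnings;

my @crunchers = qw(lunar2 lunar3 lunar4);   # example hostnames from this thread
my $violation_temp_nvidia = 64.0;

# Run a command on a remote host over ssh and return its first output line.
# BatchMode stops ssh from waiting on a password prompt when run from cron.
sub remote_first_line {
    my ($host, @cmd) = @_;
    open(my $ssh, '-|', 'ssh', '-o', 'BatchMode=yes', $host, @cmd)
        or return undef;
    my $line = <$ssh>;
    close $ssh;
    return $line;
}

# Return the hosts whose reading exceeds the limit. $fetch is a code ref,
# so the ssh call can be swapped for canned data when testing.
sub hosts_over_limit {
    my ($limit, $fetch, @hosts) = @_;
    my @hot;
    for my $host (@hosts) {
        my $line = $fetch->($host);
        next unless defined $line && $line =~ /^\s*(\d+(?:\.\d+)?)/;
        push @hot, $host if $1 > $limit;
    }
    return @hot;
}

# Real use would look something like this (not run here):
# my @hot = hosts_over_limit($violation_temp_nvidia, sub {
#     remote_first_line($_[0], 'nvidia-smi', '--query-gpu=temperature.gpu',
#                       '--format=csv,noheader,nounits') }, @crunchers);

# Demonstration with canned readings instead of ssh:
my %fake = (lunar2 => "58\n", lunar3 => "71\n", lunar4 => "64\n");
my @hot = hosts_over_limit($violation_temp_nvidia, sub { $fake{$_[0]} }, @crunchers);
print "over limit: @hot\n";
```

The master would then play the alarm and send the mail itself, so the crunchers need nothing beyond an ssh server and the GPU tool.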

Thanks for the book title. I've never dealt with perl whatsoever, but seeing as how the docs just put me off work for another 3 months I'm going to have some time on my hands.

Phil

Anonymous

here is a writeup on postfix config to support the above script.

This procedure is for an Ubuntu postfix install. Your distro might behave
differently.

1. a. remove sendmail if it exists
b. if postfix is installed incorrectly remove it:
sudo apt-get --purge remove postfix

2. install postfix
sudo apt-get install postfix

PAY ATTENTION TO THE OUTPUT ON AN ASCII SCREEN AND MAKE
THE CORRECT SELECTION

CHOOSE "INTERNET SITE"

you can relaunch the ASCII screen with:
sudo dpkg-reconfigure postfix

now edit /etc/postfix/main.cf paying attention to the following:
"relayhost" will cause email to be relayed through your mail server,
e.g., google.com. You will need to search/google for your
email provider's "relayhost" name

a. vi /etc/postfix/main.cf
relayhost = mail.something.something
you can find this probably by looking in your windows email client
myhostname = mail.lunar2 <---- must be in /etc/hosts

service postfix restart

b. /etc/hosts
192.168.xxx.xxx mail.lunar2 lunar2
the above must "match" the myhostname entry in main.cf

c. in the configuration of the email client, "evolution" in this case.

d. vi /etc/aliases
change the line:
#root: marc
to
root: your_user (i.e., john if this is your user)

run newaliases to generate aliases.db in /etc
newaliases
this will generate a new /etc/aliases.db
do an "ls" on this file to check current time stamp

B. sudo service postfix start

C. install dovecot
sudo apt-get install dovecot-imapd dovecot-pop3d

there must be a file by the name of /var/mail/your_username
and it must be owned by your user:group, i.e.,
-rw-r--r-- 1 ronjon ronjon 1633 Sep 6 22:08 ronjon
note the permissions
the directory /var/mail must be owned by root:mail
this is the reason for the mail_privileged_group = mail
entry in dovecot.conf (see below)

add the following two lines to /etc/dovecot/dovecot.conf

mail_privileged_group = mail
mail_location = mbox:~/mail:INBOX=/var/mail/%u

sudo service dovecot stop/start
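
Once postfix is configured, a small test mail is a quick way to check delivery before relying on the monitoring script. A minimal sketch, using the same header layout as sys_temp.pl; the helper name build_test_mail, the SEND_TEST_MAIL environment variable and the addresses are placeholders chosen for this example.

```perl
use strict;
use warnings;

# Build a minimal message in the same header layout the script uses.
sub build_test_mail {
    my ($to, $from, $node_name) = @_;
    return "To: $to\n"
         . "From: $from\n"
         . "Subject: $node_name postfix test\n\n"
         . "If you can read this, postfix on $node_name delivers mail.\n";
}

my $message = build_test_mail('your_user@mail.lunar2',
                              'your_user@mail.lunar2', 'lunar2');
print $message;

# Only hand the message to the mail system when explicitly asked to,
# e.g. by setting SEND_TEST_MAIL=1 in the environment (hypothetical switch).
if ($ENV{SEND_TEST_MAIL}) {
    open(my $mail, '|-', '/usr/sbin/sendmail', '-t')
        or die "Can't open mail program: $!\n";
    print {$mail} $message;
    close $mail or warn "sendmail exited with status $?\n";
}
```

If the mail does not arrive, /var/log/mail.log is the usual place to look on Ubuntu.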

Anonymous

RE: My intent is to have a

Quote:

My intent is to have a bare minimum of anything extra on my crunchers. I'm thinking having the "master" hit the crunchers with a request for temps and such, then have the master sound the alert so to speak.

Thanks for the book title. I've never dealt with perl whatsoever, but seeing as how the docs just put me off work for another 3 months I'm going to have some time on my hands.

I really like Perl. Some feel that Python is a more elegant/structured language and it might be. However I do not know Python and at 68 have no real desire to learn it. Perl was quite easy for me to pick up using the above book and I found that there is enough latitude in Perl to do some very bizarre "things" so be sure to comment. After you get experience with perl then you can do Perl/Tk. Tk is a graphical interface allowing you to design some pretty nice GUIs. You can see a network sniffer written in Perl/Tk that I wrote here

Quote:

Phil


Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6588
Credit: 311886627
RAC: 124795

Perl was written by a computer geek with linguist training and so that is reflected in the product. It is often loved for its natural ease, especially in learning.

However that is also the reason why some dislike it! :-)

Some feel that it lacks the rigour of other languages, where semantics are too context dependent. But like any tool you use it for a purpose. Probably the leading reasons are (a) you can get on with it, quick slick and (b) hard to beat for working with text. So for quick & dirty server scripts it is legend.

Do be careful about versions though, there's a distinct jump to version 6, so check which one you are reading about and what's actually being run.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Anonymous

RE: Do be careful about

Quote:


Do be careful about versions though, there's a distinct jump to version 6, so check which one you are reading about and what's actually being run.

Cheers, Mike.

Version 6 has been a work in progress since 2000 I believe. Any idea when it will be finished/released?

Mike Hewson

RE: RE: Do be careful

Quote:
Quote:


Do be careful about versions though, there's a distinct jump to version 6, so check which one you are reading about and what's actually being run.

Cheers, Mike.

Version 6 has been a work in progress since 2000 I believe. Any idea when it will be finished/released?


At perl.org, Perl6

Quote:
... is currently being developed by a team of dedicated and enthusiastic volunteers.


and so is nuclear fusion. :-)

Ironically it's Perl5 that's a movin'. Latest is 5.20, was ~5.14 say three years ago IIRC. Perl6 has a dev/compiler release called Rakudo, if I remember the name right.

I like Perl mainly because of all those special variables that you can just pluck when you want. I don't write it so much as read it, in order to decipher what some developer really wanted to do with some Linux package install. I also like llamas.

Like most learning exercises it's wise to activate good error feedback from the compiler eg.

#!perl -w

on the shebang line ( v5.6+ ). If nothing else it tells you what the compiler thinks about what it thinks you think you are doing ! :-) :-)
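
A small sketch of what that feedback looks like in practice. It uses "use strict" and "use warnings" rather than the -w switch (the pragma enables the same kind of checks, scoped to the file), and it captures the warning through a __WARN__ handler only so the effect is easy to see; the variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;     # undeclared variables become compile-time errors
use warnings;   # similar checks to -w, but scoped to this file

# Collect warnings instead of letting them go to STDERR, so we can inspect them.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $fan_value;                       # declared but never assigned
my $text = "fan: " . $fan_value;     # triggers an "uninitialized value" warning

print "warnings caught: ", scalar(@seen), "\n";
print $seen[0] if @seen;
```

Without warnings enabled, the empty value would slip silently into the string.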

Cheers, Mike.

( edit ) NB : despite the achievements at Perl obfuscation contests you can't write opaque binaries.

( edit ) So Perl is not a 'write-only' language. But you can fudge-up anything if you want to.
