NMon is useful to an administrator in many ways. It can be run interactively (as a topas replacement) or non-interactively to record data to a file. When you need historical performance data for a server, you have to set up something that collects it on a regular basis. In our shop, we capture NMon data 24/7 and keep 30 days' worth in case it is needed to isolate a problem.

Below is the shell program I wrote to collect the data with NMon. It includes logic to handle server reboots, so that NMon picks up roughly where it left off and captures the remaining data for the day. At the end of the day, all of the data files are merged into one file. The shell program does not yet convert the data files into HTML; that may be added in the future or written as a separate shell program.

We currently use NMon on AIX v5.3 only, and the shell program is tailored to AIX v5.3 for the time being. It can easily be altered to support other platforms running NMon (see the startNmon() function in the code below). The rest of the shell program should be portable between platforms. If you see room for improvement or have an idea for something to add, let me know!
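As a rough illustration of the restart handling described above, the number of snapshots left in the day can be derived from the current hour and minute. The sketch below is not part of nmonStart.ksh: snaps_remaining is a hypothetical helper name, and it assumes the same five-minute interval that the shell program uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of nmonStart.ksh): number of five-minute
# snapshots left in the day for a given hour (00-23) and minute (00-59).
snaps_remaining() {
    hour=$1
    min=$2
    # Strip one leading zero so that values such as "08" cannot be
    # misread as octal by shells that treat a leading 0 that way.
    hour=${hour#0}; hour=${hour:-0}
    min=${min#0};   min=${min:-0}
    # Minutes left until midnight, divided by the interval in minutes.
    echo $(( ((24 - $hour) * 60 - $min) / 5 ))
}

snaps_remaining 12 30    # prints 138
snaps_remaining 00 00    # prints 288
```

In a live run the arguments would come from `date +%H` and `date +%M`. Integer division rounds down, so a restart in the middle of an interval simply drops the partial snapshot.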
You will need the following in addition to the code below:

NMon - for collecting the data.
Main web page:
http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/nmon
The latest stable version for AIX is nmon4aix_11e.tar.gz.
Download here:
http://www-941.ibm.com/collaboration/wiki/download/attachments/437/nmon4aix_11e.tar.gz?version=1
The latest stable version for Linux is nmon4linux_power_11d.zip.
Download here:
http://www-941.ibm.com/collaboration/wiki/download/attachments/437/nmon4linux_power_11d.zip?version=2

NMon Merge - for assembling multiple data files.
Download here:
http://www-941.ibm.com/collaboration/wiki/download/attachments/437/nmonmerge.tar.gz?version=1

Performance Graph Viewer - for viewing charts based on the collected data.
Main web page:
http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/Performance+Graph+Viewer
The latest stable version for all platforms is pGraph.jar v1.4.
Download here:
http://www-941.ibm.com/collaboration/wiki/download/attachments/3714/pGraph.jar?version=2

nmonStart.ksh:
The latest version is 1.0.0.0
#!/usr/bin/ksh
#-------------------------------------------------------------------------------------------||
# Shellscript Name: nmonStart.ksh
# Version: 1.0.0.0
# Hostname: AIX Servers
# Created: 04/18/2007
# Created by: TEP
#
# Description:
# This script does several things. It starts Nmon on a server and keeps track of the
# snapshots taken. After a reboot, system failure, accidental PID kill, etc., Nmon is
# restarted to capture the snapshots remaining in the twenty-four hour period. If Nmon
# was restarted, all Nmon data files are combined into one file at the end of the cycle.
# The script takes an Nmon snapshot once every five minutes over a twenty-four hour
# period. This should give more than enough detail about what is happening on a server
# to pinpoint problems.
#
# The following assumptions are made:
# 1. You have nmon installed somewhere.
# 1a. "someDir/" (nmon files)
# 1b. This script (nmonStart.ksh) is installed in the same directory as 1a.
# 2. You have nmonmerge installed in the same directory as nmon.
# 2a. "someDir/nmonmerge"
# 3. You have the following sub-directories in the nmon directory:
# 3a. "someDir/data"
# 3b. "someDir/data/archive"
# 3c. "someDir/logs"
# 4. Scheduled in cron to run once a day starting at midnight.
# 4a. "00 00 * * * /someDir/nmonStart.ksh > /dev/null 2>&1"
# 5. Has an inittab entry for system restarts.
# 5a. "nmon:2:once:/someDir/nmonStart.ksh > /dev/null 2>&1 # Start Nmon Data Collection"
# 5b. The "once" action lets init continue booting while the script runs.
#
# Parameters:
# None.
#
# Modifications (include date and name when changes are made):
# 04/18/2007, tep - Initial script created.
# 04/30/2007, tep - Initial bugs resolved, released for production use.
#
#-------------------------------------------------------------------------------------------||
#+-- Directory Variables
workDir=/home/nmon
archiveDir=$workDir/data/archive
dataDir=$workDir/data
logDir=$workDir/logs
#+-- Common Variables
currentDate=`date +%m%d%Y`
host=`hostname`
osLevel=`oslevel`
#+-- File Variables
countFile=$logDir/nmonCount.$currentDate
#+-- Functions
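cleanUp(){
    #+-- Call Function(s)
    readCount
    #+-- Get date from $countFile (second dot-separated field of the data filename)
    fileDate=`echo $F3 | cut -d "." -f2`
    #+-- Stop if no date was read; an empty pattern would match every file
    [ -n "$fileDate" ] || return 1
    #+-- CleanUp: archive the merged file, then remove the day's data and log files
    mv $dataDir/$fileName.1 $archiveDir/$fileName.csv
    rm -f $dataDir/*$fileDate*
    rm -f $logDir/*$fileDate*
}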
process(){
    #+-- Call Function(s)
    readCount
    #+-- Get filename (host.date.nmon); assumes the hostname contains no dots
    fileName=`echo $F3 | cut -d "." -f1-3`
    #+-- Count the number of data files
    fileTest=`ls $dataDir/$fileName.* | wc -l`
    if [ $fileTest -gt 1 ]; then
        #+-- Combine extra files into the first one.
        #+-- The anchored pattern skips only ".1", not ".10", ".11", and so on.
        fileList=`ls $dataDir/$fileName.* | grep -v '\.nmon\.1$'`
        for file in $fileList; do
            $workDir/nmonmerge -a $dataDir/$fileName.1 $file
        done
    fi
    #+-- Call function(s)
    cleanUp
}
readCount(){
    #+-- Read $countFile: F1 = snapshots left, F2 = time written, F3 = data filename
    read F1 F2 F3 < $countFile
}
restart(){
    #+-- Call Function(s)
    readCount
    snapsLeft
    #+-- Setup next dataFile
    fileName=`echo $F3 | cut -d "." -f1-3`
    oldFileNum=`echo $F3 | cut -d "." -f4`
    (( newFileNum = $oldFileNum + 1 ))
    dataFile=$fileName.$newFileNum
    #+-- Interval between snapshots in seconds
    secs=300
    #+-- Number of snapshots to capture
    snaps=$snapsLeft
    #+-- Call function(s)
    startNmon
}
snapsLeft(){
    #+-- Minutes Left Calculation:
    #+-- Minutes Left = ( "Hours/Day" minus "Current Hour" ) multiplied by "Minutes/Hour"
    #+-- An example: ( 24 - 12 ) * 60 = 12 * 60 = 720 minutes
    currentHour=`date +%H`
    (( minLeft = ( 24 - $currentHour ) * 60 ))
    #+-- Snaps Left Calculation:
    #+-- Snaps Left = ( "Minutes Left" minus "Current Minutes" ) divided by "5"
    #+-- An example: ( 720 - 30 ) / 5 = 690 / 5 = 138 snapshots left
    currentMin=`date +%M`
    (( snapsLeft = ( $minLeft - $currentMin ) / 5 ))
}
start(){
    #+-- Function Variables
    dataFile=$host.$currentDate.nmon.1
    #+-- Interval between snapshots in seconds
    secs=300
    #+-- Number of snapshots to capture
    snaps=288
    #+-- Call function(s)
    startNmon
}
startNmon(){
    #+-- Start NMon based on O/S Level
    #+-- Note: Only AIX 5.3 is checked, modify for 5.1, 5.2, or Linux
    if [ "$osLevel" = "5.3.0.0" ]; then
        #+-- Start the capture
        $workDir/nmon_aix53 -F $dataDir/$dataFile -TEP -s$secs -c$snaps
        #+-- Write out the number of snapshots to take, the time, and filename.
        echo "$snaps `date +%T` $dataFile" > $countFile
    fi
}
trackSnaps(){
    #+-- Track snapshots
    while [ $snaps -gt 0 ]; do
        #+-- Wait for the next interval
        sleep $secs
        #+-- Subtract one snapshot for each interval capture.
        #+-- This is to keep track of the number of intervals
        #+-- in case of a system restart or some other issue.
        (( snaps = $snaps - 1 ))
        #+-- Write out how many snapshots are left, the time, and filename.
        echo "$snaps `date +%T` $dataFile" > $countFile
    done
}
#+-- Start Script
#+-- A count file for today means an earlier run was interrupted, so resume it.
if [ -f $countFile ]; then
    restart
else
    start
fi
trackSnaps
process
#+-- End Script
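
As one possible starting point for the multi-platform support mentioned in the introduction, the binary selection in startNmon() could be pulled out into a small helper. This is only a sketch under stated assumptions: pickNmonBinary is a hypothetical function name, and of the binary names below only nmon_aix53 comes from the shell program above. The others are placeholders that must be changed to match whatever is actually installed.

```shell
#!/bin/sh
# Hypothetical helper: map an OS name and level to an nmon binary name.
# Only nmon_aix53 appears in nmonStart.ksh; every other name here is a
# placeholder chosen for illustration.
pickNmonBinary() {
    osName=$1
    osLevel=$2
    case "$osName" in
        AIX)
            case "$osLevel" in
                5.3.*) echo "nmon_aix53" ;;
                5.2.*) echo "nmon_aix52" ;;   # placeholder name
                5.1.*) echo "nmon_aix51" ;;   # placeholder name
                *)     return 1 ;;            # unsupported AIX level
            esac
            ;;
        Linux)
            echo "nmon_linux"                 # placeholder name
            ;;
        *)
            return 1                          # unsupported platform
            ;;
    esac
}

pickNmonBinary AIX 5.3.0.0    # prints nmon_aix53
```

Inside startNmon() the arguments could come from `uname -s` and `oslevel` (or `uname -r` on Linux), and a non-zero return status would mean that no capture should be started. Note that the 5.3.* pattern is looser than the exact "5.3.0.0" comparison in the original, and that the nmon command-line flags may differ between platforms, so they should be checked before reusing the same invocation.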