Osmius

Osmius agent for Solaris
Agent name: osm_ag_SOLARIS1    Agent code: SOLARIS1
Subject: User manual for the Osmius agent for Solaris systems
Date: 06/11/2008    Revision date: 17/11/2008

General Information

This agent can monitor several parameters of Solaris systems, initially SPARC systems running Solaris 10. It has been tested against a Sun T1000 SPARC system running SunOS 5.10. However, we recommend checking its functionality before deploying it in production environments.

The Osmius Solaris agent has been developed using features and enhancements of the Osmius framework and the ACE libraries, so the ACE libraries must be installed for the agent to be deployed and run properly. See chapter: installation.

The SOLARIS1 agent provides up to 14 basic events, with configuration parameters that make it simple to scale. The events were selected by the Osmius Research and Development Team as the most interesting ones for these systems.

All of this agent's events are local, so the agent must run on the same server you want to monitor. The agent uses Solaris system calls as well as OS commands.

Solaris Instance

As a general rule, each Osmius agent can monitor one instance type. If you are not familiar with these concepts, check the glossary. Each instance is defined individually in the configuration file (for further information, go to agents and instances). The agent type determines the instance type, and the instance type determines the connection information.

CONNECTION_INFO

The connection information, or connection_info, is the data the agent needs in order to connect to the instance. (See more about the connection_info)

For the Solaris agent the connection_info is empty, because there is nothing to connect to: the agent already runs inside the instance it monitors.

CONNECTION_INFO= 

TYPE

The type defines the instance type to be monitored. Every declared instance must be associated with a type.

For Solaris Instances:

TYPE= SOLARIS1

Event summary table for Solaris

The table below briefly summarizes the capabilities of this agent; each event is described in more detail further down this page.

EVENT     DESCRIPTION                                        c  w              a              t (seconds)  Extra parameters / Remarks
SOLUPTIM  Number of seconds since last startup               1  600            300            600          Silent mode (-s) recommended
SOLPRCPU  CPU load %                                         0  85             95             300          Useful for capacity planning
SOLNUCPU  Number of detected CPUs                            1  3              3              604800       Silent mode (-s) recommended
SOLPRMEM  Used memory %                                      0  75             90             600          Silent mode (-s) recommended
SOLFRMEM  Free memory (MB)                                   1  256            128            600          Silent mode (-s) recommended
SOLPRSWP  Used swap memory %                                 0  75             90             600          Silent mode (-s) recommended
SOLFRSWP  Free swap memory (MB)                              1  Administrator  Administrator  600          Silent mode (-s) recommended
SOLIP4IN  Number of active IP4 interfaces                    1  Administrator  Administrator  600          Silent mode (-s) recommended
SOLIP6IN  Number of active IP6 interfaces                    1  Administrator  Administrator  600          Silent mode (-s) recommended
SOLNUPRC  Number of running processes                        0  Administrator  Administrator  600          Silent mode (-s) recommended
SOLPRDWN  Checks that all processes in the list are running  0  1              1              600          Silent mode (-s) recommended; -L processes_list
SOLPRUFS  Checks space usage of the filesystems in the list  0  80             90             600          Silent mode (-s) recommended; -L filesystem_list
SOLUSERS  Number of connected users                          0  30             50             600          Silent mode (-s) recommended
SOLLOG01  Searches for a string in files                     0  1              1              600          Silent mode (-s) recommended; -L log_file_txt -S string; SOLLOG00 to SOLLOG99 can be defined (100 events)
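
The c, w, a and t columns correspond to the -c, -w, -a and -t options used in the parameter setting examples on this page. As a rough illustration of how such a line could be read and evaluated, here is a minimal Python sketch. It is not the agent's code: parse_event and severity are hypothetical helpers, treating -s as a flag without a value is an assumption, and so is the inclusive threshold comparison (it is inferred from the Direct/Inverse descriptions and from -1 being the error value).

```python
import shlex

def parse_event(line):
    """Split 'NAME = -t 300 -c 0 ...' into (name, {option: value})."""
    name, _, args = line.partition("=")
    tokens = shlex.split(args)          # keeps quoted -T text as one token
    opts = {}
    i = 0
    while i < len(tokens):
        option = tokens[i].lstrip("-")
        # Assumption: an option followed by another option (or by nothing)
        # is a plain flag such as -s.
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        opts[option] = tokens[i + 1] if has_value else True
        i += 2 if has_value else 1
    return name.strip(), opts

def severity(value, opts):
    """Classify a returned value against the -w and -a thresholds."""
    if value < 0:
        return "error"                  # every event returns -1 on error
    warning, alert = float(opts["w"]), float(opts["a"])
    if opts["c"] == "0":                # direct: higher value, higher severity
        if value >= alert:
            return "alert"
        if value >= warning:
            return "warning"
    else:                               # inverse: lower value, higher severity
        if value <= alert:
            return "alert"
        if value <= warning:
            return "warning"
    return "ok"
```

With the SOLPRCPU example line from this page (-c 0 -w 85 -a 95), a value of 90 would be classified as a warning under these assumptions.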

Information Events

Info events retrieve general data about the instance; this data usually does not change over time. Events of this kind have no severity, they simply provide instance details.

EVENT     DESCRIPTION               t (seconds)    Remarks
SLINFNAM  Hostname                  86400 (1 day)  DNS name
SLINFOSK  OS version                86400 (1 day)  From the kernel
SLINFMCH  Hardware info             86400 (1 day)  Data about the machine the agent is running on
SLINFTMZ  Time zone                 86400 (1 day)  Data about the time zone
SLINFMEM  Memory                    86400 (1 day)  Physical memory installed in the server
SLINFCPU  Processors info           86400 (1 day)  CPU model, speed, etc.
SLINFNET  Network interfaces        86400 (1 day)  -
SLINFFSM  Filesystems               86400 (1 day)  Info about filesystems
SLINFTOP  top output (first lines)  86400 (1 day)  Shows the most resource-consuming processes
SLINFUSR  Users                     86400 (1 day)  Shows up to 40 system users

Solaris Events

SOLUPTIM

SOLUPTIM returns uptime in seconds since last system reboot.

Return values:

VALUE MEANING
-1 Error
X Number of seconds

Recommended parameters:

Comparison type Inverse. The higher the value the lower the severity (-c 1)
Monitoring interval 300 seconds – 1 hour –> depends on instance importance
Warning threshold Contact your Solaris administrator
Alert threshold Contact your Solaris administrator

Parameter setting example:

SOLUPTIM = -t 300 -c 1 -w 240 -a 60 -T "Solaris uptime"

Remarks: The text associated with this event returns the uptime in a human-readable format like [X] days [Y] hours [Z] minutes.
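
As an illustration of the conversion from the returned number of seconds to that kind of text, here is a small Python sketch. The function name format_uptime is hypothetical and the exact wording produced by the agent may differ.

```python
def format_uptime(seconds):
    """Turn an uptime in seconds into 'X days Y hours Z minutes'."""
    days, rest = divmod(seconds, 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)      # 3600 seconds per hour
    minutes = rest // 60                  # leftover seconds are dropped
    return f"{days} days {hours} hours {minutes} minutes"
```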

SOLPRCPU

SOLPRCPU returns the CPU load percentage used by all the processes in the system.

Return values:

VALUE MEANING
-1 Error
0 - 100 CPU Load %

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold 90 - Depends on system's load
Alert threshold 95 - Depends on system's load

Parameter setting example:

SOLPRCPU = -t 300 -c 0 -w 85 -a 95 -T "CPU Load %"

Remarks: This event runs ksh -c "/usr/bin/sar -u 1 2 | tail -1 | awk '{ print $5}'" and subtracts that value (the idle percentage) from 100. If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.
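
The same calculation can be followed step by step in Python. This is only a sketch of the pipeline described above, applied to captured text: cpu_load_from_sar is a hypothetical name and SAMPLE_SAR is made-up output in the usual sar -u layout (time, %usr, %sys, %wio, %idle).

```python
# Made-up sample of "sar -u 1 2" output, used only for illustration.
SAMPLE_SAR = """
10:00:00    %usr    %sys    %wio   %idle
10:00:01       3       2       0      95
10:00:02       5       1       0      93

Average        4       2       0      94
"""

def cpu_load_from_sar(sar_output):
    """Return 100 minus the idle percentage found on the last line."""
    last_line = sar_output.strip().splitlines()[-1]   # tail -1
    idle = float(last_line.split()[4])                # awk '{ print $5 }'
    return 100 - idle
```

For the sample above the last line is the average, whose fifth field is 94, so the computed load is 6.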

SOLNUCPU

SOLNUCPU returns the number of CPUs installed in the system. This can be useful for detecting CPU failures.

Return values:

VALUE MEANING
-1 Error
X Number of detected CPUs

Recommended parameters:

Comparison type Inverse. The lower the value, the higher the severity (-c 1)
Monitoring interval 1 week – 1 month or never –> depends on instance importance
Warning threshold Contact your Solaris administrator
Alert threshold Contact your Solaris administrator

Parameter setting example:

SOLNUCPU = -t 604800 -c 1 -w 23 -a 12 -T "Number of CPUS"

Remarks: This event runs ksh -c "uname -X | grep CPU | cut -f3 -d ' '". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.

SOLPRMEM

SOLPRMEM returns the percentage of memory used by all the processes in the system.

Return values:

VALUE MEANING
-1 Error
0 - 100 Used memory %

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Depends on system's load
Alert threshold Depends on system's load

Parameter setting example:

SOLPRMEM = -t 300 -c 0 -w 60 -a 75 -T "% used memory"

Remarks: This event uses Solaris system libraries and sysconfig.h development functions.

SOLFRMEM

SOLFRMEM returns the number of free MBytes in the system memory.

Return values:

VALUE MEANING
-1 Error
X Free Megabytes

Recommended parameters:

Comparison type Inverse. The lower the value, the higher the severity (-c 1)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Depends on system's load
Alert threshold Depends on system's load

Parameter setting example:

SOLFRMEM = -t 300 -c 1 -w 256 -a 128 -T "Free memory MB"

Remarks: This event uses Solaris system libraries and sysconfig.h development functions.

SOLPRSWP

SOLPRSWP returns the used Swap memory percentage.

Return values:

VALUE MEANING
-1 Error
0 - 100 Used Swap %

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Depends on system's load
Alert threshold Depends on system's load

Parameter setting example:

SOLPRSWP = -t 300 -c 0 -w 40 -a 60 -T "Used swap %"

Remarks: This event runs ksh -c "swap -s | awk '{print $9$11}'". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.

SOLFRSWP

SOLFRSWP returns the amount of free swap memory in MBytes.

Return values:

VALUE MEANING
-1 Error
X Swap memory free MBytes

Recommended parameters:

Comparison type Inverse. The lower the value, the higher the severity (-c 1)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Depends on system's load
Alert threshold Depends on system's load

Parameter setting example:

SOLFRSWP = -t 300 -c 1 -w 200 -a 100 -T "Swap memory Free MBytes"

Remarks: This event runs ksh -c "swap -s | awk '{print $11}' | cut -dk -f1". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.
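
Both swap events (SOLPRSWP and SOLFRSWP) read the 9th and 11th fields of the swap -s summary line, which are the "used" and "available" amounts with a trailing k. The Python sketch below shows how those fields could be turned into a percentage and a free amount in MB. The function names and the sample line are hypothetical, and the percentage formula used / (used + available) is an assumption, since this page does not state how the agent combines the two fields.

```python
# Made-up sample line in the layout printed by "swap -s".
SAMPLE_SWAP = "total: 1024k bytes allocated + 512k reserved = 1536k used, 4608k available"

def parse_swap(swap_line):
    """Return (used_kb, available_kb) from a 'swap -s' summary line."""
    fields = swap_line.split()
    used_kb = int(fields[8].rstrip("k"))         # awk $9
    available_kb = int(fields[10].rstrip("k"))   # awk $11, then cut -dk -f1
    return used_kb, available_kb

def used_swap_percent(swap_line):
    """Assumed formula: used / (used + available) * 100."""
    used_kb, available_kb = parse_swap(swap_line)
    return 100.0 * used_kb / (used_kb + available_kb)

def free_swap_mb(swap_line):
    """Available swap converted from KB to MB."""
    return parse_swap(swap_line)[1] / 1024
```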

SOLIP4IN

SOLIP4IN returns the number of active IP4 Interfaces.

Return values:

VALUE MEANING
-1 Error
X Number of IP4 Interfaces

Recommended parameters:

Comparison type Inverse. The lower the value, the higher the severity (-c 1)
Monitoring interval 600 seconds – 1 hour –> depends on instance importance
Warning threshold Number of needed IP4 interfaces
Alert threshold Number of needed IP4 interfaces

Parameter setting example:

SOLIP4IN = -t 300 -c 1 -w 4 -a 4 -T "IP4 Interfaces"

Remarks: This event uses the ACE portable libraries, so it can be used on other platforms without changing the code. If you are interested, see how the ACE wrappers are used in the code.

SOLIP6IN

SOLIP6IN returns the number of active IP6 Interfaces.

Return values:

VALUE MEANING
-1 Error
X Number of IP6 Interfaces

Recommended parameters:

Comparison type Inverse. The lower the value, the higher the severity (-c 1)
Monitoring interval 600 seconds – 1 hour –> depends on instance importance
Warning threshold Number of needed IP6 interfaces
Alert threshold Number of needed IP6 interfaces

Parameter setting example:

SOLIP6IN = -t 300 -c 1 -w 4 -a 4 -T "IP6 Interfaces"

Remarks: This event uses the ACE portable libraries, so it can be used on other platforms without changing the code. If you are interested, see how the ACE wrappers are used in the code.

SOLNUPRC

SOLNUPRC returns the total number of processes in the system.

Return values:

VALUE MEANING
-1 Error
X Number of processes

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Contact your administrator
Alert threshold Contact your administrator

Parameter setting example:

SOLNUPRC = -t 300 -c 0 -w 3000 -a 5000 -T "Total number of processes"

Remarks: This event runs ksh -c "ps -A | wc -l | awk '{ print $1 }'". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.

SOLPRDWN

SOLPRDWN checks whether all of the processes in the list are up and running.

Extra parameters:
This event needs an extra parameter to work:

PARAMETER MEANING Mandatory
-L "proc1[,proc2,procN]" - Processes list (don't use spaces between them, only ","). Yes

Return values:

VALUE MEANING
-1 Error
0 OK. All the processes are running
1 At least one of the processes in the list is down

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold 1
Alert threshold 1

Parameter setting example:

SOLPRDWN = -t 300 -c 0 -w 1 -a 1 -L "osmius,osm_ag_SOLARIS1,sshd" -T "Checking processes: Osmius and sshd"

Remarks: If one of the processes is not running, the text associated with the event looks like this: Process [procN] not found. This event uses the "ps" command. If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.
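
The check itself amounts to comparing the comma-separated -L list against the names of the processes currently running. A minimal Python sketch of that comparison follows. check_processes is a hypothetical helper and the set of running names is passed in rather than read from ps, so that the logic can be followed without a Solaris system.

```python
def check_processes(process_list, running):
    """process_list is the -L value, e.g. "osmius,sshd".
    running is a set of process names currently seen on the system.
    Returns (value, text) following the return values table above."""
    missing = [name for name in process_list.split(",") if name not in running]
    value = 1 if missing else 0          # 1: at least one process is down
    text = " ".join(f"Process [{name}] not found" for name in missing)
    return value, text
```

For example, with the list "osmius,sshd" and only osmius and init running, the result would be 1 with the text "Process [sshd] not found".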

SOLPRUFS

SOLPRUFS checks if the filesystems supplied in the list are used more than the percentage threshold defined by the user.

Extra parameters:
This event needs an extra parameter to work:

PARAMETER MEANING Mandatory
-L -L "fs1[,fs2,fsN]" - Filesystems list (don't use spaces between them, only ","). Yes

Return values:

VALUE MEANING
-1 Error
0 - 100 occupied % of the fullest filesystem

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold 80 - Depends on filesystem usage
Alert threshold 90 - Depends on filesystem usage

Parameter setting example:

SOLPRUFS = -t 600 -c 0 -w 80 -a 90 -L "/,/var/log,/tmp"

Remarks: The text associated with this event returns the filesystem with the highest percentage of used space.
This event uses "df -v <filesystem> | tail -1". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.
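
To illustrate how the fullest filesystem could be picked from those outputs, here is a short Python sketch. The helper names and the sample texts are hypothetical, and instead of relying on a fixed column position the sketch simply looks for the field ending in "%" on the last line, which is an assumption about the df -v layout.

```python
# Made-up samples of "df -v <filesystem>" output, for illustration only.
SAMPLE_ROOT = """Mount Dir  Filesystem           blocks      used      free  %used
/          /dev/dsk/c0t0d0s0  20000000  12000000   8000000    60%
"""
SAMPLE_TMP = """Mount Dir  Filesystem           blocks      used      free  %used
/tmp       swap                1000000    910000     90000    91%
"""

def used_percent(df_output):
    """Return the used percentage found on the last line, or -1 on error."""
    last_line = df_output.strip().splitlines()[-1]      # tail -1
    for field in last_line.split():
        if field.endswith("%"):
            return int(field.rstrip("%"))
    return -1

def fullest_filesystem(df_outputs):
    """df_outputs maps each filesystem in the -L list to its df output.
    Returns (percentage, filesystem) for the fullest one."""
    name = max(df_outputs, key=lambda fs: used_percent(df_outputs[fs]))
    return used_percent(df_outputs[name]), name
```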

SOLUSERS

SOLUSERS returns the number of users connected to the Solaris system.

Return values:

VALUE MEANING
-1 Error
X Number of users

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold Contact your Solaris administrator
Alert threshold Contact your Solaris administrator

Parameter setting example:

SOLUSERS = -t 300 -c 0 -w 100 -a 300 -T "users"

Remarks: This event runs ksh -c "who -q | tail -1 | cut -f2 -d'='". If you modify it, please send us your changes to get involved in this open source project and help us improve Osmius.
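
The last line printed by who -q has the form "# users=N", which is why the pipeline keeps the text after the "=" sign. A Python sketch of the same extraction, using a made-up sample and a hypothetical function name:

```python
# Made-up sample of "who -q" output, for illustration only.
SAMPLE_WHO = """root     osmius   osmius
# users=3
"""

def connected_users(who_output):
    """Return the user count from the last line of 'who -q' output."""
    last_line = who_output.strip().splitlines()[-1]   # tail -1
    return int(last_line.split("=")[1])               # cut -f2 -d'='
```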

SOLLOG01

SOLLOG01 searches for occurrences of the supplied string in a text file.

Extra parameters:
This event needs two extra parameters.

PARAMETER MEANING Mandatory
-S -S "string" - String to search for in the text file Yes
-L -L "text_file" - Full path to the text file in which to search for the string Yes

Return values:

VALUE MEANING
-1 Error
0 No matches found
1 At least 1 match found in the text file

Recommended parameters:

Comparison type Direct. The higher the value, the higher the severity (-c 0)
Monitoring interval 60 seconds – 1 hour –> depends on instance importance
Warning threshold 1
Alert threshold 1

Parameter setting example:

SOLLOG01 = -t 60 -c 0 -w 1 -a 1 -S "error" -L "/home/osmius/osmius/test.txt"

Remarks: This event remembers the last position read in the text file, and on subsequent executions the search starts from that position.
Use event names from SOLLOG00 to SOLLOG99 (100 different events).
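
The idea of resuming from the last position read can be sketched in Python as follows. This is an illustration only: search_from is a hypothetical function, the caller is assumed to keep the offset between executions, and restarting from the beginning when the file has shrunk is an assumption, since this page does not say how the agent handles rotated files.

```python
import os

def search_from(path, needle, offset):
    """Search for needle in the part of the file after offset.
    Returns (value, new_offset); value is 1 if a match was found, else 0."""
    if offset > os.path.getsize(path):   # assumed: file was truncated or rotated
        offset = 0
    with open(path, "rb") as log_file:
        log_file.seek(offset)            # skip what was already examined
        new_data = log_file.read()
        new_offset = log_file.tell()     # remember where this read stopped
    value = 1 if needle.encode() in new_data else 0
    return value, new_offset
```

Calling it again with the returned offset only examines text appended since the previous call, so a match that was already reported is not reported twice.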

Prerequisites

Compiling this agent requires a set of prerequisites, which are the generic ones needed to compile any Osmius agent (see the prerequisites).

Please check that all the command line calls work properly on your system.

Makefiles and Compiling

  • Osmius uses Make Project Creator (MPC), so creating Makefiles is a trivial task. To learn more about MPC and Osmius, check out the Makefiles section on Osmius.
  • For the Solaris Osmius agent, you can generate the Makefile as follows, from the agent directory, using a console or terminal:

$ACE_ROOT/bin/mpc.pl -type gnuace osm_ag_solaris.mpc

  • Now that the Makefile has been created, compiling the agent is simple:

gmake -f Makefile.Osm_Ag_Solaris_Osmius

Binaries are automatically installed in the bin directory under the OSM_ROOT base directory.

Running the agent

The Solaris Osmius agent is started and stopped in the same way as the other Osmius agents. You can check it out in the section Start and Stop Agents.

To run the agent without the Osmius Web Console:

Running in standalone mode

The Solaris Osmius agent, like the other Osmius agents, can be executed in standalone mode. This option may be particularly useful when developing a new agent or when performing specific agent tests.

Basically, you have to add a new value, called SNDCMD, to the configuration file of the Osmius agent for Solaris (osm_ag_SOLARIS1.ini).

Then you must run the agent setting the Master Agent communications port to zero, for example:

osm_ag_SOLARIS1 -c osm_ag_SOLARIS1.ini -m 00000000 -p 0 -d

Tests list

Tests performed on the Osmius agent for Solaris.

Date: 19/11/2008
Test                                                                                 Result                           Remarks
Creating an instance with all its events in silent mode                              OK                               N/A
Creating an instance with all its events with custom text                            OK                               N/A
Creating an instance with all its events but no custom text                          OK                               N/A
Declaring 3 instances with all their events at 5 seconds, kept running for 48 hours  OK, but only using one instance  Everything works fine and there are no memory leaks
Declaring 2 instances, causing a disconnect and then reconnecting                    N/A                              N/A
Declaring 1 instance and testing each event                                          OK                               All the events behave as expected
Removing a general parameter and checking that the agent does not start              OK                               N/A
Removing the instance CONN_INFO and checking that the agent does not start           N/A                              This agent has no connection info

Results after 10 days of intensive tests:

  PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
19724 osmius   7872K 6728K sleep   47    4   0:00:01 0.0% osm_ag_SOLARIS1/2
19724 osmius   7872K 6728K sleep   39    4   0:00:02 0.0% osm_ag_SOLARIS1/2
19724 osmius   7872K 6728K sleep   47    4   0:00:05 0.0% osm_ag_SOLARIS1/2
19724 osmius   7936K 6792K sleep   39    4   0:11:44 0.0% osm_ag_SOLARIS1/2
19724 osmius   7936K 6792K sleep   47    4   0:28:41 0.0% osm_ag_SOLARIS1/2
19724 osmius   7936K 6792K sleep    0    4   2:14:40 0.0% osm_ag_SOLARIS1/2
 
en/agentes/solaris1.txt · Last modified: 2012/12/05 18:18 by osmius
 