As we now know, Osmius monitors the Instances of our installation by periodically requesting Types of Events. Once the events reach the Central Server, they are correlated and the status of the Instances and the Services is updated.
Let’s look at the elements that make up this process.
The Osmius Agents are in charge of periodically executing the actions associated with each Type of Event. They know how to collect the percentage of CPU use on a Linux machine or the number of users connected to a Database.
In Osmius, an agent is responsible for collecting events from only one type of Instance. Therefore, we will have Agents for instances of Oracle, Windows, or Stock Market values, which would normally use the API provided by the manufacturer of each type of instance. The Osmius Agents are written in C++ and use the ACE Framework (ADAPTIVE Communication Environment) as well as Osmius's own Framework, which makes nearly all the code reusable when we create a new agent, so that we can concentrate on collecting the new types of events needed.
Each agent, when started, reads its own setup file, parses it and starts to monitor the instances defined in it. For each instance defined in the setup file, it reads the configured events with their values and execution periods, and starts to monitor them. Every time it reads a new value and text for a type of event on a specific instance, it creates an event that is sent to the master agent.
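The periodic collection described above can be pictured with a small scheduling sketch. It is not the Osmius implementation (the real agents are written in C++ on top of ACE); the function name `run_schedule` is hypothetical, and the sketch assumes that the first collection of each event happens one full period after start-up.

```python
import heapq


def run_schedule(periods, horizon):
    """Return (time, event_type) pairs for every collection due up to `horizon` seconds.

    `periods` maps an event type code to its execution period in seconds.
    """
    # Each heap entry is (next due time, event type); the earliest is popped first.
    queue = [(period, name) for name, period in periods.items()]
    heapq.heapify(queue)
    fired = []
    while queue and queue[0][0] <= horizon:
        due, name = heapq.heappop(queue)
        # A real agent would collect the value here and send an event to the master agent.
        fired.append((due, name))
        # Re-arm the same event type one period later.
        heapq.heappush(queue, (due + periods[name], name))
    return fired


if __name__ == "__main__":
    for when, event_type in run_schedule({"OSPRCCPU": 300, "OSPRCMEM": 600}, 900):
        print(when, event_type)
```

A priority queue keeps the loop simple when each event type has its own period, because the agent only ever needs to look at the next item due.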
A Linux Agent setup file looks like this:
#Use Error event criticity when no connection occurs instead of Critical. [0-Use
ERRCON = 0
#Local Port for listening to commands from Master Agent
PORTCM = 11982
#Timeout in seconds for network operations. Don't change
TIMOUT = 130

# Instance Type
TYPE = LINUX001
# Connection string used by the agent to connect to the instance
CONNECTION_INFO = No
[OSMIUS_INSTANCES\OSM_Host\EVENTS]
OSNUMPRC = -t 3600 -c 0 -w 300 -a 500 -T ""
OSPRCCPU = -t 300 -c 0 -w 80 -a 95 -T ""
OSPRCMEM = -t 600 -c 0 -w 80 -a 90 -T ""
OSPRCSWP = -t 3600 -c 0 -w 10 -a 30 -T "" -s
OSPRCUFS = -t 3600 -c 0 -w 80 -a 95 -T "" -s -L "/"
OSUPTIME = -t 300 -c 1 -w 500 -a 300 -T ""
The first part contains the generic parameters of the agent (for example, TIMOUT = 130 is the timeout in seconds for network operations), and the second part contains the setup for the instance called OSM_Host, of instance type LINUX001.
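To make the layout of such a file concrete, here is a minimal sketch of a reader for it. It assumes only what is visible in the listing: lines starting with `#` are comments, a line in square brackets opens a section, and everything else is a `KEY = VALUE` pair. The function name `parse_setup` and the choice to file leading parameters under an empty section name are assumptions of this sketch, and the parser actually used by the agents may behave differently.

```python
def parse_setup(text):
    """Return {section: {key: value}}; keys before any [section] go under ''."""
    sections = {"": {}}
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # section name without the brackets
            sections.setdefault(current, {})
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore anything that is not a KEY = VALUE pair
            sections[current][key.strip()] = value.strip()
    return sections


# A shortened excerpt of the Linux Agent setup file.
SAMPLE = r'''
#Timeout in seconds for network operations. Don't change
TIMOUT = 130
[OSMIUS_INSTANCES\OSM_Host\EVENTS]
OSPRCCPU = -t 300 -c 0 -w 80 -a 95 -T ""
'''

if __name__ == "__main__":
    setup = parse_setup(SAMPLE)
    print(setup[""]["TIMOUT"])
    print(setup[r"OSMIUS_INSTANCES\OSM_Host\EVENTS"]["OSPRCCPU"])
```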
Within the instance setup, the EVENTS section configures each event to be monitored. For example, the CPU percentage (OSPRCCPU) is monitored every 5 minutes (-t 300), and we will receive a warning if it is over 80 (-w 80) and a critical alarm if it is over 95 (-a 95).
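The threshold logic in that example can be sketched as follows. Only the meanings of `-t`, `-w` and `-a` come from the text; the other flags are parsed but not interpreted, and the names `parse_event_options` and `classify` as well as the result labels are hypothetical. Events such as OSUPTIME, whose alarm threshold is lower than its warning threshold, presumably compare in the opposite direction and are not covered by this sketch.

```python
import shlex


def parse_event_options(option_string):
    """Turn '-t 300 -c 0 -w 80 -a 95 -T ""' into a dict keyed by flag letter."""
    tokens = shlex.split(option_string)  # honours the quoted "" and "/" arguments
    options = {}
    i = 0
    while i < len(tokens):
        flag = tokens[i].lstrip("-")
        # A flag followed by another flag (or by nothing) is treated as a bare switch.
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            options[flag] = tokens[i + 1]
            i += 2
        else:
            options[flag] = True
            i += 1
    return options


def classify(value, options):
    """Compare a collected value with the warning (-w) and alarm (-a) thresholds."""
    warning = float(options["w"])
    alarm = float(options["a"])
    if value > alarm:
        return "CRITICAL"
    if value > warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    cpu = parse_event_options('-t 300 -c 0 -w 80 -a 95 -T ""')
    for sample in (50, 85, 97):
        print(sample, classify(sample, cpu))
```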
The Osmius agents can also work separately from the rest of Osmius, that is, with no master agent supervising them. In “Stand Alone” mode, they send their events through a script which can, in turn, send an email or connect to another Monitoring System.
You can find the complete Osmius agents handbook here.
Each Master Agent is in charge of controlling and managing a set of Osmius Agents within its own server. The Master Agents are the ones in charge of:
Basically, the Master Agents take on all the management tasks needed to maintain the monitoring infrastructure. This way, the Agents only do what they have to do, “monitor instances and send events”, and big infrastructures become easier and more flexible to manage.
With a Master Agent and a selection of Osmius Agents we can monitor many instances remotely, but there will be cases where the agents need to be installed locally on every server.
The setup file of a Master Agent looks like the following:
#Master Agent Unique Code
CODMST = MASTER01
#Osmius server IP
IPSRV1 = 127.0.0.1
#Osmius Server Message receiver Port
PORTS1 = 2000
#Master Host IP used to accept commands
IPADCM = 127.0.0.1
#Master Port used to accept commands from the Central Server
PORTCM = 1970
#Server IP Address to send Master Agent commands
IPSDCM = 127.0.0.1
#Server Port Address to send Master Agent commands
PORSCM = 1971
#Master Port used to receive messages and events from our agents
PORTAG = 1950
#Timeout in seconds for network operations. Don't change
TIMOUT = 160
#Debug all commands sent to the master agent
DEBUGA = 0

# Show debug info 0-No 1-Yes
DEBINF = 1
# Start this agent when the master start working 0-No 1-Yes
STARTA = 1
# Additional command line if needed
ACMDLN =
The first part contains the general parameters of the Master Agent, and the second contains the parameters specific to each Agent, which indicate whether the Agent should be started and whether it must be launched in debug mode.
The Central Server receives the events of every Agent in the monitoring infrastructure that we have set up in our installations, and is also in charge of sending the required tasks to be processed by the Master Agents.
In fact, the Central Server is composed of a set of processes:
These processes act as an interface between the Database, which is updated with the actions of the users in the Control Panel, and all the infrastructure of the Master Agents and the Osmius Agents.
Configuration files of these processes:
# IP Address in which listen for incoming messages from Master Agents
# If 0.0.0.0 server will listen in all available interfaces.
IPMAMS = 192.168.1.2
# Local port in which listen for incoming messages from Master Agents. Default 2001.
PORTMS = 2001
# Time out for network and queue operations in seconds.
TIMOUT = 120
# Maximum number of retries to connect to server.
RECONN = 4
# Osmius Repository Database Parameters.
DBUSER = osmius
DBPASS = osmius
DBPORT = 3306
DBNAME = osmius
DBHOST = localhost

# IP Address used to accept commands.
IPSDCM = 192.168.1.2
PORSCM = 2002
# Maximum number of retries to connect to server.
RECONN = 4
# Osmius Repository Database Parameters.
DBNAME = osmius
DBHOST = localhost
DBPORT = 3306
DBUSER = osmius
DBPASS = osmius
# Number of retries to process one task.
RETRIS = 50
# Interval Time to process tasks
TIMTSK = 10
# Number of tasks to process in one interval
NUMTSK = 10
# Print all command sent and received or not
DEBUGA = 0

# IP Address of the state manager.
IPSTMG = 192.168.1.2
#Osmius server IP
IPSRV1 = 192.168.1.2
#Osmius Server Message receiver Port Default 2001
PORTS1 = 2001
# Maximum number of retries to connect to server.
RECONN = 4
# Osmius Repository Database Parameters.
DBNAME = osmius
DBHOST = localhost
DBPORT = 3306
DBUSER = osmius
DBPASS = osmius
# Number of retries to process one task.
RETRIS = 50
# Time to Data Warehouse calculations
TIMDWH = 300
# Interval Time to calculate global note
TIMGLN = 10
# Interval Time to test infrastructure state
TIMLIF = 300
# Number of master to process in one test interval
NUMLIF = 100
# Print all command sent and received or not
DEBUGA = 0

# Maximum number of retries to connect to server.
RECONN = 4
# Osmius Repository Database Parameters.
DBNAME = osmius
DBHOST = localhost
DBPORT = 3306
DBUSER = osmius
DBPASS = osmius
# Number of hours in which a notification is valid.
NHOURS = 8
# Interval Time to process notifications
TIMNTF = 60
DEBUGA = 0

#Server Unique Code
CODSVR = SERVER01
# Osmius Repository Database Parameters.
DBNAME = osmius
DBHOST = localhost
DBPORT = 3306
DBUSER = osmius
DBPASS = osmius
# Timeout to abort the discovery process
TIMOUT = 1800
# Script to search the net finding alive hosts
SCANNT = nmap.sh
# Print all command sent and received or not
DEBUGA = 0
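The task-related parameters above (RETRIS, TIMTSK, NUMTSK) suggest that tasks are processed in batches at a fixed interval, with a cap on retries. The following sketch shows one plausible reading of those comments and nothing more: the function `process_interval`, the in-memory queue and the `handler` callback are all hypothetical, and the real process reads its tasks from the Osmius database.

```python
from collections import deque


def process_interval(queue, handler, num_tasks, max_retries):
    """Process up to `num_tasks` queued tasks; requeue failures until `max_retries`.

    `queue` holds (task, attempts) pairs. `handler(task)` returns True on success.
    Returns the tasks that were given up on during this interval.
    """
    dropped = []
    for _ in range(min(num_tasks, len(queue))):
        task, attempts = queue.popleft()
        if handler(task):
            continue  # task completed, nothing more to do
        attempts += 1
        if attempts < max_retries:
            queue.append((task, attempts))  # try again in a later interval
        else:
            dropped.append(task)  # retry budget (RETRIS in the config) exhausted
    return dropped


if __name__ == "__main__":
    pending = deque([("reload agent", 0), ("unreachable master", 0), ("stop agent", 0)])
    always_fails = {"unreachable master"}
    # One call stands for one TIMTSK interval; NUMTSK and RETRIS are passed in explicitly.
    while pending:
        gone = process_interval(pending, lambda t: t not in always_fails, 2, 2)
        print("left:", list(pending), "dropped:", gone)
```

Capping the batch size keeps a burst of tasks from blocking the process for a whole interval, and the retry limit stops a task aimed at an unreachable Master Agent from circulating forever.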
The Osmius Database stores all kinds of information:
Any user with a Browser (preferably Firefox) and the appropriate permissions can connect to the control panel and start to monitor and manage the entire infrastructure.
The Osmius Control Panel is a Web application written in Java and based on various Frameworks; it runs in a Tomcat Server with a database connection. When a user changes the setup of an element, the change is stored in the database and, if necessary, tasks are generated for the Central Server processes to handle. Those tasks will often change the behavior of an agent on a remote server.