Infrastructure Maintenance

The Infrastructure tab is divided into three areas:

A: A tree displaying all the Master Agents along with the Agents under control.

The Master Agents can have different statuses (colors).

Color Master Agent Status
Started Started
Paused Paused
Stopped Stopped
Error The actual status of the Master Agent is not as expected 1)

The Agents can have different statuses (colors).

Color Agent Status
Started Started
Stopped Stopped
Error The actual status of the Agent is not as expected 2)

B: A list with all the Master Agents and associated data:

Host Name of the machine where the Master Agent has been installed
Identifier Internal code identifying the Master Agent
IP IP Address of the machine where the Master Agent has been installed
Port Port where the Master Agent receives the tasks to be performed
Start Last task requested from the control panel (Start, Pause, Stop)
Status Actual status of the Master Agent (Started, Paused, Stopped)
Platform Operating system running on the machine where the Master Agent has been installed
Agents Number of Agents controlled by the Master Agent
Instances Number of Instances among all the Agents controlled by that Master agent
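
The relationship between the Start column (last requested task), the Status column (actual status) and the Error state described in footnote 1) can be sketched as follows. This is only an illustration: the function names and the string values are assumptions made for the example and are not part of Osmius.

```python
# Hypothetical mapping from the last requested task to the status
# the Master Agent is expected to report afterwards.
EXPECTED = {"Start": "Started", "Pause": "Paused", "Stop": "Stopped"}


def displayed_status(requested, actual):
    """Return the status to show in the tree: the actual status when it
    matches the last request, otherwise "Error"."""
    return actual if EXPECTED[requested] == actual else "Error"


def likely_cause(requested, actual, pending_tasks):
    """Give a hint about why the status is not the expected one.

    Following footnote 1), a mismatch is attributed to pending tasks when
    there are any, and to an unrequested stop or start otherwise.
    """
    if EXPECTED[requested] == actual:
        return None
    if pending_tasks > 0:
        return "pending tasks"
    return "stopped or started without an order from the control panel"
```

For example, a Master Agent whose last requested task is Pause but which still reports Started, with two tasks pending, would be shown as Error with "pending tasks" as the likely cause.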

C: A list with all the pending tasks or a list with the executed tasks.

Master Name of the Master Agent to execute the task
Agent Agent associated to the task (in case there is one)
Instance Instance associated to the task (in case there is one)
Type of Task Refresh Agent Configuration, Refresh Master Agent Configuration, Restart Master Agent, Stop Master Agent or Pause Master Agent
Reattempts Number of reattempts to run the task
Status Pending, Processing, OK, error and not executed (a similar but more updated task was executed)
Execution Date Date to update the task
Result Text returned by the Master Agent after running the corresponding task

osmius_infraestructure01_bis.jpg

We can perform all the actions concerning the infrastructure maintenance from different points of this screen:

  1. Supervise the deployment of new Master agents.
  2. Start, pause, stop, drop and change the configuration of the Master Agents.
  3. Start or stop the agents and change their configurations.
  4. Set the Master Agents from which we want to monitor every instance.
  5. Supervise and maintain the tasks originated by any of the previous actions.
  6. Install new agents or update the existing ones.
  7. Change Master Agent description.

Each of the following actions creates one or several specific tasks which will be executed by the corresponding Master Agents.

Profiles

Only the root user can perform the functions associated with infrastructure maintenance.

Deployment of a New Master Agent

A user can manually deploy a new Master Agent on a given machine (see Deployment of a Master Agent). The infrastructure screen then automatically displays the Master Agent together with the name of the machine where it is running. The Server processes the tasks caused by the deployment: it sends the initial configuration of each of the Agents controlled by the new Master Agent, and an initial configuration for the Master Agent itself that includes an internal code identifying it uniquely.

The server also creates a new instance whose type matches the operating system of that machine and sends it to the corresponding Agent, which starts monitoring it immediately. The instance name is the same as the internal code of the Master Agent, and the instance belongs to the “Discovered” group of instances.

From this moment on, the user can perform any task on that Master Agent from the control panel: assign instances to monitor, start the required agents, change the configuration parameters of the Master Agent or those of any of its Agents, etc.

Action Induced Task
Deployment of a New Master Agent 1) Update Agent Configuration 3)
2) Restart the Master Agent

Restart a Master Agent

A user may need to manually restart a Master Agent that has already been deployed in the infrastructure. When the server processes the tasks caused by this action, it automatically sends the Master Agent its current configuration as well as the configuration of each of its agents. From this moment on, the user can perform any task on this Master Agent from the control panel: assign the instances to monitor, start the required agents, change its configuration parameters or those of any of its Agents, etc.

This Master Agent can be restarted on another machine, but it must first be stopped from the console.

Action Induced Task
Restart a Master Agent 1) Update Agent Configuration 4)
2) Restart the Master Agent

Master agents

osmius_infraestructure02_bis.jpg

From this screen, accessed by clicking on the machine name of a Master Agent, we are able to do the following actions:

A) From the context menu:

1) Pause a Master Agent.
2) Start a Master Agent.
3) Stop a Master Agent. Caution: once the Master Agent is stopped, it cannot be restarted from the control panel; it must be deployed again manually from the machine in order to be registered in the system (see Restart a Master Agent).
4) Master Agent configuration reload. The server resends the latest configurations of its Agents, the latest Master Agent configuration, the binary files of all of its Agents and also the user scripts.
5) Delete a Master Agent. This deletes all the associated agents and tasks. Caution: make sure to stop the Master Agent before deleting it from the system, so that no undesired processes remain running. This is necessary because a Master Agent does not die when it loses the connection with the Server; instead, it keeps waiting for notification that the Server has recovered in order to continue monitoring.
6) Agents deployment in the Master Agent. We can select the Agents that will monitor from this Master Agent. The server sends the binary files of these Agents to the Master Agent.
7) Modify the Master Agent description. Changes the current description of the Master Agent. The description is used in the infrastructure tree when the DESC option is checked: tree items are then displayed by description rather than by hostname. It is also displayed in the Master Agent selection elements, such as in the Instance Management screen.

Action Induced Task
Pause a Master Agent Pause the Master Agent
Start a Master Agent Restart the Master Agent
Stop a Master Agent Stop the Master Agent
Master Agent configuration reload Send a zip file for each of its Agents with their binaries
Deploy (unzip) the binary files
Update the configuration of each Agent
Send a zip file for each user script (user defined events) in this Master Agent
Deploy (unzip) the user script binary files
Restart the Master Agent
Delete a Master Agent NONE
Agents deployment in the Master Agent Send a zip file for each of its Agents with their binaries
Deploy (unzip) the binary files
Update the configuration of each Agent
Restart the Master Agent
Modify the Master Agent description. Changes the current description of the Master Agent. The description is used in the infrastructure tree when the DESC option is checked: tree items are then displayed by description rather than by hostname. It is also displayed in the Master Agent selection elements, such as in the Instance Management screen.

B) This area displays the following information about the Master Agent:

Host Name of the machine where the Master Agent has been installed
Start Last task required from the control panel (Green: Start, Orange: Pause, Grey: Stop)
Status Current Status of the Master Agent (Started, Paused, Stopped)
Pending Tasks Number of Tasks pending to execute for this Master Agent
Platform OS used in the machine where this Master Agent is running
IP MA IP Address of the machine where the Master Agent has been installed
Port MA Port used for the Master Agent to receive the tasks to process
Backup Port MA Port used by the Master Agent to receive the backup tasks to process, for example the reception of binary files.
IP Proxy IP Address of the proxy machine or Master Agent Proxy located between server and Master Agent
Proxy Port Port used for the proxy to receive the tasks
Backup Proxy Port Port used for the proxy to receive the backup tasks

C) This area allows the user to display and/or edit the following parameters of the Master Agent:

CODMST Internal code of the Master Agent. This code was automatically assigned when the Master Agent was deployed.
IPSRV1 IP Address of the Event Manager.
PORTS1 Port of the Event Manager.
IPADCM IP Address of the Master Agent used to accept the tasks (commands) sent by the Task Manager Server through the control panel.
PORTCM Port of the Master Agent to accept the tasks.
IPSDCM IP Address of the Task Manager, used to send the tasks to the Master Agent and to receive its restart or deployment requests.
PORSCM Port of the Task Manager used to send the tasks to the Master Agent.
PORTAG Port of the Master Agent used to accept the events from its dependent Agents. (ADJUSTABLE)
MSGQSI Maximum buffer size in KBytes used by the Master Agent to store messages if it can't send them to the server (minimum 1 MB). (ADJUSTABLE)
RECONN Number of minutes that the Master Agent keeps trying to reconnect to the server after the connection has been lost. (ADJUSTABLE)
DEBUGA Create a log with all the tasks (commands) received (1) or only the essential ones (0). (ADJUSTABLE)
MPROXY Whether this Master Agent is a Master Agent Proxy (1) or not (0). (ADJUSTABLE)
IPGTWY IP Address where this Master Agent Proxy receives events and deployment requests from other Master Agents. (ADJUSTABLE)
PORGTW Port where this Master Agent Proxy receives deployment requests from other Master Agents. (ADJUSTABLE) The port where this Master Agent Proxy receives events from other Master Agents is PORTAG.

When a Master Agent loses its connection with the Central Server, it spends RECONN minutes trying to reconnect, meanwhile keeping all the messages produced by its agents so that they can be sent later when possible. Once this time has elapsed, the Master Agent pauses (and therefore monitors nothing). When the Central Server starts again, it automatically starts this Master Agent.

For each Master Agent, RECONN and MSGQSI can be estimated from the number of messages produced by all its agents and the memory available on the machine where it runs, so that no messages are lost if the connection with the Central Server fails. The Master Agent needs 1545 bytes of memory to store each message locally (although only 200 bytes are actually sent over the network). For example, the Master Agent can store up to 678 messages in 1 MB. When the memory cache is full, the Master Agent flushes all stored messages and starts again.
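
The estimate described above can be worked through with a few lines of Python. This is a minimal sketch, not an Osmius tool: the function names are made up for the example, the message rate is something we have to measure ourselves, and 1 MB is taken as 1024 KBytes of 1024 bytes, which is the reading that yields the 678 messages quoted above.

```python
import math

BYTES_PER_MESSAGE = 1545   # local storage needed per message (from the text)
MIN_MSGQSI_KB = 1024       # MSGQSI is given in KBytes, minimum 1 MB


def max_buffered_messages(msgqsi_kb):
    """Number of whole messages that fit in a buffer of msgqsi_kb KBytes."""
    return (msgqsi_kb * 1024) // BYTES_PER_MESSAGE


def required_msgqsi_kb(messages_per_minute, reconn_minutes):
    """Smallest MSGQSI (in KBytes) that holds every message produced
    while the Master Agent spends RECONN minutes trying to reconnect."""
    messages = math.ceil(messages_per_minute * reconn_minutes)
    needed_kb = math.ceil(messages * BYTES_PER_MESSAGE / 1024)
    return max(MIN_MSGQSI_KB, needed_kb)


print(max_buffered_messages(1024))    # 678, as stated above
print(required_msgqsi_kb(100, 30))    # 4527
```

With an assumed rate of 100 messages per minute and RECONN = 30, the agents produce 3000 messages during the reconnection window, so MSGQSI would need to be at least 4527 KBytes to avoid flushing the cache.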

Action Induced Task
Changes in the parameters of the Master Agents Restart the Master Agent or
Update the Master Agent Configuration in case the Master Agent is paused

D) This area allows controlling the agents that depend on this Master Agent:

Type of Agent Agent
Description Agent description
Debug Create the log file (1) or not (0). (ADJUSTABLE)
Start Agent Yes (1) or No (0). (ADJUSTABLE)
Command line Additional command line in case it is needed by the agent. (ADJUSTABLE)

Applying the changes will start and/or stop the selected Agents of this Master Agent.

Action Induced Task
Changes in the Agents Configuration Restart the Master Agent or
Update the Master Agent Configuration in case the Master Agent is paused

Agents

osmius_infraestructure03.jpg

From this screen, accessed by clicking on the name of the Agent, we are able to do the following actions over it:

A) From the context menu:

1) Start a particular Agent.
2) Stop a particular Agent.

Action Induced Task
Start the Agent Restart the Master Agent on which it depends
Stop the Agent Restart the Master Agent on which it depends

B) This area allows displaying the following information about the Agent:

Master Internal code of the Master Agent. This code was automatically assigned when the Master Agent was deployed.
IP IP Address of the machine where the Master Agent controlling this Agent has been installed
Type Type of Agent
Description Agent description
Status Current status of the Agent (Started, Stopped)

C) This area allows the user to display and/or edit the following parameters of the Agent:

ERRCON Whether a failure to connect to the instance is reported as an error (ERRCON = 1) or as a critical event (ERRCON = 0). (ADJUSTABLE)
PORTCM Local port to receive the commands from the Master Agent. (ADJUSTABLE)
TIMOUT Waiting time (timeout) for network operations. (ADJUSTABLE. ASK THE ADMINISTRATOR)

Note: All these actions on the Agent can only be carried out if the Master Agent controlling this Agent is running and working correctly (green color).

Action Induced Task
Edit the parameters of the Agent Update the Configuration of the Agent

D) This area allows adding or removing the instances to be monitored by this agent, chosen from the instances of the same type as this agent that are available in the system (see Instance Management).

Action Induced Task
Modify the monitored instances by the Agent Update the Agent Configuration

Instances

osmius_instances01.jpg

This Instance Management screen displays the list of Master Agents monitoring each of the Instances. The procedure is as follows: once a user with the Administrator role has created and set up an instance (see Instance Management), the root user must assign the Agent or Master Agent that will monitor it. Note that, as a consequence, any change in the configuration of the Instance creates an internal task that updates the Agent configuration in each of the Master Agents assigned to the Instance.

osmius_instances03.jpg

To do this, the root user must periodically check the previous screen for Instances that are not monitored from any Master Agent (0 in red) and click on each of them to assign as many Master Agents as necessary. Although an instance is usually monitored from a single Master Agent, a particular instance may need to be monitored from several Master Agents; for example, we may need to monitor a web server both from the company intranet and from the internet. In that case, we could install an Agent in the DMZ and a second one inside the company intranet, and assign both to the instance associated with the web server.

Action Induced Task
Assign the Instance to one or several Master Agents Update the Agent Configuration of the same type as the instance in each of them

Tasks

osmius_infraestructure04.jpg

From this area of the infrastructure screen we are able to supervise and maintain the tasks generated by any of the functionalities of the infrastructure that we have seen so far. The screen shows the following information about the tasks:

Master Name of the Master Agent machine to execute the task
Agent Agent associated to the task (in case there is one)
Instance Instance associated to the task (in case there is one)
Type of Task Update Agent Configuration, Update Master Agent Configuration, Restart Master Agent, Stop or Pause Master Agent
Reattempts Number of reattempts carried out to execute the task
Status Pending, Processing, OK, error and not executed (a similar but more updated task was executed)
Execution date Date to update the task
Result Text returned by the Master Agent after running the corresponding task

Each of the tasks is remotely executed by the associated Master Agent. This screen allows us to consult the remaining tasks to be executed and/or those already executed as well as the value returned by the Master Agent in charge of executing it.

We can also reset the attempts counter of a particular task or change its status to pending so that the task manager tries to process the task again. We can even delete a task if it is preventing the execution of more recent tasks of the same type (with a later execution date).

The task manager tries to execute each task up to a maximum number of attempts (set at server installation). The first half of the attempts are performed without interruption; after that, the task manager waits a quarter of an hour between attempts, to allow time to fix the problems that are causing the failures (network outages, etc.).
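
The retry policy described above can be pictured with a short Python sketch. It is not the task manager's actual code: the function name is hypothetical, and rounding the "half" down for an odd maximum is an assumption, since the text does not say how that case is handled.

```python
from typing import List


def retry_delays(max_attempts: int, pause_minutes: int = 15) -> List[int]:
    """Minutes to wait before each attempt, in order.

    The first half of the attempts run back to back (0 minutes);
    every later attempt is preceded by a quarter-hour pause.
    """
    immediate = max_attempts // 2  # assumption: half is rounded down
    return [
        0 if attempt <= immediate else pause_minutes
        for attempt in range(1, max_attempts + 1)
    ]


print(retry_delays(6))       # [0, 0, 0, 15, 15, 15]
print(sum(retry_delays(6)))  # 45 minutes of waiting before the task is given up
```

With a maximum of 6 attempts, a failing task would therefore leave about 45 minutes to fix the underlying problem before its attempts run out, after which the counter has to be reset by hand as described above.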

New Agents

Right-clicking on “infrastructure” makes it possible to register a new type of agent developed after the Osmius installation, or to upgrade an existing Agent to a later version.

New Agent: From the official Osmius website you can download a zip file with all the files needed for a new agent. This screen allows you to upload the zip file for the agent you want to register; the agent will thereafter be available to be deployed in the corresponding Master Agent.

Agent Update: From the official Osmius website you can download a zip file with all the files needed to update an agent. This screen allows you to upload this zip file; the agent will then be automatically replaced in all the Master Agents that use it.

Initial Infrastructure

Osmius is installed in a basic configuration in which a Master Agent runs on the same machine as the Server. As many Master Agents as needed to monitor our systems can be added automatically to this basic setup.

osmius_master01.jpg Our initial Master Agent always has the internal code MASTER01 and, as can be seen in the image, in this case it has been deployed on a machine called PWPRT009 with IP address 192.168.3.48. This Master Agent controls the following Agents, to which instances of the same type can be assigned:

APACHE01 Agent of the Web Server of Apache
HTTPPOLL Agent HTTP
IPINST01 Agent IP
LINUX001 Agent Linux
LOG00001 Log Agent
MYSQL001 MySQL Database Agent
TOMCAT01 Agent of the Server of Tomcat Applications

Status of the Infrastructure

Profiles

Any user with enough authorizations over the instances which form the OsmiusSV service could supervise its good performance. In the initial setup that is provided with Osmius, the instances which form this service are in the OsmGroup group of instances.

OsmiusSV Service

osmius_osmiussv.jpg

The OsmiusSV service is composed of a series of instances that monitor the status of the entire system that makes up the Osmius installation, from its own infrastructure of Agents, Master Agents and Tasks to the entire software architecture that sustains the system. As we know, the monitoring of each instance produces a series of events that affect the availability and criticality of the instance and of the associated services (see Events).

Summarizing, our OsmiusSV service and our instances can be in the following statuses:

Color Status
OK OK Status
Warning “Warning” Status
Critical Critical Status

OsmiusAG Instance

The OsmiusAG instance (“Represents Osmius agents and master agents”) is a special instance whose monitoring produces four types of events:

Event Description Severity Explanation
MAHEALTH Master Agent Health Error The Master Agent associated with the event is not working correctly.
AGENTE 5) Agent Health Error The Agent of the Master Agent associated with the event is not working correctly.
TKHEALTH Tasks Manager Health Warning There are tasks that have been attempted more than half of the allowed number of times.
NOTIFAIL Notifications not sent Warning There are notifications that cannot be sent to any of the subscribed users.

Other Instances

The instances responsible for monitoring the entire software architecture that sustains Osmius are:

Instance Description Explanation
OSM_Sql Osmius Main Database It monitors the MySQL database
OSM_Log Osmius Tomcat Server It monitors the Tomcat Server Log
OSM_Host Osmius Central Host It monitors the Linux machine
OSM_Net Osmius NetWork Services It monitors the network
OSM_TomC Osmius TomCat Instance It monitors the Tomcat Server
OSM_Http Osmius Http Instance It monitors the Tomcat Server

Note: All these instances are being monitored, according to the initial setup provided, from the Master Agent MASTER01 but they can also be assigned to other new Master Agents so they are monitored from them if wanted.

Server Processes

From this screen we can control all the processes running in the Osmius Central Server.

 Osmius Server Processes

For each process we can:

  • Start/Restart the process.
  • Stop the process.
  • Save changes to its parameters.

Note that changing any parameter restarts all the processes affected by the change.

Example 1

We have installed a new machine in our systems and we want to monitor its status at all times. This machine is running Windows.

In order to achieve this we have to:

  1. Deploy a new Master Agent in the machine we want to monitor.
  2. Set up this Master Agent so that it is able to monitor the Windows OS of the machine where it is running.

We also want to monitor at all times the availability of our corporate web page from the internet.

In order to achieve this, and considering that our machine is located in the DMZ of our systems, we will have to:

  1. Create a new instance of type Apache.
  2. Assign the new Master Agent to this Instance. This starts the monitoring process.

Deployment of a Master Agent

We must install the Master Agent using the BitRock installer on the Windows machine that we want to monitor. We will have to provide various parameters, some relating to the Osmius Central Server installation and others specific to each Master Agent.

osmius_despliegue00.jpg

Main Osmius Server IP IP of Osmius Central Server
Master IP IP of the Windows machine



osmius_despliegue01.jpg

Server Message Port Server Port to receive events from master agents
Task Manager Port Server Port to receive commands from Master Agents
Server Master CMD Port Master Port to receive commands from the Server
Server Master BKG Port Master Port to receive back commands from the Server



After this, the control panel will display the new Master Agent with its hostname (in this case, windows01) and all the data that we have sent it in the setup file from the windows machine, as well as an internal Master Agent code generated by the system (MA000003 in this example).

osmius_despliegue02.jpg

We will also see a configuration update task for each of the Agents that this version of the Master Agent can control, plus a configuration update task for the Master Agent itself. These tasks send the initial setups of the Agents and the updated setup of the Master Agent.

osmius_despliegue03.jpg

The Task Manager will process each of them, sending them to the Master Agent, which will execute the corresponding actions. Once they are processed, the Master Agent is ready to be set up.

Configuring a Master Agent

We click on the name of the new Master Agent and set it up so that it starts its Windows Agent, by ticking the “start” checkbox (and the “Debug” one too if we want the Agent to leave a trail in a log file on the machine where it is running), as shown in the figure: osmius_despliegue04.jpg

This generates an update task for the Master Agent windows01 which, when processed, will start its Windows Agent.

osmius_despliegue07.jpg

Assigning an Instance

osmius_despliegue05.jpg

We create and set up an instance of Windows type (see Create Instances), capable of monitoring the OS of the new machine, and we assign it to the new Master Agent as shown in the figure:

osmius_despliegue06.jpg

Note: The instance can be created by a user with an administrator profile, and this is what normally happens, but only the root user can assign it to the Master Agents in the infrastructure that are able to, and have to, monitor it. This assignment generates a Windows Agent update task for the Master Agent windows01 which, when processed, makes its Windows Agent start monitoring the new instance. This works because we started that Agent in the previous step; if we had not done so yet, we could do it now, as the order of the actions does not affect the process.

osmius_despliegue08.jpg

Configuring an Agent

For the second scenario, we create a new instance of type Apache and we assign it to the new Master Agent. This time we do it changing the setup of the Apache Agent directly, ticking the checkbox of the new instance as shown in the figure:

osmius_despliegue09.jpg

This action will also produce, like the previous one, a configuration update task of the Apache Agent for the windows01 Master Agent.

Checking the Status

If we check the status of the infrastructure, we can see that the windows01 Master Agent is running correctly, as it is green, and so is its Windows Agent. However, its Apache Agent is grey. Why? If we go over all the steps, we can see that we have not started the Apache Agent; we will have to do so if we want it to start monitoring (see Agents Setup).

If we take a quick look at the events, the OsmiusSV Service and its instances are green, and if we look at the event history of the OsmiusAG instance we can verify that the events are being received, that all the Master Agents and their respective Agents are running correctly, and that all the tasks are being processed normally.

Example 2

In our systems we have installed a new machine that we want to monitor, but it is located on a network where it can not communicate with the network where the central server is. However we have one Master Agent running on a machine that can “talk” with both networks.

To achieve this we must:

  1. Configure the existing Master Agent as a Master Agent Proxy.
  2. Deploy the new Master Agent through this Master Agent Proxy.

Configuring a Proxy Master Agent

We click on the existing Master Agent, installed at the IP address 192.168.3.2 and with IP address 172.58.3.1 on the other network, and we configure it to behave as a Master Agent Proxy.

For this we set the parameters:

PORTAG = 1950 –> The port where this Master Agent is already receiving the events from the various agents that depend on it, and where it will also receive new events from the Master Agents that will use it as a Proxy.

MPROXY = 1 –> Master Agent Proxy: yes

IPGTWY = 172.58.3.1 –> IP address on the network where we will deploy the new Master Agent.

PORGTW = 1951 –> Port where it will receive the request for deployment of the new Master Agent.

This creates a task for updating the Master Agent parameters. It will restart as a Master Agent Proxy (Pxy in the console screen).
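
Before applying the four parameters above, it can help to sanity-check them. The following Python sketch is only an illustration: the dictionary layout and the `check_proxy_params` function are assumptions made for this example and do not correspond to any Osmius file format or API; in practice the values are entered in the control panel.

```python
import ipaddress

# Values taken from the example above.
proxy_params = {
    "PORTAG": 1950,          # events from dependent Agents and proxied Master Agents
    "MPROXY": 1,             # 1 = behave as a Master Agent Proxy
    "IPGTWY": "172.58.3.1",  # address on the network of the new Master Agent
    "PORGTW": 1951,          # deployment requests from new Master Agents
}


def check_proxy_params(params):
    """Return a list of problems found in the proxy parameters (empty if none)."""
    problems = []
    if params.get("MPROXY") != 1:
        problems.append("MPROXY must be 1 for a Master Agent Proxy")
    try:
        ipaddress.ip_address(params.get("IPGTWY", ""))
    except ValueError:
        problems.append("IPGTWY is not a valid IP address")
    for name in ("PORTAG", "PORGTW"):
        port = params.get(name)
        if not isinstance(port, int) or not 1 <= port <= 65535:
            problems.append(name + " is not a valid port")
    return problems


print(check_proxy_params(proxy_params))  # []
```

The check only covers the form of the values (flag, address syntax, port range); whether the address is actually reachable from the new machine still has to be verified on the network itself.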

osmius_despliegue19.jpg

Deployment of a Master Agent through a Proxy Master Agent

We must install the Master Agent using the BitRock installer on the new machine that we want to monitor. We will have to provide various parameters, some relating to the Osmius Central Server installation and others specific to each Master Agent. In this case, for this Master Agent, the Central Server is the Master Agent Proxy that we configured before.

osmius_despliegue13.jpg

Main Osmius Server IP IP of the Master Agent Proxy
Master IP IP of this Master Agent



osmius_despliegue14.jpg

Server Message Port Master Agent Proxy Port to receive events
Task Manager Port Master Agent Proxy Port to receive commands
Server Master CMD Port Master Agent Port to receive commands from the Server
Server Master BKG Port Master Agent Port to receive back commands from the Server



Immediately we can see in the console screen the new Master Agent with the proxy parameters corresponding to the Master Agent Proxy.

osmius_despliegue17.jpg

And the server parameters are also set to the Master Agent Proxy address.

osmius_despliegue18.jpg

1) This can happen due to an error (it has stopped or started without a direct order from the control panel) or because it has pending tasks to execute.
2) This can happen due to an error (it has been stopped or started without a direct order from the control panel) or because the status of the linked Master Agent is not the expected one.
3) , 4) A Task for each of the Agents linked to that Master Agent
5) A different event for each type of Agent, with the same name as the Agent: IPINST01, LINUX001, WINDOWS1, etc.
 
en/usuario/infraestructura/funcionalidad.txt · Last modified: 2012/12/04 09:03 by joseangel.chico
 