HP OVO Forum | Monitoring Solution

OVO Tutorials and Troubleshooting Guide
Hello all,

Here you will find troubleshooting steps for various OVO agent-to-server communication problems.

PREREQUISITES

Admin/root access to the managed node

OK, let's start:

Work through the following checks in order to determine the likely cause of the communication issue. Lines that begin with "OVO> " are commands to run on the management server. Lines that begin with "Node> " are commands to run on the managed node.
1. Node is in DNS
OVO> nslookup <node name>
Should return the expected IP address of the node
OVO> nslookup <agent IP>
Should return the fully qualified node name

Solution:

Ask the DNS administrators to add the node to the appropriate DNS domain. Both the forward (A) record and the reverse (PTR) record must be added.
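
If you want to script this check, here is a minimal Python sketch of the same forward-plus-reverse comparison. Note the assumptions: it uses the system resolver through `socket` (so it also consults the hosts file, unlike nslookup), and `check_dns` is just a name I made up for illustration.

```python
import socket


def check_dns(node_name, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems; an empty list means forward and reverse lookups agree."""
    try:
        ip = resolve(node_name)  # forward lookup (A record)
    except OSError as exc:
        return [f"forward lookup failed for {node_name}: {exc}"]
    try:
        fqdn, aliases, _addresses = reverse(ip)  # reverse lookup (PTR record)
    except OSError as exc:
        return [f"reverse lookup failed for {ip}: {exc}"]

    names = {name.lower().rstrip(".") for name in [fqdn, *aliases]}
    wanted = node_name.lower().rstrip(".")
    # Accept an exact match, or a short name that matches the first label of a returned name.
    if wanted in names or any(name.split(".")[0] == wanted for name in names):
        return []
    return [f"{ip} resolves back to {sorted(names)}, not {node_name}"]
```

Calling `check_dns("<node name>")` on the management server returns an empty list when both lookups are consistent. The two lookup functions are parameters only so that the logic can be exercised without a live DNS.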

2. Node can resolve OVO server
Node> nslookup <mgmt server>
Should return the IP address of the OVO server

Solution:

The node probably has an incorrect DNS server configured, one that cannot resolve the management server name. Ask the owner of the node to configure the correct DNS server.

3. Agent is installed and running on node
Node> ovc -status
Should return the status of all sub-agent processes as “running”. There should be no warning messages e.g. “xxx component is buffering” or “agent cannot be accessed remotely”.

Solution:

If the ovc command fails to run, the agent is not installed. Raise a change request to install the agent package.
If the agent reports “buffering for management server” or “not accessible remotely”, stop the agent, clear the temporary files and restart the agent:
Node> ovc -kill

For Windows, remove all files under:
C:\Program Files\HP OpenView\data\tmp\OpC
C:\Program Files\HP OpenView\data\tmp\public\OpC

Node> ovc -start
Node> ovc -status
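
The file cleanup between the kill and the start can be scripted. Below is a rough Python sketch for the Windows case, using the two paths listed above. I am assuming "remove all files" means deleting files recursively while leaving the directories in place. `clear_tmp_files` is a made-up helper name, and it defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path

# The two Windows temp directories named in this step.
OPC_TMP_DIRS = (
    Path(r"C:\Program Files\HP OpenView\data\tmp\OpC"),
    Path(r"C:\Program Files\HP OpenView\data\tmp\public\OpC"),
)


def clear_tmp_files(directories=OPC_TMP_DIRS, dry_run=True):
    """Delete every file under the given directories and return the affected paths.

    Directories themselves are kept (an assumption). With dry_run=True the
    files are only listed, not deleted.
    """
    affected = []
    for directory in directories:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # skip paths that do not exist on this node
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                affected.append(path)
                if not dry_run:
                    path.unlink()
    return affected
```

Run it only while the agent is stopped: first `clear_tmp_files()` to review the list, then `clear_tmp_files(dry_run=False)` to actually delete.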

4. Node is installed in OVO
From the Motif GUI, press CTRL-F and search for the node. It should exist in one layout group and one or more node groups.

Solution: Add the node in OVO.

5. Agent coreid matches OVO recorded coreid
Node> ovcoreid
OVO> opcnode -list_id node_list=<node name>
Both coreid values should match

Solution:

If the agent requires a new unique coreid:
Node> ovcoreid -create -force

If the coreid was, or is now, different from the id recorded on the OVO server, the two must be aligned:
Node> ovcoreid
Note this string
OVO> opcnode -chg_id node_name=<node name> id=<coreid from node>
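
Comparing two long id strings by eye is error-prone, so here is a small Python sketch for it. It assumes the coreid is a UUID-style string and that you have already copied the id value out of each command's output. The function names are mine, not part of OVO.

```python
import uuid


def normalize_coreid(text):
    """Strip whitespace and normalize case; fall back to plain lowercase if not a UUID."""
    value = text.strip()
    try:
        return str(uuid.UUID(value))
    except ValueError:
        return value.lower()


def coreids_match(node_coreid, server_coreid):
    """True when the id reported by the node equals the id recorded on the server."""
    return normalize_coreid(node_coreid) == normalize_coreid(server_coreid)
```

For example, paste the output of `ovcoreid` from the node as the first argument and the id shown by `opcnode -list_id` as the second. A result of `False` means the ids need to be aligned as described above.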



6. Agent has correct configuration variables
Node> ovconfget
The “MANAGER”, “MANAGER_ID” and “CERTIFICATE_SERVER” variables should all reference the correct management server

Solution:

Assign the correct variables to the agent:
OVO> ovcoreid -ovrg server
Note this string
Node> ovconfchg -ns sec.cm.client -set CERTIFICATE_SERVER <mgmt server>
Node> ovconfchg -ns sec.core.auth -set MANAGER <mgmt server>
Node> ovconfchg -ns sec.core.auth -set MANAGER_ID <mgmt server coreid>
Node> ovc -kill
Node> ovc -start
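
To check the three variables without scanning the whole ovconfget listing by eye, something like the following Python sketch may help. It assumes the output is INI-like, with `[namespace]` header lines followed by `KEY=value` lines. Verify that against your agent version. The namespaces are the ones used in the ovconfchg commands above, and the function names are hypothetical.

```python
def parse_ovconfget(text):
    """Parse INI-like text (assumed ovconfget format) into {namespace: {key: value}}."""
    settings = {}
    namespace = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            namespace = line[1:-1]
            settings.setdefault(namespace, {})
        elif "=" in line and namespace is not None:
            key, _, value = line.partition("=")
            settings[namespace][key.strip()] = value.strip()
    return settings


def check_manager_settings(settings, mgmt_server, mgmt_coreid):
    """Return a list of variables that do not reference the expected management server."""
    expected = {
        ("sec.cm.client", "CERTIFICATE_SERVER"): mgmt_server,
        ("sec.core.auth", "MANAGER"): mgmt_server,
        ("sec.core.auth", "MANAGER_ID"): mgmt_coreid,
    }
    problems = []
    for (namespace, key), wanted in expected.items():
        found = settings.get(namespace, {}).get(key)
        if found is None or found.lower() != wanted.lower():
            problems.append(f"{namespace}:{key} is {found!r}, expected {wanted!r}")
    return problems
```

Feed it the saved output of `ovconfget` plus the server name and the coreid noted from `ovcoreid -ovrg server`. An empty list means all three variables point at the expected server.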


7. Agent has all required certificates
Node> ovcert -list
Should show the agent certificate with an asterisk next to it. Should also show all mgmt server trusted certificates.

Solution: I have not given the solution here, as there are too many steps involved.

8. Agent can poll OVO over HTTPS
Node> bbcutil -ping <mgmt server>
Should indicate successful HTTPS poll of the management server

Solution:

A failure here is unlikely if the previous checks have succeeded. It would indicate an issue with the agent on the management server itself, or a temporary network issue. Check the management server's own agent using this same procedure.


9. OVO can poll agent over HTTPS
OVO> bbcutil -ping <node name>
Should indicate successful HTTPS poll of the agent

Solution:

This usually indicates a certificate issue. Recheck that all required certificates are installed on the agent.

10. OVO can do a remote status check of agent
OVO> opcragt -status <node name>
Should return the status of all sub-agent processes as “running”


Solution:

A failure here is unlikely if the previous checks have succeeded. Recheck the config variables (step 6) and try restarting the agent as per step 3.

11. OVO can do a remote policy listing of agent
OVO> ovpolicy -list -level 4 -ovrg server -host <node name>
Should show detailed information about all policies deployed to the agent. Each policy should show an “owner” attribute referencing the correct management server.

Solution:
Policies may not be deployed. Double check by looking from the agent:
Node> ovpolicy -list -level 4

If no policies are listed, deploy them from the management server. Highlight the node in the node bank and select Actions > Agents > Install/update software, select “templates”, “actions”, “monitors” and “commands”, and click OK. Alternatively, deploy from the command line:
OVO> opcragt -distrib -templates -actions -monitors -commands <node name>



12. Agent can send test message
Node> opcmsg a=a o=o msg_t="test"
Should result in a normal test message appearing in the Java GUI

Solution:

Ensure the agent has a generic opcmsg policy deployed to it as per step 11. Without any opcmsg policies, the opcmsg command will not generate an alarm.
Ensure you are logged into the Java browser as an operator with permissions to view all nodes. Find the node in the node pane of the browser. If it is not there, the node is likely not in any node group. Add the node to a node group, reload the Java browser and try the test message again.
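
Since each of the twelve checks assumes the earlier ones have passed, it helps to stop at the first failure and fix that before moving on. Here is a minimal Python sketch of that idea. The check functions are placeholders you would supply yourself (for example wrappers around the commands above), and `first_failure` is just an illustrative name.

```python
def first_failure(checks):
    """Run (title, check) pairs in order; return (step number, title) of the first failure.

    Each check is a zero-argument callable returning True on success.
    Returns None when every check passes.
    """
    for number, (title, check) in enumerate(checks, start=1):
        if not check():
            return number, title
    return None


# Placeholder checks for illustration only; real ones would run the commands above.
example_checks = [
    ("Node is in DNS", lambda: True),
    ("Node can resolve OVO server", lambda: True),
    ("Agent is installed and running on node", lambda: False),
    ("Node is installed in OVO", lambda: True),
]
```

With the placeholder list above, `first_failure(example_checks)` reports step 3, and step 4 is never evaluated.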