Hi
We have threshold conditions that send alerts if a pathmon process is not active within the defined monitoring schedule. Recently the primary of one of the pathmon processes failed (the backup was still running) and Prognosis sent an alert as expected. The backup process quickly took over the load, but Prognosis continued sending alerts that the process was not active. The threshold conditions had to be restarted to stop the alert messages. Is this normal behaviour? Is it possible for a threshold to check that the backup process is active in addition to the primary?
Thanks
Hi There,
When the primary and backup processes have different process names:
Prognosis can alert on 2 (or more) processes all down in an "application" on NonStop. This requires the Availability Reporting license module ('AV' in the license file). Here is the config for an application with 2 processes $ABC1, $ABC2:
1. in UPDOWN config:
ADD PROCESS ( $ABC1, $ABC2 )
2. in AVAILABILITY config:
APPLICATION ADDTANDEMENTITY (Application-ABC,$ABC1,DE)
APPLICATION ADDTANDEMENTITY (Application-ABC,$ABC2,DE)
3. a threshold condition on MpAvailabilityOutage record with where clause of:
ID="Application-ABC" AND UP = EMPTY
The Online Help has more about grouping processes and other entities into Availability "applications".
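The grouping rule described above (alert only when every process in the application is down) can be sketched in a few lines of Python. This is an illustrative model only, not Prognosis code: the function name `application_is_down` and the dictionary layout are assumptions made for the example.

```python
# Hypothetical model of the "all members down" rule; not Prognosis code.
def application_is_down(member_states):
    """member_states maps a process name to True when that process is up.

    The application counts as down only when no member is up.
    """
    return not any(member_states.values())


# One member still up: the application is not treated as down.
print(application_is_down({"$ABC1": False, "$ABC2": True}))
# Every member down: the application is treated as down.
print(application_is_down({"$ABC1": False, "$ABC2": False}))
```

With this rule a single failed member does not raise an alert; only the loss of both $ABC1 and $ABC2 does.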
HTH
Are you referring to a pathmon process, which is a NonStop process-pair?
i.e.
status $FPMN
System \TCNS
Process Pri PFR %WT Userid Program file Hometerm
$FPMN 1,810 135 005 106,240 $SYSTEM.SYSTEM.PATHMON $VHSPF
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
$FPMN B 2,631 135 001 106,240 $SYSTEM.SYSTEM.PATHMON $VHSPF
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
If so, the backup process has the same name. How does one monitor these processes with the same name separately?
Hi
When the primary and backup processes have the same process name, as in a NonStop process-pair:
Yes, Prognosis can alert on that. This requires the Availability Reporting license module ('AV' in the license file). Here is the config for process name $FPMN:
1. in UPDOWN config:
ADD PROCESS ( $FPMN )
2. a threshold condition on NonStopUpDown record with where clause of:
DEVNAME="$FPMN" AND CURSTATE = "DN"
HTH
Note that your Prognosis threshold kept sending alerts after the backup took over because it alerts on "process PID AND creation time does not exist": the NonStopJob record has key fields of PID and process creation time.
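A small Python sketch of the difference between the two kinds of check, for illustration only. It is not Prognosis code: the `ProcessRow` type, both function names and the creation-time values (`t0`, `t1`, `t2`) are hypothetical; the PIDs are the ones from the STATUS output posted in this thread.

```python
from typing import NamedTuple


class ProcessRow(NamedTuple):
    name: str     # process name, e.g. "$FPMN"
    pid: str      # "cpu,pin" as shown by the TACL STATUS command
    created: str  # hypothetical creation-time value


def alert_by_identity(watched, rows):
    """Alert while the exact (PID, creation time) instance is missing."""
    present = {(r.pid, r.created) for r in rows}
    return (watched.pid, watched.created) not in present


def alert_by_name(name, rows):
    """Alert only when no process with this name exists at all."""
    return all(r.name != name for r in rows)


# The instance the threshold was originally watching (the old primary).
watched = ProcessRow("$FPMN", "2,1149", "t0")

# After takeover: the old backup is now primary and a new backup exists.
after_takeover = [
    ProcessRow("$FPMN", "1,1563", "t1"),
    ProcessRow("$FPMN", "2,1308", "t2"),
]

print(alert_by_identity(watched, after_takeover))  # keeps alerting
print(alert_by_name("$FPMN", after_takeover))      # stays quiet
```

The identity check never clears because the original PID and creation time do not come back, which matches the need to restart the threshold.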
HTH
Hi
Thanks for the reply. I forgot to mention that both the primary and the backup processes have the same name. Is there a solution for this situation? Does Prognosis monitor both processes or only the primary process?
Thanks
Hi,
That's right. Yes, the solution above for when both the primary and the backup processes have the same process name should work. I tested it in the lab by stopping a process and starting it up again with the same name (different PID), and confirmed that it worked.
HTH
I tried to recreate the original issue with a pathmon process on our system.
When I start this threshold I get the following error.
What am I doing wrong?
That answers why we were getting the alerts even after the backup process took over.
Thanks
After I sent my last post, I discovered that I was including a field substitution in the message Destinations tab.
I removed the substitution and was able to start the threshold.
But I was not able to get it to trigger.
I stopped the PID of the primary process, and the threshold did not trigger.
Still curious what I am missing.
1> status $fpmn
System \LCNS
Process Pri PFR %WT Userid Program file Hometerm
$FPMN 2,1149 135 005 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
$FPMN B 1,1563 135 001 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
2> stop 2,1149
3> status $fpmn
System \LCNS
Process Pri PFR %WT Userid Program file Hometerm
$FPMN 1,1563 135 005 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
$FPMN B 2,1308 135 001 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
Current Extended Swap File Name: $SYSTEM.#0
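One way to read the two STATUS outputs above, offered as an assumption rather than a confirmed diagnosis: the name $FPMN is present both before and after the stop, and only the set of PIDs changes. If the UPDOWN check follows the process name, that could be why it did not trigger. The Python sketch below parses trimmed copies of the output; `parse_status` is a hypothetical helper written for this example, not a Prognosis or TACL facility.

```python
def parse_status(text):
    """Return (name, role, pid) for each process line of STATUS output."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].startswith("$"):
            continue  # skip header and swap-file lines
        if len(tokens) > 2 and tokens[1] == "B":
            rows.append((tokens[0], "backup", tokens[2]))
        elif len(tokens) > 1:
            rows.append((tokens[0], "primary", tokens[1]))
    return rows


BEFORE = """\
$FPMN 2,1149 135 005 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
$FPMN B 1,1563 135 001 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
"""

AFTER = """\
$FPMN 1,1563 135 005 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
Swap File Name: $SYSTEM.#0
$FPMN B 2,1308 135 001 255,255 $SYSTEM.SYSTEM.PATHMON $ZHOME
"""

names_before = {name for name, _, _ in parse_status(BEFORE)}
names_after = {name for name, _, _ in parse_status(AFTER)}
pids_before = {pid for _, _, pid in parse_status(BEFORE)}
pids_after = {pid for _, _, pid in parse_status(AFTER)}

print("name still present:", names_before == names_after)
print("PIDs that disappeared:", sorted(pids_before - pids_after))
print("PIDs that appeared:", sorted(pids_after - pids_before))
```

Stopping the whole process-pair, rather than one PID, would be the case where the name itself goes away.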