Fail Over Service

This is an example Config that uses SNMP to set up fail-over handling, and that can be included in existing Configs. It is not to be confused with the FOS feature of AMC v2.

When you unzip the enclosed package, you will need to open the Config(s) -- "FOS.xml" is the main one, but there are two test client Configs as well -- and then edit the External Properties (ExtProps) to point to the correct filepath for the "FOS.properties" file. Both the FOS Host and Clients need these property values. You also have to correct some other filepaths in the ExtProps (explained below).

Also, I am an SNMP newbie. It may be that I am misusing one or more parameters when I make the script call to send the SNMP trap. The solution still works, but if you see any unnecessary or incorrect coding here, PLEASE LET ME KNOW. [Thanks -Eddie].

The files include:

  • 3 x GIFs used by the fosWebMonitor AL (simple web server)
  • FOS.properties - External properties.
  • FOS_Execute.bat - Batch file that is set up to run if an AL dies. Note that you configure this in the ExtProps.
  • TestClient.xml - test client Config #1
  • TestClient2.xml - test client Config #2

You use FOS by starting a TDI Server and pointing it at FOS.xml:

    ibmdisrv -c examples/_merglets/fos_6.0/fos.xml %1 %2 %3 %4 %5 %6 %7 %8 %9

You can see that I have an FOS_6.0 sub-dir, under examples/_merglets in my TDI solutions directory. You'll adjust this to fit your needs.

Then you drop the fosClientHeartbeat FC into the ALs you want to monitor (as shown in the test client Configs). You do this by creating a new Include (XML type) that points to the FOS.xml Config. Then either create a Library FC that inherits from the included fosClientHeartbeat component, or add the component directly to your ALs.

You will also have to set up Cloudscape for networked mode operation. This is done by editing the solution.properties file in your TDI solution-directory (sol-dir), commenting out the embedded mode lines and uncommenting the networked mode lines. Check out the Cloudscape/System Store cookbook for more info on this.

The ALs included in FOS.xml are:

  • fosWebMonitor - web-based monitor (very simple) for FOS. It serves port 80, so browse to "http://localhost" to test. I was planning to go for XML/XSLT, but chickened out and did it simply by scripting the HTML.
  • fosHostHeartbeatListener - Heartbeat listener process
  • fosHostCheckForTimeouts - Checks Heartbeat Journal for deadbeats, and then calls fosHostHandleDeadClient as needed
  • fosHostHandleDeadClient - AL to handle dead clients (prescribed actions as defined in ExtProps).

And here is a description of the ExtProps:

All props that start with FOS_Client are specific to the client. They apply to all ALs in this Config, but can be tied to a specific AL by appending "@" and the AL name (as shown below - e.g. FOS_ClientAction@A_test )
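The override rule described above can be sketched in a few lines of Python. This is only an illustration of the lookup order (AL-specific key first, then the general key), not how TDI itself loads ExtProps. The function names parse_props and resolve_prop are hypothetical, and the sketch assumes simple "key:value" lines where only the first colon separates key from value.

```python
def parse_props(text):
    """Parse 'key:value' lines; the value may itself contain colons."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a separator
        key, value = line.split(":", 1)  # split on the first colon only
        props[key.strip()] = value.strip()
    return props


def resolve_prop(props, name, al_name=None):
    """Return the value of 'name@al_name' if it exists, else the general 'name'."""
    if al_name is not None:
        specific = f"{name}@{al_name}"
        if specific in props:
            return props[specific]
    return props.get(name)


sample = """\
FOS_ClientPulse:4
FOS_ClientPulse@A_test:2
FOS_ClientAction:mailto:edbird@mac.com,execute:"eddie"
"""
props = parse_props(sample)
print(resolve_prop(props, "FOS_ClientPulse", "A_test"))    # prints 2 (AL-specific)
print(resolve_prop(props, "FOS_ClientPulse", "Other_AL"))  # prints 4 (general fallback)
```

Splitting on the first colon only matters here, because values such as the FOS_ClientAction list contain colons of their own.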

Log heartbeat messages to client log
FOS_ClientLogHeartbeats:true

Send heartbeat at AL initialization. This has been disabled, as it caused AL startup problems once in a while for me.
FOS_ClientHeartbeatAtInit:true

Send a heartbeat when AL terminates normally.
FOS_ClientCloseAtShutdown:true

Comma-separated list of Client actions. See the Info tab of the fosClientHeartbeat FC for more details. You have two types of actions:
  • mailto: sends the messages defined in the FOS_ClientMessage ExtProp to this address
  • execute: shells out and executes the FOS_ExecuteCommand batch-file/script with the value specified below as parameters
FOS_ClientAction:mailto:edbird@mac.com,execute:"eddie"

Action tied specifically to A_test AL. This overrides the previous action ExtProp.
FOS_ClientAction@A_test:mailto:edbird@mac.com,execute:"A_test60"
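To make the action format concrete, here is a rough Python sketch that splits such a value into (type, argument) pairs. It is not the code used inside the fosClientHeartbeat FC; parse_actions is a hypothetical name, and the sketch assumes that arguments never contain commas and that surrounding double quotes are just delimiters.

```python
def parse_actions(value):
    """Split 'mailto:addr,execute:"arg"' into a list of (type, argument) tuples."""
    actions = []
    for item in value.split(","):
        # Everything before the first colon is the action type,
        # everything after it is the argument.
        kind, _, arg = item.strip().partition(":")
        actions.append((kind, arg.strip('"')))  # drop surrounding quotes
    return actions


print(parse_actions('mailto:edbird@mac.com,execute:"A_test60"'))
# prints [('mailto', 'edbird@mac.com'), ('execute', 'A_test60')]
```

A real implementation would then send the mail or shell out to FOS_ExecuteCommand depending on the action type.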

Message(s) to send by the mailto: action
FOS_ClientMessage:Eddie is now out of action,Call out the guards!
FOS_ClientMessage@A_test:Special message for A_test60

Name of (all) clients. Since this is not tied to an AL, it serves as a user-specified name for the TDI Server running this Config
FOS_ClientName:eddie
FOS_ClientName@A_test:My Test AL

Client pulse (how often the FC should send a heartbeat). A_test AL has its own pulse.
FOS_ClientPulse:4
FOS_ClientPulse@A_test:2

Timeout before client declared dead by FOS Host
FOS_ClientTimeout@Another_test:20
FOS_ClientTimeout:15
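The interplay between pulse and timeout can be illustrated with a small Python sketch of the kind of check fosHostCheckForTimeouts performs. This is an assumption-laden illustration, not the AL's actual logic: the function name find_dead_clients and the dictionary layout are hypothetical, and the sketch assumes that timeouts and timestamps are both in seconds (the units are not stated here).

```python
def find_dead_clients(last_heartbeat, timeouts, default_timeout, now):
    """Return the sorted names of clients whose last heartbeat is older than their timeout.

    last_heartbeat: dict of client name -> time of last heartbeat (assumed seconds)
    timeouts: dict of client name -> client-specific timeout (like FOS_ClientTimeout@name)
    default_timeout: general timeout (like FOS_ClientTimeout)
    """
    dead = []
    for client, last_seen in last_heartbeat.items():
        timeout = timeouts.get(client, default_timeout)
        if now - last_seen > timeout:
            dead.append(client)
    return sorted(dead)


last_heartbeat = {"eddie": 100.0, "Another_test": 100.0}
timeouts = {"Another_test": 20}
print(find_dead_clients(last_heartbeat, timeouts, default_timeout=15, now=118.0))
# prints ['eddie']: 18 seconds of silence exceeds 15 but not 20
```

Whatever the real units are, the timeout should be comfortably larger than the pulse, so that a single late heartbeat does not get a client declared dead.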

Command to execute (needed quotes here for Windows) on deceased client.
FOS_ExecuteCommand:"C:/Documents and Settings/NO010196/My Documents/TDI/examples/_Merglets/FOS_6.0/FOS_Execute.bat"

Name of Heartbeat Journal db in the System Store (Cloudscape), plus auth params.
FOS_HeartbeatJournalName:HeartbeatJournal
FOS_LoginId:APP
FOS_LoginPwd:APP

SNMP OID for heartbeats, port and Host URL (Clients and FOS Host need these)
FOS_HeartbeatOID:1.1.1.1.1.1.1.3.0
FOS_HeartbeatPort:3609
FOS_HostURL:localhost

Location for HTML files (fosWebMonitor) and GIF images
FOS_HTMLPath:C:/Documents and Settings/NO010196/My Documents/TDI/examples/_Merglets/FOS_6.0/

How often the fosWebMonitor refreshes the browser.
FOS_WebClientRefresh:3

That should about do it.

NOTE: If you get this error the first time you start FOS, or whenever you've deleted the Cloudscape sub-directory:

com.ibm.db2.jcc.a.SQLException: Table/View 'HEARTBEATJOURNAL' already exists in Schema 'APP'.

It's because all the ALs are trying to create the table at the same time. One way around this is to add system.sleep(nSeconds) to the Prolog - Before Init Hook of two of the ALs. Once you get this error, you need to stop and restart the Server running FOS.xml, since one of the processes was unable to initialize.

You could also have three AutoStart ALs, each with an EternalIterator and an AL FC that launches one of the three worker ALs, so that when a worker AL stops, it is started again on the next cycle. Be sure to use the AL FC run mode "Run and wait for result".
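Both workarounds above either avoid the race or recover from it. A different pattern, which is not part of the FOS Config and is shown here only as an idea, is to treat "table already exists" as a harmless outcome when creating the table. The Python sketch below uses the standard sqlite3 module as a stand-in for Cloudscape, so the exception type, the error text and the column names are all assumptions that would differ in a real TDI/Cloudscape setup.

```python
import sqlite3


def ensure_table(conn):
    """Create the journal table; return True if created, False if it already existed."""
    try:
        conn.execute(
            "CREATE TABLE HeartbeatJournal (client TEXT PRIMARY KEY, last_seen REAL)"
        )
        return True
    except sqlite3.OperationalError as err:
        # Someone else created the table first: that is fine, carry on.
        if "already exists" in str(err):
            return False
        raise  # any other database error is a real problem


conn = sqlite3.connect(":memory:")
print(ensure_table(conn))  # prints True (table created)
print(ensure_table(conn))  # prints False (table was already there)
```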

Here is the zip file itself. The .doc file is out-of-date, although the message format information is correct. I hope to get some time to address this. Don't let it confuse you -- the stuff written here takes precedence.

Note also that a simpler approach can be found in the Real World DI site.

-- EddieHartman - 19 Mar 2006
Topic revision: r3 - 16 Jun 2006, AndrewFindlay
