QSystem Monitor gives you the ability to track the remaining lifetime of your disk cache batteries, helping you to prevent costly system downtime and system slowdowns.
The Importance of Cache Battery Maintenance
Most Power Systems disk controllers employ battery-backed cache. The cache’s function is to speed up I/O operations. The battery preserves data that has not yet been written to disk if a power outage occurs.
The problem with battery-backed cache is that each battery has a limited life. If one or more batteries fail, you risk losing data, or losing the system, during production hours. At the very least, once the operating system detects that a battery has failed, it disables the cache, causing an immediate and very pronounced (several orders of magnitude) slowdown of system processing.
While IBM i tracks life for each battery, the system issues only one warning message per IPL when the battery is nearing the end of its life.
Depending solely on that warning has clear drawbacks:
- Spotting a single warning message depends on an experienced operator who monitors QSYSOPR at all times and is ready to take immediate and appropriate action. This may have been feasible when networks were smaller and staffing was more generous, but it is now impossible for many busy data centers. That is why cache battery expiry messages are easily and often overlooked.
- The cache battery end-of-life message can come as a surprise, leading to unscheduled service calls and unplanned system downtime.
- The warning message indicates the status of only one battery, and checking battery status manually involves using SST on each partition in the network.
By using Cache Battery Monitoring in QSystem Monitor, you can continually keep track of expected battery life, monitor all batteries on all systems, and schedule replacements before low battery life threatens your operations. Knowing the status of all batteries enables IBM to replace several batteries at once, preventing multiple service calls and repeated system downtime.
Setting Up Cache Battery Monitors in QSystem Monitor
This functionality requires:
- QSystem Monitor
- Internal disk (not shared through VIOS)
The basic workflow for monitoring the cache batteries is:
- Configure SST Access in QSystem Monitor.
- Add a Cache Battery Life check in QSystem Monitor.
- View the Remaining Cache Battery Life.
- (Optional) Use QMessage Monitor and QRemote Control to send out warnings.
Configure SST Access in QSystem Monitor
Use the SST Access icon in the Monitor module of QSystem Monitor to configure the SST Access. Configuring SST Access requires you to know the password for an IBM i user with *SECADM and *SERVICE special authorities, and the username and password of an SST user ID.
We recommend using the SST Access Configuration to create a new IBM i user profile and new SST user ID that will be used for Cache Battery Monitoring exclusively.
After configuring the access, click the Test button to test the configuration.
SST Access has to be configured for each system for which you want to monitor cache battery life.
Configuration if subsystem QINTER is not used:
By default, SST Access requires that interactive jobs use the QINTER subsystem. This is the case with the IBM i default configuration. However, if your system has been configured so that interactive jobs are routed to a different subsystem, run the following commands on the system first:
CRTDEVDSP DEVD(MSMSST) DEVCLS(*VRT) TYPE(5251) MODEL(11) CTL(QPACTL01) KBDTYPE(USB) TEXT('QSystem Monitor SST virtual device')
ADDWSE SBSD(MSM) WRKSTN(MSMSST)
ADDWSE SBSD() WRKSTN(MSMSST) AT(*ENTER)
Add a Cache Battery Life Check in QSystem Monitor
In the Monitor module, click the Monitor something new button. This displays the Data Definition dialog box.
In this dialog box, ensure that the Filter drop-down list is set to *All or to IOP Monitors.
In the Monitor Type drop-down list, select Cache Battery Life.
To create the check for all systems, leave the All Systems option checked. To only create the check for one system or for a subset of all systems, uncheck the All Systems option and select the corresponding systems.
Click OK. This creates the check for the selected systems.
To display the check, ensure the Data Selection pane is displayed. Scroll the pane until the Cache Battery Life node is visible. Then drag and drop the Predicted Cache Battery Life check from the pane to the System View. This will display the check with its current value.
Viewing the Remaining Cache Battery Life
In the Monitor, the real-time value for the element is displayed. The value corresponds to the estimated days until the battery fails. If more than one battery is present, the value is the value for the battery that will fail first.
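The rule described above (with several batteries, the element shows the value of the battery that will fail first) amounts to taking the minimum of the per-battery estimates. The following Python sketch illustrates only that rule. The `Battery` record and its field names are hypothetical and are not taken from QSystem Monitor's data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Battery:
    # Hypothetical record; field names are illustrative, not a QSystem Monitor API.
    serial: str
    days_remaining: int


def displayed_value(batteries: Iterable[Battery]) -> Optional[int]:
    """Return the estimate for the battery expected to fail first.

    Returns None when no batteries are reported.
    """
    return min((b.days_remaining for b in batteries), default=None)
```

For example, with two batteries estimated at 410 and 95 days, this function returns 95, which corresponds to the single value shown for the element.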
To display additional information, right-click the element and select Show Details from the menu. The details include per-battery information such as the battery serial number, the battery position, and the number of days in use.
Using QMessage Monitor and QRemote Control to Send Out Warnings
If you are using QMessage Monitor and QRemote Control in addition to QSystem Monitor, you can use these to send out warnings by email or SMS/text message when the remaining battery lifetime is too low.
To achieve this:
In QSystem Monitor, modify the threshold for the Cache Battery Lifetime data definition and enable the “Send Message to iSeries” option.
In QMessage Monitor, configure an automated response and an escalation to forward the message from QSystem Monitor. The message can have a message ID of MON0079, MON0279 or MON0379. (You can use a Value List to group these message IDs together.)
As a result, an email and/or text message is sent out when the cache battery nears its predicted failure date.
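Grouping the three message IDs in a Value List is conceptually a set-membership test: a message is forwarded when its ID belongs to the group. The Python sketch below shows that idea only. The function name is hypothetical, and the code does not represent how QMessage Monitor stores or evaluates Value Lists.

```python
# Message IDs named in the text, grouped the way a Value List would group them.
CACHE_BATTERY_MESSAGE_IDS = frozenset({"MON0079", "MON0279", "MON0379"})


def should_forward(message_id: str) -> bool:
    """Return True when the message ID belongs to the cache battery group.

    Hypothetical helper: surrounding whitespace and letter case are
    normalized before the lookup as a convenience for this sketch.
    """
    return message_id.strip().upper() in CACHE_BATTERY_MESSAGE_IDS
```

Keeping the IDs in one group means the automated response and the escalation need to be defined once rather than once per message ID.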