05-Mar-2005

Another patch problem

Last Monday we installed VMS732_SYS-V0600 on our development cluster in preparation for installation on the production clusters. This was prompted by the response from Oracle when we reported the Rdb hang that resulted in my discovery of the NETDRIVER bug that I described in the previous entry.

We have a bunch of command procedures that monitor the disk farm, looking for things like shadow set membership changes and so forth. Most of these broke with the installation of VMS732_SYS-V0600.

On investigation, it became obvious that the patch had made a functional change to the way that f$getdvi(device, "shdw_next_mbr_name") works. Instead of returning a null string on the call after the last physical device was returned, the lexical repeatedly returned the same physical device, causing command procedures to go into an infinite loop. And this didn't happen on all shadow sets, just some.
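For anyone who hasn't written one of these loops, here is a rough sketch of the usual member walk, with a guard added so that it gives up if the lexical keeps handing back the same device. This is not one of our actual procedures: the virtual unit name DSA1:, the symbol names, the labels and the limit of 16 are all made up for illustration, and the guard only works around the symptom described above.

```dcl
$! Sketch only: walk the members of a shadow set, with a loop guard.
$! DSA1:, the symbol names, the labels and the limit are hypothetical.
$ vu = "DSA1:"                  ! assumed virtual unit name
$ max_members = 16              ! arbitrary safety limit, not a VMS constant
$ count = 0
$ previous = ""
$! Asking the virtual unit returns a member of the set.
$ member = f$getdvi(vu, "SHDW_NEXT_MBR_NAME")
$next_member:
$! A null string is the normal end-of-list indication.
$ if member .eqs. "" then goto finished
$! Guard: same device returned twice in a row, or too many members seen.
$ if (member .eqs. previous) .or. (count .ge. max_members) then goto stuck
$ write sys$output "Member: ", member
$ count = count + 1
$ previous = member
$! Asking a member returns the next member, or "" after the last one.
$ member = f$getdvi(member, "SHDW_NEXT_MBR_NAME")
$ goto next_member
$stuck:
$ write sys$output "Member list did not terminate for ", vu
$finished:
$ exit
```

Without the guard, the loop depends entirely on the null string coming back after the last member, which is exactly what stopped happening on the affected shadow sets.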

When I reported this to Engineering, it turned out that someone had beaten me to it, and the problem was already known. Support supplied me with an unofficial kit containing the corrected version of IO_ROUTINES.EXE. Based on this image, I can only assume that the problem is underlying corruption to the I/O database itself. Not good.

The major failure, in my humble opinion, is that the patch containing what I would rate a critical problem is still out there on HP's FTP server, waiting for some unsuspecting systems manager to download and install it. No warnings, no nothing.

What's up with HP's quality assurance? We as OpenVMS customers have come to expect better than this. At least give us some notification that the kit contains a known problem. Sheesh.

Posted at March 5, 2005 9:05 AM