Monday, July 18, 2011

Oracle ACFS Troubleshooting (11.2.0.2)


Today's efforts to create an ACFS volume on a Linux cluster haven't been smooth, but finally there is success.

Problem 1: ASMCA was slow to start up.
Running ps and searching for the PID of the asmca process showed RSH tests (i.e. /usr/bin/rsh hostB /bin/true).  But rshd is not running on this cluster, and these attempts take a while to time out.  ASMCA only reached the RSH tests because the prior SSH connection attempt was failing.  Due to a key change, SSH needed to be primed with a command-line connection (ssh hostB).
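
A quick way to spot this before launching ASMCA is to test non-interactive SSH to every node. The following is only a minimal sketch: the function name check_ssh_nodes and the SSH_BIN variable are made up for illustration (SSH_BIN exists so the loop can be dry-run with a stub), and the host names are the placeholders used above.

```shell
#!/bin/sh
# Hypothetical helper, not part of any Oracle tooling.
# SSH_BIN is an assumption: it lets the check be dry-run with a stub command.
SSH_BIN="${SSH_BIN:-ssh}"

# Print OK/FAIL for each host given as an argument.
# Returns 1 if any host could not be reached without a prompt.
check_ssh_nodes() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for input;
        # ConnectTimeout keeps an unreachable node from stalling the loop.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$host" /bin/true \
                </dev/null >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (run as the software owner on each node in turn):
#   check_ssh_nodes hostA hostB
```

Any host reported as FAIL is a candidate for the manual "ssh hostB" priming described above.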

Problem 2: ACFS volumes didn't enable on remote nodes
Creating the ACFS volume and filesystem using ASMCA was successful; however, the volumes were only enabled on the primary node of the cluster (i.e. hostA).  All other nodes returned ORA-15477 (Cannot communicate with the volume driver).

The volume driver was running:
> acfsdriverstate
ACFS-9206: usage: acfsdriverstate [-orahome ] [-s]
> acfsdriverstate installed
ACFS-9203: true
> acfsdriverstate loaded
ACFS-9203: true
> acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(x86_64).
ACFS-9326:     Driver Oracle version = 100804.1.
> acfsdriverstate supported
ACFS-9200: Supported
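
When repeating these checks on several nodes, they can be wrapped in a small function. This is a rough sketch that relies only on the "ACFS-9203: true" message shown above; the function name acfs_driver_ok and the ACFS_STATE_BIN variable are hypothetical, and output on other versions may differ.

```shell
#!/bin/sh
# Hypothetical wrapper around the acfsdriverstate calls shown above.
# ACFS_STATE_BIN is an assumption so the logic can be tried with a stub.
ACFS_STATE_BIN="${ACFS_STATE_BIN:-acfsdriverstate}"

# Report whether the driver is installed and loaded.
# Stops and returns 1 at the first check that does not print "ACFS-9203: true".
acfs_driver_ok() {
    for check in installed loaded; do
        out=$("$ACFS_STATE_BIN" "$check" 2>&1)
        case "$out" in
            *"ACFS-9203: true"*) echo "$check: yes" ;;
            *) echo "$check: no"; return 1 ;;
        esac
    done
    return 0
}

# Example:
#   acfs_driver_ok || echo "driver problem on $(hostname)"
```

Note that, as this post found, a driver that reports installed and loaded can still fail with ORA-15477, so a passing check only rules out the simplest causes.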

On at least one of the nodes, the mount directory had the incorrect group (this cluster uses the legacy dba group rather than oinstall):
> chgrp dba /u04

Re-installing ACFS allowed further progress.  A few web searches turned up similar scenarios that required a re-install after every node boot.  The root cause was unknown.  Time will tell whether this workaround is necessary for this cluster.
> sudo su
> acfsroot install

Enabling the volumes was then successful:
> sudo su - oracle
> . oraenv <<< "+ASM2"
> asmcmd volenable -a

ASMCA successfully mounted the volumes on all but one node.  On the final node the volume had to be mounted manually.
> sudo su
> mount -t acfs /dev/asm/acfs1-351 /u04
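
To confirm on each node whether the mount actually took, the mount table can be checked for an acfs entry at the mount point. The sketch below assumes a Linux /proc/mounts layout; the function name is_acfs_mounted is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory is mounted as type acfs.
# Usage: is_acfs_mounted MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts (an assumption about the platform).
is_acfs_mounted() {
    mountpoint_dir=$1
    mounts_file=${2:-/proc/mounts}
    # /proc/mounts fields: device, mount point, filesystem type, options, ...
    awk -v mp="$mountpoint_dir" \
        '$2 == mp && $3 == "acfs" { found = 1 } END { exit !found }' \
        "$mounts_file"
}

# Example:
#   is_acfs_mounted /u04 || echo "/u04 is not mounted as acfs on $(hostname)"
```

Matching on both the mount point and the filesystem type avoids a false positive when something else happens to be mounted at /u04.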

Thought of the day/week: Working with computers is rarely boring but frequently frustrating.  They can be as unpredictable as people.

1 comment:

Anonymous said...

Hi,

This was my exam question, still not sure about the answer.
Any help is much appreciated.

Which three fragments will complete this statement correctly?
In a cluster environment, an ACFS volume

a) will be automatically mounted by a node on reboot by default
b) must be manually mounted after a node reboot
c) will be automatically mounted by a node if it is defined as cluster stack startup if it is included in the ACFS mount registry
d) will be automatically mounted to all nodes if it is defined as a cluster resource when a dependent cluster resource requires access
e) will be automatically mounted to all nodes in the cluster when the file system is registered
f) must be mounted before it can be registered