Monday, July 18, 2011

Oracle ACFS Troubleshooting (11.2.0.2)


Todays efforts to create an ACFS volume on a linux cluster haven't been smooth – but finally there is success.

Problem 1: ASMCA was slow to startup.
Running a ps and searching for the pid of asmca process showed RSH tests (i.e. /usr/bin/rsh hostB /bin/true).  But RSHD is not running on this cluster, and these attempts take a while to timeout.  The problem was the prior SSH connection attempt was failing.  It needed to be primed with a command line connection (ssh hostB) due to a key change.

Problem 2: ACFS volumes didn't enable on remote nodes
Creating the ACFS volume and filesystem using ASMCA was successful, however the volumes were only enabled successfully on the primary node of the cluster (i.e. hostA).  All other nodes returned ORA-15477 (Cannot communicate with the volume driver).

The volume driver was running:
> acfsdriverstate
ACFS-9206: usage: acfsdriverstate [-orahome ] [-s]
> acfsdriverstate installed
ACFS-9203: true
> acfsdriverstate loaded
ACFS-9203: true
> acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(x86_64).
ACFS-9326:     Driver Oracle version = 100804.1.
> acfsdriverstate supported
ACFS-9200: Supported

On at least one of the nodes, the mount directory had the incorrect group (this cluster uses the legacy dba group rather than oinstall):
> chmod dba /u04

Re-installing ACFS allowed further progress.  A few web searches resulted in similar scenarios requiring a re-install after every node boot.  The root cause was unknown.  time will tell whether this is a necessary workaround for this cluster.
> sudo su
> acfsroot install

Enabling of the volumes was then successful
> sudo su – oracle
> . oraenv << "+ASM2"
> asmcmd volenable -a

ASMCA was successful mounting volumes on all but one node.  The final node needed to be manually mounted.
> sudo su
> mount –t acfs /dev/asm/acfs1-351 /u04

Thought of the day/week: Working with computers is rarely boring but frequently frustrating.  They can be as unpredictable as people.

Friday, July 1, 2011

Hidden and Undocumented "Cardinality Feedback"

I just stumbled across this in an 11gR2 environment. Initially I though it was part of Adaptive Cursors Sharing and Bind Awareness, but no. The impacted queries have not been flagged as bind sensitive or bind aware. The DBMS_XPLAN cursor plan includes the note "cardinality feedback used for this statement".

A quick search found this article...

We Do Streams: Hidden and Undocumented "Cardinality Feedback"

There is an article written on this topic by the CBO dev team:
http://blogs.oracle.com/optimizer/entry/cardinality_feedback


During the first execution of a SQL statement, an execution plan is generated as usual. During optimization, certain types of estimates that are known to be of low quality (for example, estimates for tables which lack statistics or tables with complex predicates) are noted, and monitoring is enabled for the cursor that is produced. If cardinality feedback monitoring is enabled for a cursor, then at the end of execution, some of the cardinality estimates in the plan are compared to the actual cardinalities seen during execution. If some of these estimates are found to differ significantly from the actual cardinalities, the correct estimates are stored for later use. The next time the query is executed, it will be optimized again, and this time the optimizer uses the corrected estimates in place of its usual estimates.

We can tell that cardinality feedback was used because it appears in the note section of the plan. Note that you can also determine this by checking the USE_FEEDBACK_STATS column in V$SQL_SHARED_CURSOR.


See also the VLDB document and presentation.

Plus Jonathan Lewis post on Cardinality Feedback.

This behaviour can be controlled by _optimizer_use_feedback.

Thursday, June 2, 2011

HTC Desire dropped in water

My HTC Desire has been a wonderful phone whcih I've had for a little over a year without mishap ... until last night (Thu 6.00pm) when it fell into the toilet.  At least it had been flushed beforehand, so it was clean water.

Before going anything else, I took off the cover, took out the battery, took out the sim, took out the memory card, and stuck it in a container of rice overnight, and prayed.

This morning (Fri 7:00am, 13 hours later) I tried it.  It seemed ok with a small hint of some capacitive loss.

One hour later on the way to work, I tried it again.  Oh.  Quite a lot of capacitive loss, requiring very heavy touches and swipes.  Not good.

So it's now apart again and hopefully continuing to dry out in a dry air conditioned room.

Hint - make sure you do up the zip when putting it back in it's belt pouch.

Further updates to come ... all is not lost ... yet.

Fri 4:40pm (23 hours after the event): My latest test results on the phone are promising after giving it more time to dry in an air conditioned room.  But it's too early to be 100% conclusive as I only had it on for a few minutes.  I'll give it another day to dry out.

Mon 9:00am (3.5 days after the event): Turned it on for the first time in 2.5 days this morning.  All appears to be ok.  I'll now treat is "normally" and see how it goes.  It's currently on charge (a low charge rate via a laptop usb port).
Tue 9:00am: All still ok.

[18-July-2011] 1.5 months later - all is well.

[20-July-2012] 13.5 months later - all is well apart from the camera which has lost it's clarity and focussing ability (which for all I know could be unrelated to the water incident).

[26-April-2013] All is still well.

[9-October-2013] All is still well.

Wednesday, April 27, 2011

KGH: NO ACCESS

Finally there is information on the KGH: NO ACCESS shared pool component appearing on BLOGs and at Oracle Support.

Friday, April 15, 2011

Enjoy The Ride

Excellent road safety video from WA. You can feel yourself slowing down. Awesome.