After a storage migration activity, a two-node Oracle RAC environment experienced Oracle Cluster Registry (OCR) corruption. Although OCR was restored successfully on a newly provisioned disk, ASM instances repeatedly terminated with internal errors. Further investigation revealed that the issue was not related to Oracle software or OCR restore procedures, but to an underlying disk visibility and sharing problem at the storage layer. This article walks through the symptoms, diagnostics, root cause, and final validation that confirmed the disk issue.
Environment Overview
- Oracle RAC: 2-node cluster
- ASM used for:
  - OCR/Voting diskgroup (OCRVOTE)
  - Database diskgroups
- Recent activity:
  - Storage migration performed by system/storage team
  - OCR diskgroup became corrupted post-migration
Symptoms
ASM startup repeatedly failed with messages similar to:
Begin lmon rcfg omni enqueue reconfig stage6
End lmon rcfg omni enqueue reconfig stage6
Begin lmon rcfg omni enqueue reconfig stage7
End lmon rcfg omni enqueue reconfig stage7
Reconfiguration complete
Followed by internal ASM failures:
ORA-00600: internal error code, arguments: [kfcInitPba15]
ERROR: ORA-600 thrown in RBAL
RBAL: terminating the instance
ORA-1092: opitsk aborting process
Instance terminated by RBAL
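When the same failure repeats across many restart attempts, it can help to pull the key fields out of the ASM alert log automatically. The following is a minimal Python sketch, not part of the original investigation: the function name `summarize_asm_failures` and the regular expressions are illustrative assumptions based only on the message wording quoted above, and the sample string reuses those messages rather than a real log file.

```python
import re

# Assumed patterns, derived only from the messages quoted in this article.
# Captures the first bracketed argument of an ORA-00600 line.
ORA600_RE = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]+)\]")
# Captures the name of the process reported as terminating the instance.
TERMINATED_RE = re.compile(r"Instance terminated by (\w+)")


def summarize_asm_failures(log_text):
    """Return (ora600_arguments, terminating_processes) found in log_text."""
    ora600_args = ORA600_RE.findall(log_text)
    terminators = TERMINATED_RE.findall(log_text)
    return ora600_args, terminators


# Sample text built from the messages quoted above (not a full alert log).
sample = """\
Reconfiguration complete
ORA-00600: internal error code, arguments: [kfcInitPba15]
ERROR: ORA-600 thrown in RBAL
RBAL: terminating the instance
ORA-1092: opitsk aborting process
Instance terminated by RBAL
"""

args, procs = summarize_asm_failures(sample)
print("ORA-00600 arguments:", args)
print("Terminating processes:", procs)
```

Running the same function over the alert log of each node makes it easy to confirm that both instances fail with the same argument and in the same process, which is the pattern described here.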
Important Observations
- ASM failed consistently during RBAL operations
- Failures occurred only when using the new OCR/Voting disk
- ASM startup never stabilized across both nodes
Validation Test That Changed Everything
To isolate the issue, we performed a controlled test:
Test Action
- Restored OCR onto an existing, known-good ASM diskgroup (used by a database)
- Ensured the diskgroup was:
  - Properly shared
  - Stable across both RAC nodes
Test Result
- ASM and the cluster stabilized immediately once OCR was on the known-good diskgroup
Root Cause Summary
The newly provisioned OCR disk had storage-level issues, such as:
- Disk not truly shared across both nodes
- Inconsistent LUN presentation
- Improper disk alignment or sector size
- Underlying storage replication or fencing mismatch
As a result:
- ASM metadata could not be initialized consistently
- RBAL encountered fatal internal errors
- ASM terminated to protect cluster integrity
This issue was not caused by Oracle RAC, ASM, or OCR restore procedures.
It was a pure storage-level problem introduced during migration.
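The sharing and presentation problems listed in this section all show up as differences in how each node sees the same disk. The sketch below illustrates that comparison in Python. It is a hypothetical helper, not the procedure used in this incident: the function `compare_disk_views`, the attribute keys (`wwid`, `size_bytes`, `sector_size`), the disk labels, and every value in the sample data are invented for illustration. How the per-node inventories are collected is deliberately left out.

```python
def compare_disk_views(node_a, node_b):
    """Compare two per-node disk inventories and list every discrepancy.

    Each inventory maps a disk label to a dict of attributes.
    Returns a list of (label, attribute, detail) tuples; empty means
    both nodes report identical information for every disk.
    """
    problems = []
    for label in sorted(set(node_a) | set(node_b)):
        # A disk that only one node can see is not truly shared.
        if label not in node_a or label not in node_b:
            missing_on = "node A" if label not in node_a else "node B"
            problems.append((label, "visibility", f"not visible on {missing_on}"))
            continue
        # Compare every attribute reported by either node.
        for attr in sorted(set(node_a[label]) | set(node_b[label])):
            a_val = node_a[label].get(attr)
            b_val = node_b[label].get(attr)
            if a_val != b_val:
                problems.append((label, attr, f"{a_val!r} != {b_val!r}"))
    return problems


# Invented sample data: the new OCR disk differs between nodes,
# while the database disk is presented identically.
node1 = {
    "ocr_new": {"wwid": "wwid-aaaa", "size_bytes": 10737418240, "sector_size": 512},
    "data01": {"wwid": "wwid-cccc", "size_bytes": 107374182400, "sector_size": 512},
}
node2 = {
    "ocr_new": {"wwid": "wwid-bbbb", "size_bytes": 10737418240, "sector_size": 4096},
    "data01": {"wwid": "wwid-cccc", "size_bytes": 107374182400, "sector_size": 512},
}

for problem in compare_disk_views(node1, node2):
    print(problem)
```

An empty result for the database disk and a non-empty result for the new OCR disk would match the behavior seen here, where only the newly provisioned disk caused ASM to fail.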
Conclusion
The decisive proof came from restoring OCR onto a known-good diskgroup, which immediately stabilized ASM and the cluster.
