Retrieve the data from Corrupted oracle table partition
Oracle Data Recovery Query
Need help with corrupted Oracle Datapump export
I have a corrupted Datapump export (Oracle 11g). How do I use your tool to extract 2 tables from it?
I have downloaded DUL4108.
feedback:
We can provide a data recovery service for corrupted Datapump exports. Extracting data from a Datapump file is not a packaged function in PRM-DUL 4108.
How large is the export? Could you please send us the Datapump file?
PRM-DUL capabilities
Oracle data recovery without any system file.
Help to recover Oracle database
Question :
oracle Tablespace deleted issue
If you cannot recover data by yourself, ask Parnassusdata, the professional ORACLE database recovery team for help.
Parnassusdata Software Database Recovery Team
Service Hotline: +86 13764045638 E-mail: service@parnassusdata.com
How to Resolve ORA-00600 [3020] when Allow 1 Corruption Does Not Work
ORA-00600 [kfcema02]: the diskgroup cannot be brought up.
Collecting The Required Information For Support To Troubleshoot Oracle ASM/ASMLIB Issues.
1) This document lists the steps to collect the information that support requires to troubleshoot and diagnose ASM/ASMLIB issues.
2) Obtain the most recent ASMLIB and ASM state from your current environment.
Solution
1) To check whether the ASMLIB API is correctly configured, please execute the following commands and provide us the output (from each node if this is RAC):
$> cat /etc/*release
$> uname -a
$> rpm -qa |grep oracleasm
$> df -ha
$> /usr/sbin/oracleasm configure
$> /sbin/modinfo oracleasm
2) Check the discovery path (from each node if this is RAC):
$> /etc/init.d/oracleasm status
$> /usr/sbin/oracleasm-discover
$> /usr/sbin/oracleasm-discover 'ORCL:*'
3) Please check if the ASMLIB devices can be accessed (from each node if this is RAC):
$> /etc/init.d/oracleasm scandisks
$> /etc/init.d/oracleasm listdisks
$> /etc/init.d/oracleasm querydisk -p <each disk from previous output>
$> ls -l /dev/oracleasm/disks
$> /sbin/blkid
4) Upload the next files from each node if this is RAC:
=)> /var/log/messages*
=)> /var/log/oracleasm
=)> /etc/sysconfig/oracleasm
5) Please show us the partition table (from each node if this is RAC):
$> cat /proc/partitions
6) If you are using multipath devices (mapper devices or emcpower) then show me the output of:
$> ls -l /dev/mpath/*
$> ls -l /dev/mapper/*
$> ls -l /dev/dm-*
$> ls -l /dev/emcpower*
Or if you have another multipath configuration then list the devices:
$> ls -l /dev/<multi path device name>*
7) Finally, connect to your ASM instance, execute the following script and upload the output file (from each node if this is RAC):
spool asm<#>.html
SET MARKUP HTML ON
set echo on
set pagesize 200
alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
select 'THIS ASM REPORT WAS GENERATED AT: ==)> ' , sysdate " " from dual;
select 'HOSTNAME ASSOCIATED WITH THIS ASM INSTANCE: ==)> ' , MACHINE " " from v$session where program like '%SMON%';
select * from v$asm_diskgroup;
SELECT * FROM V$ASM_DISK ORDER BY GROUP_NUMBER,DISK_NUMBER;
SELECT * FROM V$ASM_CLIENT;
select * from V$ASM_ATTRIBUTE;
select * from v$asm_operation;
select * from gv$asm_operation;
select * from v$version;
show parameter asm
show parameter cluster
show parameter instance_type
show parameter instance_name
show parameter spfile
show sga
spool off
exit
Note: please compress those files into a single file (*.zip or *.tar) and upload it through Metalink.
8) Also, if this is not a new ASM/ASMLIB implementation, please describe in detail what has changed since this last worked (OS patches, OS kernel upgrade, SAN migration, etc.).
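The read-only operating system commands from the steps above can be gathered into one report per node. The following is a minimal sketch and not an Oracle-supplied script: the helper name run_and_log and the report path are assumptions made for illustration, and commands that are missing on a node are simply recorded as failed.

```shell
#!/bin/sh
# Sketch: collect read-only ASMLIB diagnostics into one text report per node.
# run_and_log and the REPORT path are illustrative names, not Oracle tooling.

REPORT=${REPORT:-/tmp/asmlib_report_$(uname -n).txt}

run_and_log() {
    # Run the command given as arguments; append a banner and its output.
    {
        echo "===== $* ====="
        "$@" 2>&1 || echo "(command failed or is not installed)"
        echo
    } >> "$REPORT"
}

: > "$REPORT"                       # start with an empty report
run_and_log uname -a
run_and_log sh -c 'cat /etc/*release'
run_and_log sh -c 'rpm -qa | grep oracleasm'
run_and_log /usr/sbin/oracleasm configure
run_and_log /sbin/modinfo oracleasm
run_and_log /usr/sbin/oracleasm-discover 'ORCL:*'
run_and_log ls -l /dev/oracleasm/disks
run_and_log cat /proc/partitions
echo "report written to $REPORT"
```

Run it as the same user on every RAC node, then compress the resulting reports together with the log files listed in step 4.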
Example:
# up2date -i oracleasm-support oracleasmlib oracleasm-`uname -r`
The above command will install only 2 packages (oracleasm-support and oracleasmlib):
[oracle@cstdb02 database]$ cat /etc/*release
Enterprise Linux Enterprise Linux Server release 5.7 (Carthage)
Oracle Linux Server release 5.7
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
[oracle@cstdb02 database]$ uname -a
Linux cstdb02.cstdi.com 2.6.32-200.20.1.el5uek #1 SMP Fri Oct 7 02:29:42 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[oracle@cstdb02 database]$ rpm -qa |grep oracleasm
oracleasm-support-2.1.7-1.el5
oracleasmlib-2.0.4-1.el5
This is because the driver package is now embedded in the UEK kernel:
[root@cstdb02 database]# modinfo oracleasm
filename: /lib/modules/2.6.32-200.20.1.el5/kernel/drivers/block/oracleasm/oracleasm.ko
description: Kernel driver backing the Generic Linux ASM Library.
author: Joel Becker <joel.becker@oracle.com>
version: 2.0.6
license: GPL
srcversion: BB13CDD65668CBDA51D0C25
depends:
vermagic: 2.6.32-200.20.1.el5 SMP mod_unload
Understanding and fixing ORACLE ASM errors ORA-600 [kfcChkAio01] and ORA-15196.
Symptoms
Errors ORA-600 [kfcChkAio01] and ORA-15196 can be reported after a non-clean dismount of the diskgroup, normally caused by a crash of the ASM instance.
When the ASM instance is restarted and the diskgroup is mounted, the following messages are reported in the alert.log of the ASM instance:
* Messages indicating recovery:
NOTE: starting recovery of thread=2 ckpt=139.4186 group=2
* The messages about the error ORA-600 and ORA-15196:
Errors in file /u01/app/oracle/product/10.2.0/asm/admin/+ASM/udump/+asm2_ora_15305.trc:
ORA-00600: internal error code, arguments: [kfcChkAio01], [], [], [], [], [], [], []
ORA-15196: invalid ASM block header [kfc.c:5552] [endian_kfbh] [2079] [2147483648] [1 != 0]
Abort recovery for domain 2
NOTE: crash recovery signalled OER-600
ERROR: ORA-600 signalled during mount of diskgroup FLASH
As a result the diskgroup is dismounted. Subsequent mounts will report same set of errors.
Bug 7589862 was created for this case.
Cause
For diagnosis and identification of the problem, important information is dumped into the trace file generated by the errors.
The call stack on the trace
kfgscFinalize <- kfgForEachKfgsc <- kfgsoFinalize <- kfgFinalize <- kfxdrvMount <- kfxdrvEntry
The functions on the call stack indicate operations such as mount diskgroup (kfxdrvMount) and recovery (kfrcrv).
Description of the errors
- ORA-00600: internal error code, arguments: [kfcChkAio01], [], [], [], [], [], [], []
kfcChkAio01 will be signaled if the IO operation failed because of an invalid block.
- ORA-15196: invalid ASM block header [kfc.c:5552] [endian_kfbh] [2079] [2147483648] [1 != 0]
This error is reported when a block fails validation. The arguments after the function and line number are:
Argument | Meaning |
---|---|
endian_kfbh | The first field in the block header. This is the field that failed validation. |
2079 | The ASM file number. This value will differ from case to case. |
2147483648 | The block number found in kfbh.block.blk, another field in the block header. In hex this is 0x80000000; the low-order bytes hold the block number. |
1 != 0 | 1 was the value found in the field named in the first row, but 0 was the expected value. |
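The hex conversion of the block number argument can be reproduced in the shell. This is a small sketch under the assumption that the value is a 32-bit number whose top bit is what kfed prints as T and whose remaining 31 bits are what kfed prints as NUMB; decode_blk is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: split a kfbh.block.blk value into the T and NUMB parts.
# decode_blk is an illustrative helper name, not part of kfed.
decode_blk() {
    n=$1
    t=$(( ($n >> 31) & 1 ))         # top bit of the 32-bit value
    numb=$(( $n & 2147483647 ))     # remaining 31 bits (mask 0x7fffffff)
    printf 'hex=0x%08X T=%d NUMB=0x%x\n' "$n" "$t" "$numb"
}

decode_blk 2147483648    # the block number argument of the ORA-15196 above
```

For 2147483648 this prints hex=0x80000000 T=1 NUMB=0x0, that is, block 0 with the top bit set.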
The trace file will have the information about the Cache Element and Buffer header affected by the error:
Start recovery for domain 2, valid = 0, flags = 0x4
NOTE: starting recovery of thread=1 ckpt=201.9904 group=2
NOTE: starting recovery of thread=2 ckpt=139.4186 group=2
CE: (0xc0000000153d0bb8) group=2 (FLASH) obj=2079 blk=0 (indirect)
hashFlags=0x0100 lid=0x0002 lruFlags=0x0000 bastCount=1
redundancy=0x11 fileExtent=0 AUindex=0 blockIndex=0
copy #0: disk=0 au=7492
BH: (0xc0000000153a54d0) bnum=322 type=rcv reading state=rcvRead chgSt=not modifying
flags=0x00000000 pinmode=excl lockmode=null bf=0xc000000015141000
kfbh_kfcbh.fcn_kfbh = -1.-1826817 lowAba=0.0 highAba=0.0
last kfcbInitSlot return code=null cpkt lnk is null
From the Cache Element, it is possible to identify the disk and allocation unit involved with the error:
copy #0: disk=0 au=7492
From the alert.log it is possible to identify the path of the disk. Review the file back in time and identify the last time the diskgroup was mounted without errors. Check for messages like:
NOTE: cache opening disk 0 of grp 2: FLASH_0000 path:/dev/rdsk/c29t1d4
* The third argument of error ORA-15196 indicates the ASM file number involved in the problem. This can also be validated against the information printed in the trace file, by searching for the words "KSTDUMP: In-memory trace dump":
KSTDUMP: In-memory trace dump
TIME(usecs):SEQ# ORAPID SID EVENT OP DATA
========================================================================
88894E39:000E0839 16 255 10495 20 kfcMoveLRU: gn=2 fn=2079 indblk=218 src=5 dest=2 line=3201
88894E39:000E083A 16 255 10495 3 kfcAddPin: pin=267 kfc.c 3289 excl bnum=189 class=0
88894E3B:000E083B 16 255 10495 10 kfcbpInit: gn=2 fn=2079 indblk=219 pin=268 excl rcvRead kfr.c 5524
88894E3C:000E083C 16 255 10495 12 kfcFlush: bnum=190 kfc.c 3179
88894E3C:000E083D 16 255 10495 11 kfcMakeFree: bnum=190 flags=00000000 kfc.c 3180
88894E3D:000E083E 16 255 10495 19 kfcMoveBucket: [ gn=2 fn=2079 indblk=26 ] --> [ gn=2 fn=2079 indblk=219 ]
From this line:
88894E39:000E0839 16 255 10495 20 kfcMoveLRU: gn=2 fn=2079 indblk=218 src=5 dest=2 line=3201
gn=2 is the diskgroup number
fn=2079 is the ASM file number
indblk=218 is the block where the indirect extent is stored
All the references on the In-memory trace dump will be for 256 blocks of the same file, in this case 2079.
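To confirm that every entry of the in-memory trace dump refers to the same file, the gn, fn and indblk values can be pulled out of the trace with sed. This is a minimal sketch; kst_fields is a hypothetical helper name, and when a line carries two gn/fn/indblk groups (as kfcMoveBucket does) it reports only the last one.

```shell
#!/bin/sh
# Sketch: print "gn fn indblk" for every trace line that carries all three.
# kst_fields is an illustrative helper; it reads trace text on stdin.
kst_fields() {
    sed -n 's/.*gn=\([0-9][0-9]*\) fn=\([0-9][0-9]*\) indblk=\([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Against a real trace file, list the distinct (diskgroup, file) pairs;
# a single output line means every entry points at the same file:
#   kst_fields < /path/to/trace.trc | awk '{print $1, $2}' | sort -u
echo '88894E39:000E0839 16 255 10495 20 kfcMoveLRU: gn=2 fn=2079 indblk=218 src=5 dest=2 line=3201' | kst_fields
```

The sample line prints "2 2079 218": diskgroup 2, ASM file 2079, indirect block 218.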
Validating the content of Allocation Unit, using kfed
Using kfed to dump the blocks on the Allocation Unit referenced on the Cache Element will show invalid data:
$kfed read /dev/rdsk/c29t1d4 aunum=7492 blknum=0 ausize=1048576|more
kfbh.hard: 66 ; 0x001: 0x42
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 89088 ; 0x004: T=0 NUMB=0x15c00
kfbh.block.obj: 11626 ; 0x008: TYPE=0x0 NUMB=0x2d6a
kfbh.check: 2182659237 ; 0x00c: 0x8218bca5
kfbh.fcn.base: 4293140479 ; 0x010: 0xffe41fff
kfbh.fcn.wrap: 4294967295 ; 0x014: 0xffffffff
kfbh.spare1: 4294967247 ; 0x018: 0xffffffcf
kfbh.spare2: 4294967295 ; 0x01c: 0xffffffff
All 256 blocks (0 through 255) will have similar content. The type will be KFBTYP_INVALID, which indicates that the content/type of the block is incorrect.
These errors occur because, during file creation, ASM incorrectly commits the allocation of an indirect extent before pre-formatting the extent to contain valid blocks. If a crash occurs in the middle of this operation, recovery finds the blocks of the indirect extent unformatted (kfbh.type: 0 ; 0x002: KFBTYP_INVALID) and signals the errors already mentioned.
Solution
If the patch is not available, the block has to be manually modified. Please carefully follow the procedure described next.
1. Download file patch.zip and copy to any directory on the server running ASM.
( If the downloaded patch.sh is giving any error for some reason, you can just copy/paste the patch.sh script as mentioned below in this document and run it after necessary modifications )
The zip file contains two files:
- empty_indirect.txt: which is the valid format of an indirect block.
- patch.sh: is a shell script used to patch the Allocation Unit having the blocks with the incorrect format.
2. Edit file empty_indirect.txt to make the following changes:
The modifications to the file apply to few fields from the block header.
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 12 ; 0x002: KFBTYP_INDIRECT
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj: 2901 ; 0x008: TYPE=0x0 NUMB=0xb55
kfbh.endian:
Possible values are:
1 for little endian processors
0 for big endian processors
Here is a list of the platforms:
PLATFORM_ID | PLATFORM_NAME | ENDIAN_FORMAT |
---|---|---|
4 | HP-UX IA (64-bit) | Big |
1 | Solaris[tm] OE (32-bit) | Big |
16 | Apple Mac OS | Big |
3 | HP-UX (64-bit) | Big |
9 | IBM zSeries Based Linux | Big |
6 | AIX-Based Systems (64-bit) | Big |
2 | Solaris[tm] OE (64-bit) | Big |
18 | IBM Power Based Linux | Big |
17 | Solaris Operating System (x86) | Little |
12 | Microsoft Windows 64-bit for AMD | Little |
13 | Linux 64-bit for AMD | Little |
8 | Microsoft Windows IA (64-bit) | Little |
15 | HP Open VMS | Little |
5 | HP Tru64 UNIX | Little |
10 | Linux IA (32-bit) | Little |
7 | Microsoft Windows IA (32-bit) | Little |
11 | Linux IA (64-bit) | Little |
kfbh.block.obj:
This is the ASM file number that was being created during the failure. It is the third argument referenced in error ORA-15196.
Because this example was on HP Itanium, with ASM file number 2079, the header of the block in file empty_indirect.txt should look like this:
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 12 ; 0x002: KFBTYP_INDIRECT
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj: 2079 ; 0x008: TYPE=0x0 NUMB=0xb55
When modifying files generated by kfed, it is required only to change the value on the left of the ';'.
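The two values that have to be edited can be cross-checked from the shell. The sketch below is only an illustration: host_endian is a hypothetical helper that reports the byte order of the machine it runs on, so it is meaningful only when run on the server that hosts the ASM disks. The hex conversion also shows why the untouched right-hand column of the template can still display the old value 0xb55 after the left-hand value has been changed to 2079.

```shell
#!/bin/sh
# Sketch: derive the kfbh.endian value for this host and show a file number in hex.
# host_endian is an illustrative helper name, not part of kfed.
host_endian() {
    # od reads the two bytes 01 00 as one 16-bit number in host byte order:
    # 1 on a little endian host, 256 on a big endian host.
    v=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')
    case $v in
        1)   echo 1 ;;
        256) echo 0 ;;
        *)   echo unknown ;;
    esac
}

echo "kfbh.endian for this host: $(host_endian)"
printf 'ASM file number 2079 in hex: 0x%x\n' 2079
```

The second line prints 0x81f, the hex form of file number 2079.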
2. Modify script patch.sh
i=0
while [ $i -le 255 ]
do
echo "write block $i"
kfed write ausz=1048576 blksz=4096 aunum=<AU#> blknum=$i dev=<path for ASM disk> text=/tmp/empty_indirect.txt
i=`expr $i + 1`
done
i=1
while [ $i -le 255 ]
do
echo "merge block $i"
blk=`expr 2147483648 + $i`
echo "kfbh.block.blk: $blk" > /tmp/merge
kfed merge ausz=1048576 blksz=4096 aunum=<AU#> blknum=$i dev=<path for ASM disk> text=/tmp/merge
i=`expr $i + 1`
done
The code in file patch.sh makes two changes:
- All the blocks in the allocation unit are replaced with the valid format for an indirect block. This is executed in the first loop.
- The second loop sets the correct value for field kfbh.block.blk, which includes the block number.
This script needs to be adapted for every particular case. The changes required are:
- aunum=<AU#>.
The Allocation Unit number is reported on the trace file generated by error ORA-600 and ORA-15196, right on the CE and BH area. It's the last line of the CE dump and before the BH.
hashFlags=0x0100 lid=0x0002 lruFlags=0x0000 bastCount=1
redundancy=0x11 fileExtent=0 AUindex=0 blockIndex=0
copy #0: disk=0 au=7492
In this example it is Allocation Unit 7492.
- dev=<path for ASM disk>
This is the full path of the ASM disk. The CE dump shows the disk number together with the Allocation Unit number. Earlier in this note it was explained how to find the complete path of the disk by reviewing the alert.log of the ASM instance. Using the v$asm* views is not an option because the diskgroup is dismounted.
- ausz=1048576.
It is extremely important to specify the correct size of the Allocation Unit of the diskgroup.
For this example, the version of patch.sh will be:
i=0
while [ $i -le 255 ]
do
echo "write block $i"
kfed write ausz=1048576 blksz=4096 aunum=7492 blknum=$i dev=/dev/rdsk/c29t1d4 text=/tmp/empty_indirect.txt
i=`expr $i + 1`
done
i=1
while [ $i -le 255 ]
do
echo "merge block $i"
blk=`expr 2147483648 + $i`
echo "kfbh.block.blk: $blk" > /tmp/merge
kfed merge ausz=1048576 blksz=4096 aunum=7492 blknum=$i dev=/dev/rdsk/c29t1d4 text=/tmp/merge
i=`expr $i + 1`
done
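Because patch.sh overwrites ASM metadata, it can help to print the exact kfed commands and review them before anything is executed. The following dry-run sketch makes no changes to the disk; print_patch_cmds is a hypothetical helper, and the kfed arguments are the ones used in patch.sh above. It derives the last block number from the AU size and block size instead of hard-coding 255.

```shell
#!/bin/sh
# Sketch: print the kfed commands that patch.sh would run, without running them.
# print_patch_cmds is an illustrative helper name.
print_patch_cmds() {
    au=$1; dev=$2; ausz=$3; blksz=$4
    last=$(( $ausz / $blksz - 1 ))    # 255 for a 1MB AU and 4096-byte blocks
    i=0
    while [ "$i" -le "$last" ]; do
        echo "kfed write ausz=$ausz blksz=$blksz aunum=$au blknum=$i dev=$dev text=/tmp/empty_indirect.txt"
        i=$(( $i + 1 ))
    done
    i=1
    while [ "$i" -le "$last" ]; do
        blk=$(( 2147483648 + $i ))
        echo "echo \"kfbh.block.blk: $blk\" > /tmp/merge; kfed merge ausz=$ausz blksz=$blksz aunum=$au blknum=$i dev=$dev text=/tmp/merge"
        i=$(( $i + 1 ))
    done
}

# Show the first and last command of each loop for the example in this note.
print_patch_cmds 7492 /dev/rdsk/c29t1d4 1048576 4096 | sed -n '1p;256p;257p;$p'
```

If the printed AU number, device path and sizes match the trace file and the alert.log, the real patch.sh can be run with the same values.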
3. Execute script patch.sh
4. Validate that blocks on the Allocation Unit have now the format of indirect extents block
Continuing with the example used in this note:
kfed read ausz=1048576 blksz=4096 aunum=7492 blknum=0 dev=/dev/rdsk/c29t1d4 |more
The output should be like:
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 12 ; 0x002: KFBTYP_INDIRECT
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj: 2079 ; 0x008: TYPE=0x0 NUMB=0x81f
5. After this, diskgroup should operate without problems.
ORA-15196 Oracle ASM CASE STUDY: UNDERSTANDING ERROR ORA-15196
This document explains error ORA-15196, including the details of each argument and suggestions for diagnosing the error, and closes with a case study based on a real problem reported by a customer.
Error Description
ORA-15196 is reported after a validation of an ASM metadata block has failed. The error will be reported in the following format:
ORA-15196: invalid ASM block header [1st] [2nd] [3rd] [4th] [5th != 6th]
Where the arguments indicate:
Argument Meaning
- 1st Function and line number in the code, where the exception is raised
- 2nd Field failing the validation
- 3rd ASM object number stored in the block
- 4th ASM block number stored in the block
- 5th Value associated with field referenced by argument 2
- 6th Expected value for field referenced by argument 2
Example:
ORA-15196: invalid ASM block header [kfc.c:7997] [endian_kfbh] [1] [93] [211 != 0]
Function and line number in the code, where the exception is raised = kfc.c:7997
Field failing the validation = endian_kfbh
ASM object number stored in the block = 1
ASM block number stored in the block = 93
Value associated with field referenced by argument #2 = 211
Expected value for field referenced by argument #2 = 0
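The same breakdown can be produced automatically from an alert.log or trace file. This is a minimal sketch that assumes the message keeps exactly the format shown above; parse_ora15196 and the output labels are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: split an ORA-15196 message into its six arguments.
# parse_ora15196 is an illustrative helper; it reads log text on stdin.
parse_ora15196() {
    sed -n 's/.*ORA-15196: invalid ASM block header \[\([^]]*\)] \[\([^]]*\)] \[\([^]]*\)] \[\([^]]*\)] \[\([^ ]*\) != \([^]]*\)].*/where=\1 field=\2 object=\3 block=\4 found=\5 expected=\6/p'
}

echo 'ORA-15196: invalid ASM block header [kfc.c:7997] [endian_kfbh] [1] [93] [211 != 0]' | parse_ora15196
```

For the example message this prints where=kfc.c:7997 field=endian_kfbh object=1 block=93 found=211 expected=0.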
Arguments description
- Function and line number in the code, where the exception is raised
In general terms this argument will be the same in most cases, because the exception is always raised in the same routine.
#define kfbValid(data, len, type, bl) \
kfbValidPriv(data, len, type, bl, __FILE__, __LINE__)
- Field failing the validation
The ASM metadata is composed of many different structures such as the file directory, disk directory, active change directory (ACDC), etc., which are organized by files (ASM file# between 1 and 255). Each file is made of extents, which are made of ASM blocks (4096 bytes). Each block has a generic block header (kfbh), and any of its fields can be validated.
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 80 ; 0x004: T=0 NUMB=0x50
kfbh.block.obj: 1 ; 0x008: TYPE=0x0 NUMB=0x1
kfbh.check: 4268948098 ; 0x00c: 0xfe72fa82
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
A short description of each of the fields referenced above (file kf3.h):
Field | Description |
---|---|
kfbh.endian | endianness of writer, big or little endian |
kfbh.hard | H.A.R.D. magic # and block size |
kfbh.type | metadata block type (type of ASM metadata) |
kfbh.datfmt | metadata block data format |
kfbh.block (blk, obj) | block location of this block |
kfbh.check | check value to verify consistency |
kfbh.fcn (base, wrap) | change number of last change |
kfbh.spare1 | zero pad out to 32 bytes |
kfbh.spare2 | zero pad out to 32 bytes |
The fields reported by this error across different service requests (SRs) are:
endian_kfbh
obj_kfbl
hard_kfbh
type_kfbh
datfmt_kfbh
check_kfbh
- ASM object number stored in the block
Every ASM metadata block belongs to a specific file associated with a specific ASM structure. That’s why ASM File numbers between 1 and 255 are used to identify the files storing those structures. The value on this field, references the ASM file number.
ASM File Number | ASM Metadata |
---|---|
1 | File Directory |
2 | Disk Directory |
3 | Active Change Directory (ACD) |
4 | Continuous Operations Directory (COD) |
5 | Template Directory |
6 | Alias Directory |
9 | Attributes Directory |
12 | Staleness Directory |
For other ASM metadata structures like PST, ATB, DISK HEADER, this field will have a static value 2147483648 (0x80000000).
- ASM block number stored in the block
An ASM file allocates extents, which are associated with Allocation Units. Each extent is made up of multiple ASM metadata blocks of 4096 bytes; with the default Allocation Unit size of 1MB, there are 256 blocks in each extent/AU.
The value stored on this field indicates the block number relative to a particular file. In this example, (93) is the block number, which will be stored in the first extent of the file. That extent will be allocated on a specific Allocation Unit of any of the disks in the diskgroup.
- Value associated with field referenced by argument #2
This is the value found in the block for the field referenced in argument #2.
- Expected value for field referenced by argument 2
This is the expected value for the field referenced by argument #2.
Having the description of all the arguments for error ORA-15196, it should be possible to have a better understanding of the message:
ORA-15196: invalid ASM block header [kfc.c:7997] [endian_kfbh] [1] [93] [211 != 0]
In the previous example, the field failing validation is endian_kfbh. The block belongs to file 1 (FILE DIRECTORY) and is relative block 93; the value found for endian_kfbh was 211 while the correct value should have been 0.
Diagnostics
Up to 10gR2, there are some bugs (patch included) related to this error.
5554692 | Related to indirect extent allocation. Please read the bug description in WebIV, because not all cases of ORA-15196 are this particular bug. |
6027802 | This was closed as not a bug, but was related to some IO issues caused by EMC Powerpath. Same type of data mismatch has been observed on other PP installations |
6453944 | ORA-15196 with ASM disks larger than 2TB using ASMLIB |
Most occurrences of this error are associated with data changed outside of ASM. This includes:
- Disks formatted at the OS level while in use by ASM
- Disks assigned to a file system while in use by ASM
- IO errors (stale writes)
- Usage of third-party software
Once this error is reported, the diskgroup needs to be recreated. There are situations where diskgroup cannot be mounted, or others where any reference to the metadata (recursive or non recursive), will signal the error and dismount the diskgroup.
Data Collection
To understand the extent of the problem and produce a correct diagnosis, it is essential to obtain the following data:
- Alert.log and trace file associated to the error
- First 300MB of the disk affected with the error
In the alert.log, review the line before the report of error ORA-15196:
WARNING: cache failed to read fn=1 blk=80 from disk(s): 0
ORA-15196: invalid ASM block header [kfc.c:7997] [endian_kfbh] [1] [93] [211 != 0]
The line prior to the report of error ORA-15196 indicates the disk storing the block: from disk(s): 0.
To get the first 300MB:
$dd if=<device path> of=/tmp/disk.dd bs=1048576 count=300
It may be necessary to provide partial copy of other disks in the diskgroup.
- Output from AMDU if available
AMDU will be explained with more detail in a different note (TBD).
This tool is part of the new features introduced with 11g. It reads the ASM disks and extracts information into different files. Those files contain a mapping of the ASM metadata and an image of the content of the disks; it is also possible to extract files from the diskgroup.
AMDU can extract the information even if the diskgroup is dismounted.
The mapping file is very important for the diagnostic of error ORA-15196. It has the specific location for each of the extents of each ASM metadata file.
Note 553639.1 is the placeholder for the AMDU binaries for some of the platforms.
Data Review
- Always review other blocks in the boundaries of the affected block. If more than one block has incorrect data (zeros), and they belong to different ASM structures (file directory, disk directory, etc), the damage was most likely caused outside of ASM: disk reformatted, assigned to another volume manager, etc.
Use kfed to extract the content of the blocks.
- Reviewing the trace file generated by the error.
The trace file will always print a dump of the ASM metadata block in memory, and also a short call stack. The output of the block is the same as that generated by kfed, which is readable by the user.
*** SERVICE NAME:() 2008-01-23 11:57:23.892
*** SESSION ID:(39.74) 2008-01-23 11:57:23.892
OSM metadata block dump:
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 80 ; 0x004: T=0 NUMB=0x50
kfbh.block.obj: 1 ; 0x008: TYPE=0x0 NUMB=0x1
kfbh.check: 4268948098 ; 0x00c: 0xfe72fa82
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
/* data remove on purpose */
After the OSM metadata block dump, the short call stack is printed:
----- Abridged Call Stack Trace -----
kfcReadBlk()+1276 kfcLoad()+2148 kffbScanNext()+252 kffbTableCb()+700 kfgTableCb()+1252 kffilTableCb()+240 qerfxFetch()+896 qersoFetch()+720 qerjotFetch()+184 opifch2()+8092 kpoal8()+4196 opiodr()+1548 ttcpip()+1284 opitsk()+1432 opiino()+1128 opiodr()+1548 opidrv()+896 sou2o()+80 opimai_real()+124 main()+152
- Compare the data in the trace file with the data extracted from disk using kfed.
By comparing the block dumped in the trace file with the block on disk, it is possible to identify the exact cause of the validation failure. Every case will be different, but if the data stored on disk is zeros, always remember to validate the adjacent blocks as well. If more blocks report invalid data (zeros), this is an indication that the disk has been formatted outside ASM.
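When a dd image of the disk is available, a quick way to spot blocks that were zeroed or filled with one repeated byte is to list the distinct byte values of a block. This is a rough sketch and no replacement for kfed: block_bytes is a hypothetical helper, it assumes the image starts at offset 0 of the ASM disk, and the default AU size of 1MB and block size of 4096 are assumptions that must match the diskgroup.

```shell
#!/bin/sh
# Sketch: list the distinct byte values found in one ASM block of a dd image.
# block_bytes is an illustrative helper name.
# Arguments: image file, AU number, block number, [AU size], [block size].
block_bytes() {
    img=$1; aunum=$2; blknum=$3; ausz=${4:-1048576}; blksz=${5:-4096}
    skip=$(( $aunum * ($ausz / $blksz) + $blknum ))   # block offset in the image
    dd if="$img" bs="$blksz" skip="$skip" count=1 2>/dev/null |
        od -An -v -tx1 | tr -s ' ' '\n' | grep . | sort -u
}

# A single output line such as "00" or "d4" means the whole block holds one
# repeated byte. Example call against the image taken with dd:
#   block_bytes /tmp/disk.dd 2 253
```

Running it in a loop over neighbouring block numbers shows how far the damage extends.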
Example 1:
This is an example of a block with invalid data. The type of the block is KFBTYP_INVALID, generated when an incorrect type is stored.
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 34 ; 0x001: 0x22
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 4290772992 ; 0x004: T=1 NUMB=0x7fc00000
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 13879 ; 0x010: 0x00003637
kfbh.fcn.wrap: 512 ; 0x014: 0x00000200
kfbh.spare1: 978943 ; 0x018: 0x000eefff
kfbh.spare2: 2054913149 ; 0x01c: 0x7a7b7c7d
Example 2:
The entire block is filled with 0xd4.
disk:0 au:2 block:253 file:1 physical extent:0 block:253
kfed read ausz=1048576 blksz=4096 aunum=2 blknum=253 dev=/dev/rdsk/c2t50060E8000C41384d2s6
kfbh.endian: 212 ; 0x000: 0xd4
kfbh.hard: 212 ; 0x001: 0xd4
kfbh.type: 212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt: 212 ; 0x003: 0xd4
kfbh.block.blk: 3570717908 ; 0x004: T=1 NUMB=0x54d4d4d4
kfbh.block.obj: 3570717908 ; 0x008: TYPE=0xd NUMB=0x4d4d4
kfbh.check: 3570717908 ; 0x00c: 0xd4d4d4d4
kfbh.fcn.base: 3570717908 ; 0x010: 0xd4d4d4d4
kfbh.fcn.wrap: 3570717908 ; 0x014: 0xd4d4d4d4
kfbh.spare1: 3570717908 ; 0x018: 0xd4d4d4d4
kfbh.spare2: 3570717908 ; 0x01c: 0xd4d4d4d4
kfbtTraverseBlock: Invalid OSM block type 212
0000: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
0020: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
0040: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
0060: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
0080: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
00a0: d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4 d4d4d4d4
CASE STUDY
The diskgroup held a copy of a database and had not been used for some months. For business reasons, that database was needed again. Mounting the diskgroup was possible, but when the database was mounted and the ASM metadata had to be read, error ORA-15196 was signaled and the diskgroup was dismounted.
The diskgroup was configured using external redundancy with a single disk and using the default Allocation Unit size of 1MB.
Data Collected
- The messages in the alert.log:
WARNING: cache failed to read fn=1 blk=256 from disk(s): 0
ORA-15196: invalid ASM block header [kfc.c:7997] [obj_kfbl] [1] [256] [3 != 1]
- The ASM block dumped in the trace file.
*** SESSION ID:(108.5) 2008-02-06 10:05:31.054
OSM metadata block dump:
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 7 ; 0x002: KFBTYP_ACDC
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 10752 ; 0x004: T=0 NUMB=0x2a00
kfbh.block.obj: 3 ; 0x008: TYPE=0x0 NUMB=0x3
kfbh.check: 1103194877 ; 0x00c: 0x41c16afd
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
- AMDU together with 300MB for the disk were collected.
Data Review
- The error:
WARNING: cache failed to read fn=1 blk=256 from disk(s): 0
ORA-15196: invalid ASM block header [kfc.c:7997] [obj_kfbl] [1] [256] [3 != 1]
The error provides the following information:
o The field failing the validation is obj_kfbl
o The block belongs to file 1 (fn=1). File 1 is the File Directory.
o The block is block 256 (blk=256)
o The value for obj_kfbl found was 3 but the expected value should be 1.
File extents, allocation units, and blocks in ASM are numbered from 0, and the block size is 4096 bytes. With the default AU size (1MB), there are 256 blocks per AU, so block 256 is stored in the second extent.
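The arithmetic behind that statement is a division and a remainder. The sketch below assumes, as in this case, that each extent is exactly one Allocation Unit; locate_block is a hypothetical helper name, and the defaults are the 1MB AU and 4096-byte block used here.

```shell
#!/bin/sh
# Sketch: map a file-relative block number to its extent and position in it.
# locate_block is an illustrative helper name.
# Arguments: block number, [AU size], [block size].
locate_block() {
    blk=$1; ausz=${2:-1048576}; blksz=${3:-4096}
    per_au=$(( $ausz / $blksz ))      # 256 blocks per AU with the defaults
    echo "extent=$(( $blk / $per_au )) block_in_extent=$(( $blk % $per_au ))"
}

locate_block 256    # the block reported in this case study
locate_block 93     # the block from the earlier endian_kfbh example
```

Block 256 maps to extent=1 block_in_extent=0 (the first block of the second extent), while block 93 maps to extent=0 block_in_extent=93.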
Although the diskgroup was mounted, any query referencing x$kffxp trying to get the extent mapping for file 1 failed. As a result, it was not possible to identify the AU used by block 256 from file 1 (the affected block).
- Using AMDU
One of the files generated by AMDU is the mapping file (*.map) . That file contains the location on disk for every extent of the files stored in the diskgroup. The only record for file 1 was:
N0001 D0000 R00 A00000002 F00000001 I0 E00000000 U00 C00256 S0001 B0002097152
This line indicates that for File 1 (F00000001), the first extent is stored in Allocation Unit 2 (A00000002) on disk 0 (D0000).
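The three fields explained above can be picked out of a map line by their leading letter. This is a minimal sketch that decodes only the D, A and F fields, since those are the only ones described in this note; map_fields is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract disk, Allocation Unit and file number from one AMDU map line.
# map_fields is an illustrative helper; only the D, A and F fields are decoded.
map_fields() {
    disk=; au=; file=
    for f in $1; do
        # Drop the leading letter, then leading zeros (keeping a lone 0).
        num=$(printf '%s\n' "${f#?}" | sed 's/^0*\([0-9]\)/\1/')
        case $f in
            D*) disk=$num ;;
            A*) au=$num ;;
            F*) file=$num ;;
        esac
    done
    echo "disk=$disk au=$au file=$file"
}

map_fields 'N0001 D0000 R00 A00000002 F00000001 I0 E00000000 U00 C00256 S0001 B0002097152'
```

For the record shown above this prints disk=0 au=2 file=1.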
There was no other entry for file 1 in the mapping file, but AMDU was generating a core dump. It was discovered that AMDU was trying to read Allocation Unit 50.
One of the cool things about AMDU is the ability to dump the content of a complete extent of a particular file, redirecting the output into a text file.
$amdu -diskstring '<path of device>' -dump '<diskgroup name>' -print 'DG.F1.X1.B0.C256'
The previous command will dump 256 blocks of File 1 Extent 1 starting at block 0.
The results of the last command were:
************************** PRINTING XYZ.F1.X1.B0.C2 **************************
-------------------------------- BLOCK 1 OF 2 --------------------------------
......................................................................
disk:0 au:50 block:0 file:1 physical extent:1 block:0
kfed read ausz=1048576 blksz=4096 aunum=50 blknum=0 dev=/emea/bde/home/users/jfiguer2/disk.dd
At this point the conclusions were:
- The ASM metadata shows that Allocation Unit 50 from disk 0 belongs to File 1.
-------------------------------- BLOCK 1 OF 2 --------------------------------
......................................................................
disk:0 au:50 block:0 file:1 physical extent:1 block:0
kfed read ausz=1048576 blksz=4096 aunum=50 blknum=0 dev=/emea/bde/home/users/jfiguer2/disk.dd
- If the block belongs to file 1, the value for kfbh.block.obj field should have been 1 together with the value for kfbh.type, which should have been KFBTYP_FILEDIR. But that was not the case:
The error ORA-15196:
WARNING: cache failed to read fn=1 blk=256 from disk(s): 0
ORA-15196: invalid ASM block header [kfc.c:7997] [obj_kfbl] [1] [256] [3 != 1]
- The content dumped into the trace file was the same found on disk. The check validation failed because the data stored in the block was not part of the correct ASM metadata, in this case file directory.
The next step was to validate all the blocks in the same Allocation Unit. Those blocks belong to the same ASM metadata (KFBTYP_FILEDIR). One Allocation Unit is used exclusively by one unique file.
Example for block 1 from AU 50:
disk:0 au:50 block:1 file:1 physical extent:1 block:1
kfed read ausz=1048576 blksz=4096 aunum=50 blknum=1 dev=/emea/bde/home/users/jfiguer2/disk.dd
The solution
There was no backup available for the database stored on the diskgroup, so the diskgroup had to be kept mounted. This was done by patching the ASM metadata, replacing the content of the first block of Allocation Unit 50 with valid data.
It was not possible to rebuild the real data for the block 0, so it was replaced with block
- Additional patching was required, in order to adjust other fields in the block. Once the block was successfully patched, the diskgroup was mounted and queries on internal views did not dismount the diskgroup.
Opening the database reported errors when trying to identify one data file. The extent mapping for this file was stored in the patched block. Luckily that file was not relevant for the database. After setting the file offline, the database opened without errors.
Because it was not possible to guarantee the integrity of the diskgroup, it was recommended to take a backup of the database and rebuild the diskgroup.
Database Restore after Server's storage crash
extracted from data files
Test recovery with unsupported parameters
ORACLE DBF Data Recovery ??
It will run, but that is meant for dropping the tablespace after u open the database. U can also take the datafile offline and try to open the database, but I do not know if you will be able to read all tables in the tablespace, you should probably export the tables anyway.
Here are a few links that might help.
1. use DUL, I do not have experience with it, so u have to do some research.
2. Open the database by setting _ALLOW_RESETLOGS_CORRUPTION=TRUE in the init.ora. There is no 100% guarantee that the database will open. However, once the database is opened, you must immediately rebuild it. Database rebuild means: perform a full-database export, create a new and separate database, and import the recent export dump.
[http://www.dbspecialists.com/files/presentations/missing_logs.html]
3. Third, but it should be first, if u have Oracle Support, call them
import data from a DBF to oracle
Oracle Database Dead?
Merge data from oracle .dbf datafile
Unable to open the database after data block corruption
This is why you take backups.
If all the data you need is character type, you might try the unix strings command.