
MySQL Recover Table Structure From InnoDB Dictionary

When a table gets dropped, MySQL removes the respective .frm file. This post explains how to recover the table structure if the table was dropped.
 
You need the table structure to recover a dropped table from an InnoDB tablespace. The B+tree structure of an InnoDB index doesn’t contain any information about field types, and MySQL needs to know them in order to access records of an InnoDB table. Normally MySQL gets the table structure from the .frm file, but when MySQL drops a table the respective .frm file is removed too.
 
Fortunately there is one more place where MySQL keeps the table structure: the InnoDB dictionary.
 
The InnoDB dictionary is a set of tables where InnoDB keeps some information about the tables. I reviewed them in detail in a separate InnoDB Dictionary post earlier. After the DROP, InnoDB deletes the records related to the dropped table from the dictionary. So we need to recover the deleted records from the dictionary and then get the table structure.
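On a live server (MySQL 5.6 and later) you can peek at the same dictionary through INFORMATION_SCHEMA; a minimal illustration, assuming the sakila schema is installed:

# mysql -e "SELECT TABLE_ID, NAME, N_COLS FROM information_schema.INNODB_SYS_TABLES WHERE NAME = 'sakila/actor'"

Once the table is dropped this row disappears as well, which is why the deleted dictionary records have to be carved out of ibdata1.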
 
Compiling Data Recovery Tool
First, we need to get the source code. The code is hosted on LaunchPad.
 
 
# bzr branch lp:undrop-for-innodb
 
To compile it we need gcc, bison and flex.
 
 
# make
cc -g -O3 -I./include -c stream_parser.c
cc -g -O3 -I./include  -pthread -lm stream_parser.o -o stream_parser
flex  sql_parser.l
bison  -o sql_parser.c sql_parser.y
sql_parser.y: conflicts: 6 shift/reduce
cc -g -O3 -I./include -c sql_parser.c
cc -g -O3 -I./include -c c_parser.c
cc -g -O3 -I./include -c tables_dict.c
cc -g -O3 -I./include -c print_data.c
cc -g -O3 -I./include -c check_data.c
cc -g -O3 -I./include  sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser -pthread -lm
cc -g -O3 -I./include -o innochecksum_changer innochecksum.c
 
 
Recover InnoDB Dictionary
Now let’s create the dictionary tables in database sakila_recovered. The data recovery tool comes with the structure of the dictionary tables.
 
 
 
 
# cat dictionary/SYS_* | mysql sakila_recovered
 
 
The dictionary is stored in the ibdata1 file. So, let’s parse it.
 
 
 
# ./stream_parser -f /var/lib/mysql/ibdata1
 
...
 
Size to process:                  79691776 (76.000 MiB)
Worker(0): 84.13% done. 2014-09-03 16:31:20 ETA(in 00:00:00). Processing speed: 7.984 MiB/sec
Worker(2): 84.21% done. 2014-09-03 16:31:20 ETA(in 00:00:00). Processing speed: 8.000 MiB/sec
Worker(1): 84.21% done. 2014-09-03 16:31:21 ETA(in 00:00:00). Processing speed: 4.000 MiB/sec
All workers finished in 2 sec
 
 
 
Now we need to extract the dictionary records from InnoDB pages. Let’s create a directory for table dumps.
 
 
 
# mkdir -p dumps/default
 
 
And now we can generate the table dumps and the LOAD DATA INFILE commands to load them. We also need to specify the -D option to c_parser because the records we need were deleted from the dictionary when the table was dropped.
 
SYS_TABLES
 
 
# ./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page \
    -t dictionary/SYS_TABLES.sql \
    > dumps/default/SYS_TABLES \
    2> dumps/default/SYS_TABLES.sql
 
 
 
SYS_INDEXES
 
# ./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page \
    -t dictionary/SYS_INDEXES.sql \
    > dumps/default/SYS_INDEXES \
    2> dumps/default/SYS_INDEXES.sql
 
 
SYS_COLUMNS
 
 
# ./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page \
    -t dictionary/SYS_COLUMNS.sql \
    > dumps/default/SYS_COLUMNS \
    2> dumps/default/SYS_COLUMNS.sql
 
 
 
and SYS_FIELDS
 
# ./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000004.page \
    -t dictionary/SYS_FIELDS.sql \
    > dumps/default/SYS_FIELDS \
    2> dumps/default/SYS_FIELDS.sql
 
 
 
With the generated LOAD DATA INFILE commands it’s easy to load the dumps.
 
 
# cat dumps/default/*.sql | mysql sakila_recovered
 
 
Now we have the InnoDB dictionary loaded into normal InnoDB tables.
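A quick sanity check that the deleted dictionary records made it in; sakila tables serve as an example here (column names as defined in dictionary/SYS_TABLES.sql):

# mysql sakila_recovered -e "SELECT NAME, ID, N_COLS FROM SYS_TABLES WHERE NAME LIKE 'sakila/%'"

If the dropped table shows up in SYS_TABLES, sys_parser will be able to find it.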
 
Compiling sys_parser
sys_parser is a tool that reads the dictionary from tables stored in MySQL and generates a CREATE TABLE statement for a table.
 
To compile it we will need the MySQL libraries and development files. Depending on the distribution they may be in a -devel or -dev package. On a RedHat-based system you can find the right package with the command yum provides “*/mysql_config”. On my server it was the package mysql-community-devel.
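For example, the sequence on a RedHat-based box could look like this (the package name is whatever your repository provides; on Debian/Ubuntu it would be apt-get install libmysqlclient-dev instead):

# yum provides "*/mysql_config"
# yum install mysql-community-devel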
 
If all the necessary packages are installed, compilation boils down to a simple command:
 
 
# make sys_parser
/usr/bin/mysql_config
cc `mysql_config --cflags` `mysql_config --libs` -o sys_parser sys_parser.c
 
 
Recover Table Structure
Now sys_parser can do its magic; it prints the CREATE statement on standard output. Run it without arguments to see the usage:
 
 
# ./sys_parser
sys_parser [-h <host>] [-u <user>] [-p <passowrd>] [-d <db>] databases/table
 
 
We will use root as the username to connect to MySQL and qwerty as the password. The dictionary is stored in the SYS_* tables in database sakila_recovered. The table we want to recover is sakila.actor. InnoDB uses a slash ‘/’ as a separator between database name and table name, and so does sys_parser.
 
 
# ./sys_parser -u root -p qwerty  -d sakila_recovered sakila/actor
CREATE TABLE `actor`(
`actor_id` SMALLINT UNSIGNED NOT NULL,
`first_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
`last_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
`last_update` TIMESTAMP NOT NULL,
PRIMARY KEY (`actor_id`)
) ENGINE=InnoDB;
 
 
# ./sys_parser -u root -p qwerty  -d sakila_recovered sakila/customer
CREATE TABLE `customer`(
`customer_id` SMALLINT UNSIGNED NOT NULL,
`store_id` TINYINT UNSIGNED NOT NULL,
`first_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
`last_name` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL,
`email` VARCHAR(50) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci',
`address_id` SMALLINT UNSIGNED NOT NULL,
`active` TINYINT NOT NULL,
`create_date` DATETIME NOT NULL,
`last_update` TIMESTAMP NOT NULL,
PRIMARY KEY (`customer_id`)
) ENGINE=InnoDB;
 
 
There are a few caveats though.
 
InnoDB doesn’t store all the information you can find in the .frm file. For example, if a field is AUTO_INCREMENT, the InnoDB dictionary knows nothing about it, so sys_parser will not recover that property. If there were any field or table level comments, they’ll be lost.
sys_parser generates a table structure suitable for further data recovery. It could recover secondary indexes and foreign keys, but currently it does not.
InnoDB stores the DECIMAL type as a binary string, and it doesn’t store the precision of a DECIMAL field. So that information will be lost.
For example, table payment uses DECIMAL to store money.
 
 
 
# ./sys_parser -u root -p qwerty  -d sakila_recovered sakila/payment
CREATE TABLE `payment`(
        `payment_id` SMALLINT UNSIGNED NOT NULL,
        `customer_id` SMALLINT UNSIGNED NOT NULL,
        `staff_id` TINYINT UNSIGNED NOT NULL,
        `rental_id` INT,
        `amount` DECIMAL(6,0) NOT NULL,
        `payment_date` DATETIME NOT NULL,
        `last_update` TIMESTAMP NOT NULL,
        PRIMARY KEY (`payment_id`)
) ENGINE=InnoDB;
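If you know the original precision from the application, fix the generated statement before using it for data recovery. In stock sakila the column is DECIMAL(5,2), so assuming the recovered statement was saved to sakila/payment.sql (a hypothetical path), a quick fix could be:

# sed -i 's/`amount` DECIMAL(6,0)/`amount` DECIMAL(5,2)/' sakila/payment.sql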
 
 
Fortunately Oracle is planning to extend the InnoDB dictionary and finally get rid of .frm files. I salute that decision; having the structure in two places leads to inconsistencies.
 

 


MySQL UnDROP tool for InnoDB

The TwinDB data recovery toolkit is a set of tools that work with InnoDB tablespaces at a low level.
 
Incredible Performance of stream_parser
stream_parser is a tool that finds InnoDB pages in a stream of bytes. The stream can be a file, such as ibdata1 or a *.ibd file, or a raw partition.
stream_parser runs as many parallel workers as there are CPUs in the system. The performance of stream_parser is amazing! Compare how stream_parser outperforms page_parser on a four-CPU virtual machine running on my laptop:
 
 
 
# ./page_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
Opening file: /dev/mapper/vg_twindbdev-lv_root
...
Size to process:               19327352832 (18.000 GiB)
1.00% done. 2014-06-23 03:03:48 ETA(in 00:18 hours). Processing speed: 17570320 B/sec
2.00% done. 2014-06-23 03:05:27 ETA(in 00:19 hours). Processing speed: 16106127 B/sec
3.00% done. 2014-06-23 03:02:11 ETA(in 00:16 hours). Processing speed: 19327352 B/sec
4.00% done. 2014-06-23 03:03:48 ETA(in 00:17 hours). Processing speed: 17570320 B/sec
...
 
 
So, it takes almost 20 minutes to parse an 18G partition.
 
 
Let’s check stream_parser
 
 
# ./stream_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
 
...
Size to process:               19327352832 (18.000 GiB)
Worker(0): 1.91% done. 2014-06-23 02:51:41 ETA(in 00:00:56). Processing speed: 79.906 MiB/sec
Worker(2): 1.74% done. 2014-06-23 02:51:47 ETA(in 00:01:02). Processing speed: 72.000 MiB/sec
Worker(3): 3.30% done. 2014-06-23 02:51:15 ETA(in 00:00:30). Processing speed: 144.000 MiB/sec
Worker(1): 1.21% done. 2014-06-23 02:52:20 ETA(in 00:01:35). Processing speed: 47.906 MiB/sec
Worker(2): 5.38% done. 2014-06-23 02:51:11 ETA(in 00:00:25). Processing speed: 168.000 MiB/sec
Worker(3): 9.72% done. 2014-06-23 02:51:00 ETA(in 00:00:14). Processing speed: 296.000 MiB/sec
...
Worker(0): 88.91% done. 2014-06-23 02:52:06 ETA(in 00:00:02). Processing speed: 191.625 MiB/sec
Worker(0): 93.42% done. 2014-06-23 02:52:06 ETA(in 00:00:01). Processing speed: 207.644 MiB/sec
Worker(0): 97.40% done. 2014-06-23 02:52:06 ETA(in 00:00:00). Processing speed: 183.641 MiB/sec
All workers finished in 31 sec
 
 
So, 18 minutes versus 31 seconds. 34 times faster! Impressive, isn’t it?
 
c_parser Improvements
 
 
c_parser is a tool that reads one or more InnoDB pages, extracts records and stores them in tab-separated values dumps. An InnoDB page with user data doesn’t store information about the table structure, so you have to tell c_parser what fields you’re looking for. The command-line option -t specifies a file with a CREATE TABLE statement.
 
This is how it works. Here’s the CREATE statement (I took it from mysqldump):
 
# cat sakila/actor.sql
CREATE TABLE `actor` (
  `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;
And now let’s fetch records of table actor from InnoDB pages:
 
# ./c_parser -6f pages-actor.ibd/FIL_PAGE_INDEX/0000000000001828.page -t sakila/actor.sql
-- Page id: 3, Format: COMPACT, Records list: Valid, Expected records: (200 200)
000000005313    970000013C0110  actor   1       "PENELOPE"      "GUINESS"       "2006-02-15 04:34:33"
000000005313    970000013C011B  actor   2       "NICK"  "WAHLBERG"      "2006-02-15 04:34:33"
000000005313    970000013C0126  actor   3       "ED"    "CHASE""2006-02-15 04:34:33"
...
000000005313    970000013C09D8  actor   199     "JULIA""FAWCETT"       "2006-02-15 04:34:33"
000000005313    970000013C09E4  actor   200     "THORA""TEMPLE"        "2006-02-15 04:34:33"
 
 
-- Page id: 3, Found records: 200, Lost records: NO, Leaf page: YES
 
 
MySQL 5.6 introduced a few format changes. Most of them were already supported; on top of that, c_parser fixes some bugs in the processing of temporal fields.
 
The new UnDROP tool for InnoDB is still no reason not to take backups :-), but at least you can be better armed if the inevitable happens.
 
How to Recover Table Structure
MySQL stores the table structure in a respective .frm file. When the table is dropped, the .frm file is gone. Fortunately InnoDB stores a copy of the structure in the dictionary. sys_parser is a tool that can read the dictionary and generate a CREATE TABLE statement. Check how you can Recover Table Structure From InnoDB Dictionary.
 
How to Install TwinDB Data Recovery Toolkit
Check out the source code from LaunchPad:
 
# bzr branch lp:undrop-for-innodb
Branched 33 revisions.
Or you can download an archive with the latest revision from download page.
 
Compile the source code, but first install the dependencies: make, gcc, flex, bison.
 
[root@twindb-dev undrop-for-innodb]# make
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c stream_parser.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include  -pthread -lm  stream_parser.o -o stream_parser
flex  sql_parser.l
bison  -o sql_parser.c sql_parser.y
sql_parser.y: conflicts: 6 shift/reduce
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c sql_parser.c
lex.yy.c:3078: warning: ‘yyunput’ defined but not used
lex.yy.c:3119: warning: ‘input’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c c_parser.c
./include/ctype-latin1.c:359: warning: ‘my_mb_wc_latin1’ defined but not used
./include/ctype-latin1.c:372: warning: ‘my_wc_mb_latin1’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c tables_dict.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c print_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c check_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include  sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser -pthread -lm
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include -o innochecksum_changer innochecksum.c
[root@twindb-dev undrop-for-innodb]#
UPDATE:
 
The toolkit is tested on the following systems:
 
CentOS release 5.10 (Final) x86_64
CentOS release 6.5 (Final) x86_64
CentOS Linux release 7.0.1406 (Core) x86_64
Fedora release 20 (Heisenbug) x86_64
Ubuntu 10.04.4 LTS (lucid) x86_64
Ubuntu 12.04.4 LTS (precise) x86_64
Ubuntu 14.04 LTS (trusty) x86_64
Debian GNU/Linux 7.5 (wheezy) x86_64
32-bit operating systems are not supported
 
 
 
 
Indeed, an InnoDB index doesn’t carry information about the table structure. MySQL keeps the structure in .frm files and InnoDB stores the structure in the dictionary. When the table structure isn’t available from an external source (an old backup, an installation script, etc.), the possible ways to recover the structure are:
 
1) Recover from .frm files. There are some tools available. I prefer to create a dummy table, replace the .frm file and run SHOW CREATE TABLE (see the sketch after this list). This option however is useless when a DROP TABLE happens, because MySQL deletes the .frm file as well.
 
2) Recover the structure from the InnoDB dictionary. InnoDB stores almost all the necessary information about the table structure in the dictionary. When a user runs DROP TABLE, the respective records are deleted from the dictionary tables, so when recovering the dictionary tables you need to specify the -D option to c_parser (-D recovers records that are marked as deleted). The tables you need are SYS_TABLES, SYS_INDEXES, SYS_FIELDS and SYS_COLUMNS. Then load everything into a live instance of MySQL. The sys_parser tool from the toolkit reads the SYS_* tables from MySQL and generates the CREATE TABLE statement.
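Here is a rough sketch of the dummy-table trick from option 1 (the paths and the test schema are illustrative; MySQL may complain about a structure mismatch, but SHOW CREATE TABLE usually still reveals the definition):

# mysql -e "CREATE TABLE test.t1 (id int) ENGINE=InnoDB"
# /etc/init.d/mysql stop
# cp /path/to/backup/actor.frm /var/lib/mysql/test/t1.frm
# /etc/init.d/mysql start
# mysql -e "SHOW CREATE TABLE test.t1\G"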
 
 
 

Take image from corrupted hard drive

There are at least two cases when it makes sense to take an image from a corrupted hard drive as soon as possible: disk hardware errors and a corrupted filesystem. A faulty hard drive can give you just one chance to read a block, so there is no time for experiments. The picture is similar with corrupted filesystems: obviously something went wrong, and it’s hard to predict how the operating system will behave next second and whether it will cause even more damage.
 
Save disk image to local storage
Probably the best and fastest way is to plug the faulty disk into a healthy server and save the disk image locally:
 
# dd if=/dev/sdb of=/path/on/sda/faulty_disk.img  conv=noerror
Where /dev/sdb is the faulty disk and faulty_disk.img is the image on the healthy /dev/sda disk.
 
conv=noerror tells dd to continue reading even if the read() call exits with an error. Thus dd will skip bad areas and dump as much information from the disk as possible.
 
By default dd reads 512 bytes at a time, and that is a good value. Reading larger blocks would be faster, but a larger block will fail even if only a small portion of it is unreadable. An InnoDB page is 16k, so dd reads one page in 32 operations. It’s possible to extract information even if a page is partially corrupt. So, reading in 512-byte blocks seems optimal unless somebody convinces me otherwise.
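One refinement worth considering: adding sync to the conv option pads each failed 512-byte read with zeroes instead of silently shrinking the output, so all subsequent pages keep their correct offsets in the image. A sketch:

# dd if=/dev/sdb of=/path/on/sda/faulty_disk.img bs=512 conv=noerror,sync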
 
Save disk image to remote storage
If the faulty disk can’t be unplugged, the best (if not the only) way is to save the disk image on remote storage.
 
Netcat is an excellent tool for this purpose.
 
Start a server on the destination side:
 
# nc -l 1234 > faulty_disk.img
On the server with the faulty disk, take a dump and stream it over the network:
 
# dd if=/dev/sdb of=/dev/stdout  conv=noerror | nc a.b.c.d 1234
a.b.c.d is the IP address of the destination server.
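If the network is slow, the stream compresses well (unreadable areas come out as zeroes), so it can pay off to pipe it through gzip; a sketch, with the first command on the destination and the second on the source:

# nc -l 1234 | gunzip > faulty_disk.img
# dd if=/dev/sdb conv=noerror,sync | gzip -1 | nc a.b.c.d 1234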
 
Why dd is better for MySQL data recovery
There are a bunch of good file recovery and file undelete tools. However, they serve a slightly different purpose: in short, they try to reconstruct a file. They care about the file system.
 
For MySQL data recovery we don’t need files, we need data. An InnoDB page can be recognized by a short signature at the beginning of the page. At fixed places in every index page there are two internal records, infimum and supremum:
 
00000000  3f ff 6f 3d 00 00 11 e0  ff ff ff ff 00 00 11 e8  |?.o=............|
00000010  00 00 00 00 14 8f 8f 57  45 bf 00 00 00 00 00 00  |.......WE.......|
00000020  00 00 00 00 00 00 00 17  3b 58 00 af 1d 95 1d fb  |........;X......|
00000030  1d 3e 00 02 00 03 00 5a  00 00 00 00 00 00 00 00  |.>.....Z........|
00000040  00 00 00 00 00 00 00 00  00 01 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 08 01  |................|
00000060  00 00 03 00 8d 69 6e 66  69 6d 75 6d 00 09 03 00  |.....infimum....|
00000070  08 03 00 00 73 75 70 72  65 6d 75 6d 00 38 b4 34  |....supremum.8.4|
00000080  30 28 24 20 18 11 0b 00  00 10 15 00 d5 53 59 53  |0($ .........SYS|
00000090  5f 46 4f 52 45 49 47 4e  00 00 00 00 03 00 80 00  |_FOREIGN........|
If the header is good, then we know which table the page belongs to, how many records to expect, etc. Even if the rest of the page is heavily corrupted it’s possible to extract all the surviving records.
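The infimum signature also gives a quick way to estimate how much InnoDB data survived on an image before running the full toolkit; counting its occurrences is a rough index-page count (a sketch):

# strings -t d faulty_disk.img | grep -c infimum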
 
I had several cases where dd excelled.
Story #1.
 
It was a dying hard drive; InnoDB crashed all the time. When the customer figured out that the problem was with the disk, they tried to copy the MySQL files, but a simple copy failed. The customer then tried to read the files with some file recovery tool.
 
MySQL refused to start and reported checksum mismatches in the error log.
 
The customer provided the recovered files. The size of the ibdata1 file was reasonable, but stream_parser found only ~20MB of pages. ibdata1 was almost empty inside: just zeroes where the data should be. I doubt that even 40% of the data was recovered.
 
Then we took a dump of the disk and recovered the InnoDB tables from the image. First of all, ~200MB of pages were found. Many tables were 100% recovered, and around 80-90% of records were fetched from the corrupted tables.
 
Story #2.
 
A customer had dropped an InnoDB database. MySQL was running with innodb_file_per_table=ON, so the tables were in .ibd files that were deleted. It was a Windows server, and the customer used some tool to undelete the .ibd files from the NTFS filesystem. The tool restored the files, but the .ibd files were almost empty inside. The recovery rate was close to 20%.
 
Recovery from a disk dump gave around 70-80% of records.


RMAN-6026 RMAN-6023 During RESTORE Operation

Problem Description
-------------------
 
You are attempting to restore a database with Oracle Recovery
Manager (RMAN), using a 'set until time' parameter to do a point-in-time
recovery:
 
   run {
   set until time = '09-JUN-2000:10:30:00';
   allocate channel x type disk;
   restore database;
   recover database;
   }
 
However, this command fails with the following error stack:
 
   RMAN-03002: failure during compilation of command
   RMAN-03013: command type: restore
   RMAN-03002: failure during compilation of command
   RMAN-03013: command type: IRESTORE
   RMAN-06026: some targets not found - aborting restore
   RMAN-06023: no backup or copy of datafile 7 found to restore
   RMAN-06023: no backup or copy of datafile 6 found to restore
   RMAN-06023: no backup or copy of datafile 5 found to restore
   RMAN-06023: no backup or copy of datafile 4 found to restore
   RMAN-06023: no backup or copy of datafile 3 found to restore
   RMAN-06023: no backup or copy of datafile 2 found to restore
   RMAN-06023: no backup or copy of datafile 1 found to restore
   ...
 
A 'list backupset of database' command shows there to be multiple backups
of these files available.
 
 
Solution Description
--------------------
 
You have issued a 'resetlogs' after the last backup but before the time
given in the 'Until Time' clause of the RMAN script.
 
For instance:
- the last BACKUP of the database was taken on June 8, 2000.
- you opened the database with RESETLOGS on June 9, at 9:08 AM.
- then, due to complications, you decided to restore the database to a point in time on June 9, 10:30 AM.
 
Because you cannot roll forward through the resetlogs, RMAN cannot find any legitimate 
backups to restore from within this incarnation.
 
 
o The solution is to reset the database incarnation to the previous incarnation and
  set the 'until time' clause to a time before the resetlogs.
 
  
 
Explanation
-----------
 
You need to check the incarnation of the database:
 
rman>list incarnation of database;
 
RMAN-03022: compiling command: list
 
List of Database Incarnations
DB Key  Inc Key DB Name                 DB ID            CUR Reset SCN  Reset Time
------- ------- ------------------------------ ---------------- --- ---------- ----------
1       2         <backup name>    4094805351       NO  159907     28-apr-2000:10:24:43
1       461     <backup name>    4094805351       NO  220532     09-jun-2000:08:22:08
1       521     <backup name>    4094805351       YES 220693     09-jun-2000:09:08:20
 
If the current incarnation reset time falls between the last backup and
the time specified for 'set until time', then the recovery catalog acknowledges
that there are no backups that match the time criteria specified, and errors
out with RMAN-6023.
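Putting the two steps together, with the Inc Key values from the listing above (a sketch; run it while connected to the recovery catalog, and pick a time before the 09:08:20 resetlogs):

   reset database to incarnation 461;
   run {
   set until time = '09-JUN-2000:09:00:00';
   allocate channel x type disk;
   restore database;
   recover database;
   }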
 
 
Search Words
------------
 
BACKUP, COPY, RECOVER, TARGETS

Common Causes for RMAN-06023 and RMAN-06026

PURPOSE
This document describes known root causes for the error:
 
RMAN-06023: no backup or copy of datafile %s found to restore
RMAN-6023: no backup or copy of datafile %s found to restore
 
 
TROUBLESHOOTING STEPS
1) General description
 
RMAN-06023 "no backup or copy of datafile %d found to restore"
// *Cause: A datafile, tablespace, or database restore could not proceed
// because no backup or copy of the indicated file was found.
// It may be the case that a backup or copy of this file exists but
// does not satisfy the criteria specified in the user's restore
// operands.
 
The error RMAN-6023 means that RMAN cannot find a backup for that datafile in its repository. The RMAN repository is ALWAYS in the controlfile, but might be in an RMAN catalog database as well. So a good starting point for diagnosing the issue is the LIST BACKUP output.
Example :
RMAN> list backup of datafile 1;
 
--OR--
 
RMAN> list backup of archivelog sequence;
 
The backup needs to be marked AVAILABLE and there needs to be a channel allocated for the 'Device Type' reported in the 'LIST BACKUP' output.
Example :
BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
4       Full    537.90M    DISK        00:00:25     17-JUN-2011 17:12:42
BP Key: 4 Status: AVAILABLE Compressed: NO Tag: <TAG_NAME>
Piece Name: /<DIR>/<DB_HOME>/dbs/<PIECE_NAME>
List of Datafiles in backup set 4
File LV Type Ckp SCN    Ckp Time             Name
---- -- ---- ---------- -------------------- ----
1       Full 975048     17-JUN-2011 17:11:39 /<DIR>/<DB_NAME>/datafile/<FILE_NAME>.dbf
 
However, if the above matches, then it might be one of the issues below that is causing the problem.
 
 
2) Backups available on disk / tape but not in the RMAN repository
It might be that RMAN cannot find any backup to restore from and none are shown in the 'LIST BACKUP' output,
but the backups are available on disk or tape.
In that case the backups have been removed from the RMAN repository (controlfile and/or catalog), but are still available on disk or tape.
There are configurations where this is intended behaviour.
 
Then the backups need to be cataloged again using the CATALOG command.
Backups can only be cataloged in 10g and later versions.
 
Example :
 
RMAN> catalog start with '<directory where the backups are>';
 
Afterwards the backups should be shown again in the 'LIST BACKUP' output
 
 
3) UNTIL TIME conversion
When SET UNTIL TIME is used, RMAN converts it to an UNTIL SCN. This is an estimate, as there is NO hard relation between a timestamp and an SCN. Especially when a timestamp is used which is close to the end time of the backup, this might be an issue. If the conversion generates an SCN which is BEFORE the end fuzziness of the datafiles in the backup, then the backup can NOT be used.
 
Example :
A backup starts at T1 (SCN=1000) and ends at T2 (SCN=1050); the backup can ONLY be used if the UNTIL SCN is 1050 or higher.
So if 'UNTIL TIME T2' is converted to SCN 1045, then this backup will NOT be used.
 
V$BACKUP_DATAFILE / RC_BACKUP_DATAFILE give more info on this.
CHECKPOINT_CHANGE# corresponds with T1.
ABSOLUTE_FUZZY_CHANGE# corresponds with T2. When ABSOLUTE_FUZZY_CHANGE# is NULL, it is the same as the CHECKPOINT_CHANGE#.
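A sketch of the check, using datafile 1 as an example:

SQL> select file#, checkpoint_change#, absolute_fuzzy_change#
     from v$backup_datafile
     where file# = 1
     order by checkpoint_change#;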
 
There is a known RMAN issue with an incorrect UNTIL TIME conversion due to skipping the TIME-part.
Bug 9128954 RMAN IS SELECTING WRONG BACKUP WITH 'SET UNTIL'
 
 
4) Inactive thread
A RAC database, but also a single-instance database, can have multiple threads enabled. Each thread will have its own set of redolog files and will archive them.
If a thread (instance) is idle, or has not been started for some time, then an RMAN DUPLICATE could fail on it, as it is looking for datafiles or archived redologs from the inactive instance which will not be there anymore.
 
In addition, RAC behaves differently from 11g onwards: if a user executes 'ALTER SYSTEM ARCHIVE LOG CURRENT' on node#1 while node#2 is stopped, the archivelog of thread#2 is NOT archived.
On 10g it is archived even if node#2 is stopped.
 
There is a known issue related to inactive/disabled threads, as handled in
    Bug 9044053 RMAN DUPLICATE CAUSES RMAN-06457 WHEN USING 'UNTIL SCN' UNTIL STARTUP NODE#2.
 
Best practice is to drop the threads which are not used at all anymore.
 
SQL> select thread#, status, enabled, instance
     from v$thread;
 
     select group#, thread# from v$log;
 
     alter database disable instance '<name>';
     alter database drop logfile group <group#>;
 
 
5) Incarnation issues
5a)  New incarnation added due to implicit resync
This issue is only relevant if a Flash | Fast Recovery Area (FRA) is being used.
 
If one or more restore and recovery attempts have been done for this database and the database has been opened with RESETLOGS,
then there might be archived redologs generated for this new incarnation of the database.
 
During the RMAN recovery phase, RMAN will do a catalog of all the files in the FRA, and will catalog the new archived redologs as well.
As they belong to another incarnation, the incarnation will be added (if not there) and will be marked as CURRENT.
The recovery will then look for archived redologs of a different incarnation than intended, as the CURRENT incarnation belongs to a prior RESETLOGS operation.
 
The best option is to remove from the FRA all the old files, e.g. flashback logs, archivelogs, backupsets, datafiles etc.,
belonging to an incarnation of a prior attempt.
 
 
Note 965122.1 RMAN RESTORE FAILS WITH RMAN-06023 BUT THERE ARE BACKUPS AVAILABLE
5b) Incarnations have the same RESETLOG_CHANGE#
There is an RMAN issue which causes different incarnations to have the same RESETLOGS_CHANGE#. So there are multiple records in V$DATABASE_INCARNATION / RC_DATABASE_INCARNATION
having the same RESETLOGS_CHANGE#. RMAN will lose track of which incarnation to use and might use an incorrect incarnation, resulting in unexpected errors.
 
Bug 5844752 RESTORE FAILS - CURRENT INCARNATION RESETLOGS SCN SAME AS PARENT INCARNATION
Note 727655.1 Despite Available Backups, Restore Fails with RMAN-03002:ORA-01180:Can Not Create Datafile 1
5c) Restoring from an none-current incarnation
The symptoms of this issue are closely related to the above issue (5b), but this time it is because the backups really belong to a different incarnation than the current one.
Possible errors are :
 
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 1 found to restore
 
OR
 
RMAN-06026: some targets not found - aborting restore
RMAN-06100: no channel to restore a backup or copy of datafile 1
 
OR
 
ORA-01180: can not create datafile 1
ORA-01110: data file 1: '+DATA/<DB_NAME>/<PATH>/<FILE_NAME>'
 
Check for more details :
RMAN RESTORE fails with RMAN-06023 or ORA-19505 or RMAN-06100 inspite of proper backups
Note 112248.1 RMAN-6026 RMAN-6023 During RESTORE Operation
Document 2038119.1 Resolving RMAN-06023 or RMAN-06025
 
 
6) Archivelog backup is missing
This issue is likely to occur during an RMAN duplicate.
An RMAN DUPLICATE will use an UNTIL SCN recovery on the auxiliary instance to recover the database.
The end point of the recovery is specified by the UNTIL SCN, which is derived from the last archived redolog on the TARGET.
If this archivelog is NOT backed up, then the recovery on the AUXILIARY, and therefore the DUPLICATE, will fail on it.
 
There are 2 solutions for this:
1. Make an explicit archivelog backup before you start the RMAN DUPLICATE. NOTE: this will only help if there are no additional archives created after the BACKUP and before the DUPLICATE starts.
2. Specify an explicit UNTIL clause, like UNTIL SEQUENCE (see the sketch after the query below). The following query might be useful in that case:
SQL> select thread#, max(sequence#) + 1 seq#, to_char(max(first_time), 'dd-mon-yyyy hh24:mi:ss') first_time
     from v$archived_log
     where backup_count >= 1
     group by thread#;
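The thread# and seq# returned can then be plugged into an explicit UNTIL SEQUENCE clause for the duplicate, e.g. (a sketch; the values and the auxiliary database name are placeholders):

run {
set until sequence <seq#> thread <thread#>;
duplicate target database to <aux_db>;
}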
 
RMAN RESTORE fails with RMAN-06023 or ORA-19505 or RMAN-06100 inspite of proper backups
 
 
7) Corrupted or missing backups on disk
RMAN automatically fails over to another backup if there is an issue with the backup piece during the restore.
Related errors might be :
 
ORA-19870: error reading backup piece
ORA-19587 error occurred reading %s bytes at block
ORA-19505: failed to identify file
 
If an older backup is found in the repository, then RMAN will continue the restore, but it will most likely require more (older) archived redologs during the recovery.
Especially when the restore is done on another host and not ALL the backups are accessible on that host, it may end up in a situation where RMAN tries to CREATE the datafile(s).
This is really an issue when it involves datafile 1, as that can NEVER be created: it is only created during a CREATE DATABASE.
 
Example :
creating datafile fno=1 name=+DATA/<DB_NAME>/<DIR>/<FILE_NAME>
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/28/2010 20:12:34
ORA-01180: can not create datafile 1
ORA-01110: data file 1: '+DATA/<DB_NAME>/<DIR>/<FILE_NAME>'
 
So double-check why the initial restore failed and resolve that issue, as the errors ORA-1180 or RMAN-6023 are just a result of the initial errors.
 
Note 429689.1 RMAN Restore Fails: ORA-19870 ORA-19587 ORA-27091 ORA-27067 RMAN-06026 RMAN-06023
Note 1271551.1 RMAN duplicate failing with ora-19870 ora-19612 then RMAN-06023
Note 1300586.1 RMAN-6026 RMAN-6023 when restoring to new host
 
 
8) Never backed up
8a) No backup
From 10g onwards, if RMAN starts a restore and uses a backup from before the CREATION_SCN of a datafile, then RMAN will automatically create the datafile.
This is the case when a datafile was added after the backup.
 
However, through 11g Release 1 this is still an issue during an RMAN DUPLICATE, which will then fail.
 
Note 782317.1 Rman-06023 encountered during duplicate to point in time after datafile was added
Note 135630.1 RMAN-6026 RMAN-6023 Restoring Database
Note 779558.1 Cannot restore incremental backups using tag when datafile has been added
 
The error can also occur if there is no valid backup available for the specific point in time.
8b) Plugged in Tablespace
The datafile is a plugged-in datafile, and there have been NO backups taken after the plug-in operation.
So you need to plug in the datafile again from its source.
 
Note 1453090.1 RMAN-06023 during restore of a plugged in datafile
 
 
 
 
 
9) Backup pieces are read-only
       The backup pieces are read-only at the operating system level.
       Make the RMAN backup pieces not only readable but also writable. See Bug 5412531 for other details.
 
       Fixed Version: 11gR1
 
       Applicable only for the IBM AIX on POWER Systems operating system.
 
 
 
10) Known Defects
For known issues reference bugs in the next articles:
 
Doc ID 48182.1 OERR: RMAN-6023 "no backup or copy of datafile %d found to restore"
 
Doc ID 48185.1 OERR: RMAN-6026 "some targets not found - aborting restore"
 

RMAN recover database fails RMAN-6025 - v$archived_log.next_change# is 281474976710655

SYMPTOMS
RMAN database recover failing with the following errors:
 
RMAN-06025: no backup of archived log for thread number with sequence number and starting SCN of string found to restore
Cause: An archived log restore could not proceed because no backup of the indicated archived log was found. It may be the case that a backup of this file exists but does not satisfy the criteria specified in the user's restore operands.
Action: None - this is an informational message. See message 6026 for further details.
 
 
 
 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 05/17/2010 10:22:51
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of log thread 1 seq 2668 lowscn 1079975830 found to restore
 
The archivelog being requested is very old compared to the current log sequence.
Inspection of V$ARCHIVED_LOG shows:
 
CREATOR REGISTR  SEQ# FIRST_CHANGE# NEXT_CHANGE#        NAME
------- ------- -------------------- -------------------- -------------------- --------------------
RMAN     RMAN     2668     1079861514      281474976710655   /<path>/onlinelog/group_3.4508.718549667
To confirm if you have hit the same problem run this query against the controlfile :
 
SQL> select thread#, sequence#, creator, registrar, archived,
     to_char(first_change#), to_char(next_change#), name
     from v$archived_log
     where archived='NO';
 If you are using a catalog and the above query returns no rows then check the catalog:
 
 
SQL>select * from rc_database;
== note the dbinc_key of your target
 
SQL> select thread#, sequence#, creator, archived,
     to_char(first_change#), to_char(next_change#), name
     from rc_archived_log
     where archived='NO' and dbinc_key=<your dbinc_key>;
CAUSE
V$ARCHIVED_LOG or RC_ARCHIVED_LOG contains entries for online redo log files.
Online redo logs are temporarily cataloged by RMAN as 'archived logs' during
FULL media recovery; they are removed from the AL table when media
recovery completes successfully. One of the online redo logs will be 'current'
at the time so the SCN range for this log when cataloged will be low_SCN to 281474976710655 (FFFFFFFFFFFF(hex)).  When media recovery completes, these online entries in v$archived_log/rc_archived_log are deleted automatically by RMAN.  If media recovery fails and recovery is completed via SQLPlus, these entries in AL table are not removed.
 
During any subsequent recovery exercise, if the start SCN for recovery is
greater than the low_SCN of any of the cataloged online redo logs, the one with
an infinite next_SCN value will always be chosen as it will always fall within
the SCN range calculated for recovery - but this 'archived log' does not really exist so RMAN fails.
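The magic NEXT_CHANGE# is easy to verify: 281474976710655 is 2^48 - 1, i.e. FFFFFFFFFFFF in hex:

SQL> select to_char(281474976710655, 'XXXXXXXXXXXX') from dual;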
 
SOLUTION
There is no simple solution if a catalog database is not used.
 
Register the target in a catalog first and then proceed as shown below.
 
Take a backup of the recovery catalog BEFORE deleting the following rows:
 
SQL> select * from rc_database;
== Note the dbinc_key for your target database
 
SQL> delete from al
     where dbinc_key = <dbinc_key>
     and archived = 'N';
SQL> commit;
 

RMAN Command "RESTORE ARCHIVELOG ALL VALIDATE" Failing with RMAN-06025

SYMPTOMS
RMAN command 'RESTORE ARCHIVELOG ALL VALIDATE' failing with error:
 
RMAN-06025: no backup of archived log for thread number with sequence number and starting SCN of string found to restore
Cause: An archived log restore could not proceed because no backup of the indicated archived log was found. It may be the case that a backup of this file exists but does not satisfy the criteria specified in the user's restore operands.
Action: None - this is an informational message. See message 6026 for further details.
 
 
 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 01/13/2012 11:38:39
RMAN-06026: some targets not found - aborting restore
RMAN-06025: no backup of log thread 1 seq 1 lowscn 1164241 found to restore
RMAN-06025: no backup of log thread 1 seq 58 lowscn 1164240 found to restore
RMAN-06025: no backup of log thread 1 seq 57 lowscn 1164238 found to restore
 
 
CAUSE
- The issue occurs when no catalog database is used, or no catalog connection is made.
 
- The "ALL" keyword in the "RESTORE ARCHIVELOG ALL VALIDATE" statement does not take the backup retention policy into account but tries to access all archived redo logs referenced in the RMAN repository.
 
 
 
 
RMAN> RESTORE ARCHIVELOG ALL VALIDATE;
 
Starting restore at 13-JAN-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=32 devtype=DISK
 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 01/13/2012 11:38:39
RMAN-06026: some targets not found - aborting restore
RMAN-06025: no backup of log thread 1 seq 1 lowscn 1164241 found to restore
RMAN-06025: no backup of log thread 1 seq 58 lowscn 1164240 found to restore
RMAN-06025: no backup of log thread 1 seq 57 lowscn 1164238 found to restore
RMAN-06025: no backup of log thread 1 seq 56 lowscn 1162285 found to restore
RMAN-06025: no backup of log thread 1 seq 55 lowscn 1162276 found to restore
RMAN-06025: no backup of log thread 1 seq 54 lowscn 1162274 found to restore
......
RMAN-06025: no backup of log thread 1 seq 3 lowscn 360493 found to restore
RMAN-06025: no backup of log thread 1 seq 2 lowscn 360490 found to restore
RMAN-06025: no backup of log thread 1 seq 1 lowscn 349389 found to restore
RMAN-06025: no backup of log thread 1 seq 21 lowscn 349388 found to restore
RMAN-06025: no backup of log thread 1 seq 20 lowscn 349382 found to restore
RMAN-06025: no backup of
RMAN>
 
 
 
SOLUTION
Option 1: Using only the controlfile, no catalog database used:
 
Use the syntax below from the RMAN command prompt for validating archivelog backups.
 
 
RMAN> restore archivelog from time='SYSDATE-<recovery window days>' validate;
 
Suppose you have set a recovery window of 7 days; then use the command below.
 
 
RMAN> show RETENTION POLICY;
 
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS;
 
RMAN> restore archivelog from time='SYSDATE-7' validate;
 
Starting restore at 13-JAN-12
using channel ORA_DISK_1
 
channel ORA_DISK_1: starting validation of archive log backupset
channel ORA_DISK_1: reading from backup piece <path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T111853_7JZKG717_.BKP
channel ORA_DISK_1: restored backup piece 1
piece handle=<path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T111853_7JZKG717_.BKP tag=TAG20120113T111853
channel ORA_DISK_1: validation complete, elapsed time: 00:00:02
channel ORA_DISK_1: starting validation of archive log backupset
channel ORA_DISK_1: reading from backup piece <path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T112054_7JZKL018_.BKP
channel ORA_DISK_1: restored backup piece 1
piece handle=<path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T112054_7JZKL018_.BKP tag=TAG20120113T112054
channel ORA_DISK_1: validation complete, elapsed time: 00:00:03
Finished restore at 13-JAN-12
 
RMAN>
 
 
Option 2: If you have a recovery catalog configured, connect to the target database and the recovery catalog, and "RESTORE ARCHIVELOG ALL VALIDATE;" works without errors.
 
 
 
rman target / catalog <username>/<password>@<catalog_tns>
 
Recovery Manager: Release 10.2.0.4.0 - Production on Fri Jan 13 11:37:11 2012
 
Copyright (c) 1982, 2007, Oracle. All rights reserved.
 
connected to target database: <dbname> (DBID=<dbid>)
connected to recovery catalog database
 
RMAN> RESTORE ARCHIVELOG ALL VALIDATE;
 
Starting restore at 13-JAN-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=35 devtype=DISK
 
channel ORA_DISK_1: starting validation of archive log backupset
channel ORA_DISK_1: reading from backup piece <path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T111853_7JZKG717_.BKP
channel ORA_DISK_1: restored backup piece 1
piece handle=<path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T111853_7JZKG717_.BKP tag=TAG20120113T111853
channel ORA_DISK_1: validation complete, elapsed time: 00:00:02
channel ORA_DISK_1: starting validation of archive log backupset
channel ORA_DISK_1: reading from backup piece <path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T112054_7JZKL018_.BKP
channel ORA_DISK_1: restored backup piece 1
piece handle=<path>\BACKUPSET\2012_01_13\O1_MF_ANNNN_TAG20120113T112054_7JZKL018_.BKP tag=TAG20120113T112054
channel ORA_DISK_1: validation complete, elapsed time: 00:00:03
Finished restore at 13-JAN-12
 
RMAN>

RMAN-06054 While performing Duplicate

SYMPTOMS
 
 
RMAN duplicate fails with error: 
 
RMAN-06054 media recovery requesting unknown archived log for thread 1 with sequence XXXX and starting SCN of XXXXXX
 
 
RMAN-06054: media recovery requesting unknown archived log for thread string with sequence string and starting SCN of string
Cause: Media recovery is requesting a log whose existence is not recorded in the recovery catalog or target database control file.
Action: If a copy of the log is available, then add it to the recovery catalog and/or control file via a CATALOG command and then retry the RECOVER command. If not, then a point-in-time recovery up to the missing log is the only alternative and database can be opened using ALTER DATABASE OPEN RESETLOGS command.
 
 
 
CHANGES
 
 
CAUSE
The duplicate process takes the SCN (for UNTIL SCN) from the most recent backup controlfile checkpoint_change#. If the SCN of the controlfile
in the backup is higher than the SCN of the most recent archived log in the backup, then RMAN recovery will be looking for archives to recover
the database as per the UNTIL SCN condition, and it fails.
 
Until 11.2.0.3, it was using the highest SCN of the archived logs from the backup.
 
 
Bug 21868720 - RMAN-06054 ON BACKUP BASED DUPLICATE"
 
 
SOLUTION
* Ensure the controlfile checkpoint SCN is less than the last archivelog SCN in the backup (see the sketch below).
 
* While taking a backup for duplicate purposes, disable CONTROLFILE AUTOBACKUP and perform a complete database backup; this will make sure the controlfile is backed up in the middle of the backup.
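A rough way to compare the two SCNs on the target before taking the backup (a sketch; the first query shows the current controlfile checkpoint SCN, the second the highest SCN covered by backed-up archivelogs):

SQL> select checkpoint_change# from v$database;

SQL> select max(next_change#) from v$archived_log where backup_count >= 1;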
 
Option 1:
 
=======
 
Apply patch 21868720
 
 
 
Option 2:
 
======
 
Workaround : Connect to catalog while doing duplicate.
 
 
 
Option 3:
======
 
While doing the backup, if we disable controlfile autobackup, the controlfile backup will not be taken after the archivelog backup. Thus the archivelog SCN will be higher than the controlfile SCN.
 
1.  Turn off controlfile autobackup:
 
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP OFF;
2.  Perform database/archivelog backup:
 
run
{
allocate channel a1 device type disk format '/<path>/<datafilename>/%U';
allocate channel a2 device type disk format '/<path>/<datafilename>/%U';
allocate channel a3 device type disk format '/<path>/<datafilename>/%U';
allocate channel a4 device type disk format '/<path>/<datafilename>/%U';
allocate channel a5 device type disk format '/<path>/<datafilename>/%U';
allocate channel a6 device type disk format '/<path>/<datafilename>/%U';
backup as compressed backupset database plus archivelog;
}
3.  With the above backup, perform the duplicate.  
 
Option 4:
======
 
1.  While performing the duplicate, use a SET UNTIL clause, i.e. "SET UNTIL SCN" or "SET UNTIL SEQUENCE" explicitly. For example:
 
rman auxiliary /
run
{
 
set until SCN 1234455;
allocate auxiliary channel a1 device type disk;
allocate auxiliary channel a2 device type disk;
allocate auxiliary channel a3 device type disk;
allocate auxiliary channel a4 device type disk;
DUPLICATE DATABASE TO dev1 BACKUP LOCATION '/<path>/<datafilename>/';
}
 
Option 5:
=======
 
If we don't want to re-execute the duplicate command, complete the process manually using information in the following note:  
 
Manual Completion of a Failed RMAN Duplicate (Note 360962.1)
 
NOTE:  You may need to manually make more archivelog files available to satisfy the UNTIL SCN condition.

Can't Recover a Database Saved With the OEM Backup Facility RMAN-06169

APPLIES TO:
Oracle Database - Enterprise Edition - Version 8.1.6.0 to 10.1.0.2 [Release 8.1.6 to 10.1]
Information in this document applies to any platform.
SYMPTOMS
RMAN restore fails with RMAN-6169
RMAN-06169: could not read file header for datafile 1 error reason 15
 
CHANGES
 
 
CAUSE
 From kcv.c
/* if the controlfile is not a backup then the controlfile checkpoint count
** stored in the file header should be less than or equal the one in the
** controlfile. If it is not, then controlfile is an old restored image
** copy */
if (KCCFHX(&hx->kcvhxft.kccftbch)->kccfhxfh.kccfhtyp != KCCTYPBC
&& fhp->kcvfhccc > fe.kccfecpc)
{ /* wrong checkpoint count */
hx->kcvhxerr = KCVHXCPC;
goto got_header;
}
 
#define KCVHXCPC 15 /* wrong checkpoint count */
 
An incorrect version of the controlfile was used:
the controlfile was CURRENT but from an older COLD backup
than the restored datafiles.
 
SOLUTION
Restore a more recent controlfile.

Errors on rman blockrecover attempt RMAN-06026, RMAN-06023

APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.3 and later
Information in this document applies to any platform.
***Checked for relevance on 09-Feb-2011***
 
 
RMAN-06026: some targets not found - aborting restore
Cause: Some of the files specified for restore could not be found. Message 6023, 6024, or 6025 is also issued to indicate which files could not be found. Some common reasons why a file can not be restored are that there is no backup or copy of the file that is known to recovery manager, or there are no backups or copies that fall within the criteria specified on the RESTORE command, or some datafile copies have been made but not cataloged.
Action: The Recovery Manager LIST command can be used to display the backups and copies that Recovery Manager knows about. Select the files to be restored from that list.
 
 
 
 
SYMPTOMS
========
For the purposes of this document, the following fictitious environment is used as an example to describe the procedure:
 
Target:  DB Name: O1RMTNYP
========
 
 
 
Oracle 10.2.0.3
Sun Sparc Solaris
 
Errors when trying to do blockrecover:
 
RMAN> BLOCKRECOVER CORRUPTION LIST;
 
RMAN-03002: failure of blockrecover command at 08/17/2009 08:10:24
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 37 found to restore
 
CAUSE
There is corruption in the backup of the datafile
 
RMAN> list backup of datafile 37;
 
List of Backup Sets
===================
 
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
47555 Incr 0 797.68M DISK 00:12:41 08-AUG-2009 17:19:53
BP Key: 47554 Status: AVAILABLE Compressed: YES Tag: TAG20090808T170710
Piece Name: /< backup directory >/O1RMTNYP_LEVEL_0_DB_BKUP_17:05:46-08-08-2009_1_47626.rman
List of Datafiles in backup set 47555
File LV Type Ckp SCN Ckp Time Name
---- -- ---- ---------- -------------------- ----
37 0 Incr 1386008078 08-AUG-2009 17:07:14 < directory >/O1RMTNYP/dbf02/rmtd18.dbf
 
 
 
SQL> select set_stamp from v$backup_piece where handle like '/< backup directory >/O1RMTNYP_LEVEL_0_DB_BKUP_17:05:46-08-08-2009_1_47626%';
 
SET_STAMP
----------
694372032
 
SQL> select * from v$backup_corruption where file# = 37 and set_stamp ='694372032';
 
RECID STAMP SET_STAMP SET_COUNT PIECE# FILE# BLOCK#
---------- ---------- ---------- ---------- ---------- ---------- ----------
BLOCKS CORRUPTION_CHANGE# MAR CORRUPTIO
---------- ------------------ --- ---------
164 694372793 694372032 47626 1 37 178472
90 0 YES ALL ZERO
 
165 694372793 694372032 47626 1 37 215037
5 0 YES ALL ZERO
 
SOLUTION
You cannot recover blocks 178472 and 215037 using that piece, as that backup set doesn't contain a good block image of the blocks you want to repair. You can repair blocks in file# 37 other than the above using the piece listed.
CORRUPTION LIST means all the blocks listed in v$database_block_corruption. If there are more than those two blocks, then you have to specify them individually in the BLOCKRECOVER command (enhanced in 11gR1 so that one can specify a range of blocks in the syntax; see the sketch below). If there are only those two blocks, then the user has to provide a backup (and catalog it in RMAN) that was not taken using 'set maxcorrupt'.
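For reference, once a backup that contains good images of those blocks is available and cataloged, specifying the blocks individually looks like this (a sketch):

RMAN> blockrecover datafile 37 block 178472, 215037;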
 

"RMAN-07517: The File Header Is Corrupted Restoring Files on Another Server

APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.4 to 12.2 BETA1 [Release 11.2 to 12.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.
 
 
SYMPTOMS
Using RMAN to backup/restore datafiles from one server to another (same OS, same endianness) fails with:
 
 
RMAN-07517: Reason: The file header is corrupted
 
 
 
CAUSE
The source production database has a 32KB tablespace and an initialisation parameter defined for db_32k_cache_size,
but the target pfile didn't have the parameter db_32k_cache_size defined.
 
SOLUTION
Set db_32k_cache_size in the pfile/spfile of the target.
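For example, on the target instance (the 64M size is only an illustration; size the cache appropriately for your system):

SQL> alter system set db_32k_cache_size = 64M scope=spfile;

When the target uses a plain pfile, add the line db_32k_cache_size=64M to it instead.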

Rman Backup Failed with RMAN-600 [6000]

APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Information in this document applies to any platform.
***Checked for relevance on 22-Feb-2013***
SYMPTOMS
The backup of the datafiles completes, but after it we see a failure showing:
 
 
RMAN-00600: internal error, arguments [string] [string] [string] [string] [string]
Cause: An internal error in recovery manager occurred.
Action: Contact Oracle Support Services.
 
 
 
DBGANY: 612 TEXTNOD = sys.dbms_backup_restore.setRmanStatusRowId(rsid=>0, rsts=>0);
DBGANY: 613 TEXTNOD = end;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
RMAN-00600: internal error, arguments [6000] [] [] [] []
 
After the failure, the next step in the backup script is not executed; this could be an autobackup or a backup of archivelogs.
 
CAUSE
Taking a closer look at the RMAN output obtained from the backup we see earlier errors:
 
channel dsk1: starting piece 2 at 15-MAY-11
RMAN-03009: failure of backup command on dsk2 channel at 05/15/2011 07:29:07
ORA-19504: failed to create file "/<DIR>/<DB_NAME>/<FILE_NAME>"
ORA-27040: file create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
channel dsk2 disabled, job failed on it will be run on another channel
RMAN-03009: failure of backup command on dsk3 channel at 05/15/2011 07:29:07
ORA-19504: failed to create file "/<DIR>/FILE_NAME"
ORA-27040: file create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
channel dsk3 disabled, job failed on it will be run on another channel
 
There could be other errors, but in this case the output directory for writing backup pieces did not exist on disk for at least one of the channels.
 
This causes the channel to be disabled and the job to be executed on another available channel.
 
Then, once the backup is complete, the channel fails to be auto-allocated to execute the next backup command.
 
The issue has been raised on unpublished Bug:
Bug 6314281: RMAN-600: [6000] [] [] [] [] DOING "BACKUP RECOVERY AREA"
 
Fixed->12.2
 
 
SOLUTION
 
Workaround :-
------------------
 
run {
set autolocate off;
<... rest of backup commands ...>
}
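A more concrete version of the workaround (a sketch; the channel and the format string are placeholders, and the output directories must of course exist):

run {
set autolocate off;
allocate channel d1 device type disk format '/<DIR>/<DB_NAME>/%U';
backup database plus archivelog;
}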
 
 
 
Check for the availability of a one-off patch using the link for Patch 6314281.

RMAN RESTORE fails with RMAN-00600 [8064]

APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 to 10.2.0.4 [Release 10.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
SYMPTOMS
When an RMAN backup is restored to a new database with the same datafile paths as the production server, the RESTORE fails with RMAN-00600 [8064]. The error stack looks like the following:
 
RMAN> restore database preview;
Starting restore at 03-JUN-09
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=415 devtype=DISK
 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03012: fatal error during compilation of command
RMAN-03028: fatal error code for command restore : 600
RMAN-00600: internal error, arguments [8064] [1] [<path>\<db_unique_name of restored db>\SYSTEM01.DBF]
[<path>\<db_unique_name of prod>\SYSTEM01.DBF] []
 
CHANGES
The path of datafiles in controlfile of production server is different from the path stored in recovery catalog:
 
SQL> select name from v$datafile ;
 
NAME
------------------------------------------------
<path>\<db_unique_name of prod>\SYSTEM01.DBF
<path>\<db_unique_name of prod>\UNDOTBS01.DBF
<path>\<db_unique_name of prod>\UNDOTBS02.DBF
 
.............
.............
 
RMAN Connected WITHOUT recovery catalog
=================================
 
RMAN> report schema;
Report of database schema
 
List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 2048 SYSTEM *** <path>\<db_unique_name of prod>\SYSTEM01.DBF
2 2048 UNDO *** <path>\<db_unique_name of prod>\UNDOTBS01.DBF
................
................
 
RMAN Connected WITH recovery catalog
=========================
 
RMAN> report schema;
Report of database schema
 
List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 2048 SYSTEM YES <path>\<db_unique_name of previous clone activity>\SYSTEM01.DBF
2 2048 UNDO YES  <path>\<db_unique_name of previous clone activity>\UNDOTBS01.DBF
 
................
................
 
RMAN> list incarnation;
 
List of Database Incarnations
DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1 1 <db_unique_name prod> <dbid> CURRENT 1 15/AUG/08
 
The view RC_DATAFILE will also show a different path than the controlfile (V$DATAFILE) of the production database.
 
CAUSE
While cloning the database, when an RMAN RESTORE is attempted to a new location while connected to the recovery catalog, the catalog is resynced with the new path as a result of the SWITCH DATAFILE ALL command.
 
If the DBID/DBNAME is not changed, a backup of the production database (original database) will still complete successfully even if the locations in the controlfile (V$DATAFILE) and the recovery catalog (RC_DATAFILE or REPORT SCHEMA) differ. This is because RMAN reads File# instead of NAME for backup activities.
 
A subsequent cloning attempt with RMAN connected to the recovery catalog on a new server, this time with the original path structures as present in the production database controlfile (V$DATAFILE), will fail with RMAN-00600 [8064], because RMAN will try to restore the files to the path stored in the recovery catalog (RC_DATAFILE) rather than the path in V$DATAFILE of the production database.
 
Cloning a production database should not be attempted with RMAN connected to the recovery catalog, particularly when the path/name of the datafiles is changed with the SET NEWNAME ... SWITCH DATAFILE ... commands. This resyncs the catalog with the new datafile path/name, which makes further restores difficult. This is also mentioned in the Oracle documentation:
 
 
Now, the problem here is that we need to update the recovery catalog with the correct file names as present in the controlfile of the production database to avoid any confusion.
 
SOLUTION
Option 1:
=======
 
UNREGISTER / REGISTER the database and catalog all backups :
 
RMAN> UNREGISTER DATABASE ;
 
RMAN> REGISTER DATABASE ;
 
RMAN> CATALOG START WITH <backup_path>\ ;
 
Option 2:
======
 
Change any property of the datafiles, so that RMAN resyncs the schema information into the recovery catalog. For example, resizing a datafile will resync the controlfile information into the recovery catalog:
 
SQL> ALTER DATABASE DATAFILE <datafile_number> RESIZE <old_size+1> ;
 
In the above SQL, the size of the datafile is increased by just 1 byte.
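 
A concrete illustration (the file number and sizes below are invented for the example): look up the current size in bytes, then resize one byte larger so the change is propagated to the catalog on the next resync:
 
SQL> SELECT file#, bytes FROM v$datafile WHERE file# = 4;
     -- suppose this returns 524288000 (500 MB)
SQL> ALTER DATABASE DATAFILE 4 RESIZE 524288001;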
 
Now, connect to RMAN with the recovery catalog and resync the catalog:
 
RMAN> CONNECT TARGET /
 
RMAN> CONNECT CATALOG <UN>/<PWD>@<TNS>
 
RMAN> RESYNC CATALOG ;
 
RMAN> REPORT SCHEMA ;  # It should now show the correct path
 
A further RESTORE should succeed after updating the recovery catalog information with the correct path/name of the datafiles.

flashback query failed with ORA-01555?


If you deleted or updated data by mistake in Oracle and try to recover it with a flashback query, the query may fail with an ORA-01555 error:

 

SQL> l
  1  declare
  2  cursor c is select * from testt2 as of scn 5385449;
  3  begin
  4  for i in c loop
  5  null;
  6  end loop;
  7* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number  with name "" too small
ORA-06512: at line 4

 

If you have no backup, then you cannot recover the data in this case.

 

We provide a better flashback service that can recover part of the deleted data. For example:

 

SQL> set serveroutput on;

SQL> exec better_flashback_table_save('TEST2','TESTT2',2843925,'MYTVSAVE3');
table TEST2.TESTT2 @ scn 2843925 find   5568 rows , copied to  TEST2.MYTVSAVE3
 
PL/SQL procedure successfully completed.
 
 
The service steps:
 
1. First, expand your undo_retention parameter:
 
alter system set undo_retention=86400;
 
Then expand all undo datafiles:
 
alter database datafile 'undofile' resize BIGGER_SIZE;
 
 
This first step prevents undo extents from being reused by the system.
 
2. We provide the procedure better_flashback_table_query to check how many rows can be recovered (see the illustrative call after this list).
 
3. We provide the procedure better_flashback_table_save, which recovers the data and stores it in a new table.
 
4. We can also take advantage of the PRM-DUL undelete function to help you recover data. Please check https://youtu.be/hIYutqNcVBI
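 
A hypothetical invocation of better_flashback_table_query, assuming its signature mirrors the better_flashback_table_save call shown earlier (owner, table name, SCN):
 
SQL> set serveroutput on;
SQL> exec better_flashback_table_query('TEST2','TESTT2',2843925);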
 
 
 
We provide the above as a service. Contact service@parnassusdata.com
 
 
 

devos ransomware/malware encrypted oracle database datafiles

$
0
0

An Oracle database was encrypted by the Devos ransomware.
 
Users can recover data from the encrypted datafiles using PRM-DUL. Reference video: https://youtu.be/jOT6k-KF8Hg

 

 

Steps to troubleshoot Oracle RAC clusterware GIPCHA connection failure

PURPOSE
How do you troubleshoot when a local process (non-bootstrap client) fails to CONNECT to a peer process using the GIPCHA communication protocol?
 
 
 
TROUBLESHOOTING STEPS
Let's assume that process P1 runs on node-1, process P2 runs on node-2, and P1 attempted to connect to P2 (assume the connection string is gipcha://node-2:foo) using the GIPCHA communication protocol at timestamp t1, and this CONNECT attempt did not succeed. Root-cause the CONNECT failure as follows.
 
1. Check whether the P2 process and the GIPCDaemon process on node-2 were alive at timestamp t1. If they were not alive, then P1 tried to CONNECT to a non-existent process, and that is why the CONNECTION request failed.
 
2. If the P2 process is alive but GIPCD-2 (the GIPCDaemon that runs on node-2) was not, then we need to investigate why the GIPCDaemon was not spawned by the AGENT on node-2.
 
3. If the GIPCDaemon was not running on node-1, then we need to investigate why it was not spawned by the AGENT on node-1.
 
4. If P1, P2, GIPCD-1 (the GIPCDaemon that runs on node-1) and GIPCD-2 (the GIPCDaemon that runs on node-2) are all alive, then try the steps below.
 
(a) Investigate whether the P2 process was listening at connection string "gipcha://node-2:foo" at timestamp t1. The log files of GIPCD-2 will help identify this.
 
When the P2 process creates the listen endpoint gipcha://node-2:foo, you will see log messages in GIPCD-2 that look something like the following.
 
Example log of GIPCDaemon:
 
2016-05-13 03:15:22.887 :GIPCDCLT:889784064:  gipcdClientThread: req from local client of type gipcdmsgtypeCreateName, endp 0000000000000105
2016-05-13 03:15:22.887 :GIPCDCLT:889784064:  gipcdClientCreateName: Received type(gipcdmsgtypeCreateName), endp(0000000000000105), len(1008), buf(0x7fcd2c0ea1b8):[hostname(node-2), portstr: (foo), haname(d470-975b-4d7b-5095), retStatus(gipcretSuccess)]
2016-05-13 03:15:22.887 :GIPCDCLT:889784064:  gipcdInitPortEntry: port foo entry initialized. port memid : 0000050800000000
2016-05-13 03:15:22.887 :GIPCDCLT:889784064:  gipcdAddPortEntry: added port foo entry to shared memory. port mid: 0000050800000000client memid 000003c800000000, client 5983 incarnation 2
 
 
If node-2 deletes/closes the same listen endpoint, then you will see the log messages below.
 
 
Example log of GIPCDaemon:
 
2016-05-13 03:16:42.962 :GIPCDCLT:889784064:  gipcdClientDeleteName: Received type(gipcdmsgtypeDeleteName), endp(0000000000000105), len(1008), buf(0x7fcd2c0fb728):[hostname(node-2), portstr: (foo), haname(d470-975b-4d7b-5095), retStatus(gipcretSuccess)]
2016-05-13 03:16:42.962 :GIPCDCLT:889784064:  gipcdClientDeleteName: Name deleted(foo)
2016-05-13 03:16:42.962 :GIPCDCLT:889784064:  gipcdDelPortEntry: port foo entry deleted from shared memory. port memid: 0000050800000000
 
In the GIPCDaemon logs, if you see only LISTEN endpoint creation messages but no deletion/close messages before timestamp t1, then the P2 process was successfully listening at timestamp t1.
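 
As a quick way to scan for these entries, something like the following can be used; the log path is an assumption (pre-12.2 layout), so adjust it for your Grid Infrastructure home and release:
 
# grep -E 'gipcdAddPortEntry|gipcdDelPortEntry' <GRID_HOME>/log/node-2/gipcd/gipcd.log | grep 'port foo'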
 
(b) If the P2 process was successfully listening at the LISTEN endpoint, then we need to investigate whether GIPCD-1 successfully resolved the P2 connection string.
 
When the P1 process attempts to CONNECT to the P2 process, it needs to resolve the P2 connection string (gipcha://node-2:foo), and for that it sends a LOOKUP request to GIPCD-1.
 
Example log GIPCDaemon:
 
2016-05-13 03:15:52.204 :GIPCDCLT:889784064:  gipcdClientThread: req from local client of type gipcdmsgtypeLookupName, endp 0000000000000105
2016-05-13 03:15:52.204 :GIPCDCLT:889784064:  gipcdClientLookupName: Received type(gipcdmsgtypeLookupName), endp(0000000000000105), len(1008), buf(0x7fcd2c0e9898):[hostname(node2), portstr: (foo), haname(), retStatus(gipcretSuccess)]
 
If GIPCD-1 fails to resolve it locally, then it forwards the lookup request to GIPCD-2, and to forward it GIPCD-1 needs to CONNECT to GIPCD-2. If GIPCD-1 is already connected to GIPCD-2, it sends the LOOKUP request immediately; if not, it first connects to GIPCD-2 and then sends the lookup request, as below.
 
2016-05-13 03:15:52.204 :GIPCDCLT:889784064:  gipcdEnqueueMsgForNode: Enqueuing for NodeThread (gipcdReqTypeLookupName)
2016-05-13 03:15:52.204 :GIPCDNDE:887682816:  gipcdProcessClientRequest: Dequeued req for host (node-2), type(gipcdReqTypeLookupName), id (0000000000000105, 0000000000000000), cookie 0x7fcd2c2c5738
2016-05-13 03:15:52.204 :GIPCDNDE:887682816:  gipcdSendReq: recvd msg clnt header: (req: 0x7fcd2c0ea240 [hostname(node-2), id (0000000000000105, 0000000000000000), len(392), req cookie(00007fcd2c2c5738), type(gipcdReqTypeLookupName)])
 
2016-05-13 03:15:52.206 :GIPCDNDE:887682816:  gipcdEnqueueSendReq: Enqueuing the msg in the pending send request for host node-2, msg 0x7fcd2c0ea240
2016-05-13 03:15:52.206 :GIPCDNDE:887682816:  gipcdSendReq: Enqueued the request and waiting for connection to complete with host node-2
 
>>> Once GIPCD-1 successfully connects to GIPCD-2, it forwards the LOOKUP request to GIPCD-2. If GIPCD-1 fails to CONNECT to GIPCD-2, then please ask the GIPC team to investigate it.
 
2016-05-13 03:15:52.219 :GIPCDNDE:887682816:  gipcdNodeThread: Connection established with hostname node-2, endp: 0000000000001b3a
2016-05-13 03:15:52.220 :GIPCDNDE:887682816:  gipcdNodeSendReq: Sending using id 0000000000001b3a, (nodehdr: req:0x7fcd200b43d0  [version(4107), len(3087073280), type((uknown)), req cookie(0000000000000002)] flags 335544320 clnthdr: req: 0x7fcd2c0ea240 [hostname(node-2), id (0000000000000105, 0000000000000000), len(392), req cookie(00007fcd200b4148), type(gipcdReqTypeLookupName)])
 
>>> After the LOOKUP request is sent to GIPCD-2, GIPCD-2 normally comes back with a LOOKUP ACK request as below. If the LOOKUP ACK is not received, then please ask the GIPC team to investigate it.
 
2016-05-13 03:15:52.222 :GIPCDNDE:887682816:  gipcdNodeThread: Msg received from endp 0000000000001b3a, req:0x7fcd1c09f438  [version(185597952), len(600), type(gipcdReqTypeLookupNameAck), req cookie(0000000000000002)] flags 20
 
>>> If P2 is successfully listening at the LISTEN endpoint but GIPCD-2 still failed to resolve the connection string, then please ask the GIPC team to investigate it.
 
(c) If the P1 process successfully resolved the P2 connection string, then you will see a message in the P1 process log that looks as below.
 
2016-05-13 03:22:13.893 :GIPCHDEM:3670411008: gipchaDaemonCreateResolveResponse: creating resolveResponse for host: node-2, port:foo, haname:475b-13b9-0f5b-aa6e, ret:0
 
>>> If the P1 process successfully resolved the connection string, then "ret" should be ZERO.
 
(d) If the P1 process successfully resolved the P2 connection string but P1 still failed to CONNECT to P2, then we need to investigate whether P1 successfully fetched the P2 interfaces. Find the UDP interfaces of P1 and the UDP interfaces of P2. When a process creates UDP interfaces for GIPCHA communication, you will see a log message that looks as below.
 
2016-05-13 03:22:13.828 :GIPCHTHR:3671987968: gipchaWorkerCreateInterface: created local interface for node 'node-1', haName 'CSS_aime1adc00rqh', inf 'udp://10.232.130.185:54140' inf 0x7fcea02cffd0
 
   
If the P1 process fetches the interfaces of the P2 process, then you will see log messages that look as below.
 
2016-05-13 03:22:13.897 :GIPCHTHR:3671987968: gipchaWorkerCreateInterface: created remote interface for node 'node-2', haName '475b-13b9-0f5b-aa6e', inf 'udp://10.232.130.185:64368' inf 0x7fcea02bc540
2016-05-13 03:22:13.897 :GIPCHGEN:3671987968: gipchaWorkerAttachInterface: Interface attached inf 0x7fcea02bc540 { host 'node-2', haName '475b-13b9-0f5b-aa6e', local 0x7fcea02cffd0, ip '10.232.130.185:64368', subnet '10.232.128.0', mask '255.255.248.0', mac '', ifname '', numRef 0, numFail 0, idxBoot 0, flags 0x6 }
 
(e) If the P1 process successfully resolved the P2 connection string and successfully fetched the correct interface of the P2 process, but the connection still failed, then investigate whether the P1 process SENT the CONNECT request, and similarly check whether the P2 process ACCEPTED the CONNECT request from P1.
 
If the P1 process successfully sent the CONNECT request, then you will see a log message that looks as below.
 
2016-05-13 03:29:08.591 :GIPCHAUP:1253738240: gipchaUpperConnect: initiated connect for umsg 0x7f3f2008cca0 { msg 0x7f3f2005a260, ret gipcretRequestPending (15), flags 0x6 }, msg 0x7f3f2005a260 { type gipchaMsgTypeConnect (3), srcPort '43a4-4456-3ec9-7c86', dstPort 'foo', srcCid 00000000-0000047e, cookie 00007f3f-2008cca0 } dataLen 0, endp 0x1b51380 [000000000000047e] { gipchaEndpoint : port '43a4-4456-3ec9-7c86', peer ':', srcCid 00000000-0000047e,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x17ea3d0, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 } node 0x7f3f143029e0 { host 'aime1-adc00rqh-0', haName '64fb-4631-033f-fb7a', srcLuid dc93ae6c-9bd26476, dstLuid 00000000-00000000 numInf 0, sentRegister 1, localMonitor 0, baseStream 0x7f3f14302e20 type gipchaNodeType12001 (20), nodeIncarnation 395ca385-154924a4, incarnation 2, cssIncarnation 0, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x80000}
 
>> Please check the status of the CONNECT umsg "0x7f3f2008cca0" in the P1 process log.
 
 
In the P2 process logs, check whether P2 accepted the P1 connect. If the P2 process accepted the CONNECT request, then you will see a log message that looks as below.
 
 
2016-05-13 03:38:45.986 :GIPCHAUP:2620307200: gipchaUpperProcessAccept: completed new hastream 0x7f7c44062a80 { host 'node-1', haName '6304-ac42-e1c3-579e' srcStreamId 00000000-00008526 dstStreamId 00000000-000006e2 , hendp (nil) haNode 0x7f7c440642c0 numInf 1, contigSeq 1, lastAck 0, lastValidAck 0, sendSeq [1 : 1], priority 0,  duplicate recv 0, completed recv 0, completed send 0, total send 0, total recv 1, flags 0x1}  for hendp 0x7f7c44065070 [0000000000008526] { gipchaEndpoint : port 'foo/1910-9a85-f387-c449', peer ':', srcCid 00000000-00008526,  dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x20731e0, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 }
 
 
(f) If, even after trying all the above steps, it is still not clear why the CONNECT failed, then please ask the GIPC team to investigate it.

How to Recover Oracle Database from Loss Of Online Redo Log And ORA-312 And ORA-313

PURPOSE
This article aims at walking you through some of the common recovery scenarios after the loss of an online redo log.
 
 
SCOPE
All Oracle support Analysts, DBAs and Consultants who have a role to play in recovering an Oracle database
 
 
DETAILS
Recovering After the Loss of Online Redo Log Files: Scenarios
 
If a media failure has affected the online redo logs of a database, then the
appropriate recovery procedure depends on the following:
 
- The configuration of the online redo log: mirrored or non-mirrored
- The type of media failure: temporary or permanent
- The types of online redo log files affected by the media failure: CURRENT, ACTIVE, UNARCHIVED, or INACTIVE
- Whether the database was shut down normally before the loss of the redo log file
 
 
 
1) Recovering After Losing a Member of a Multiplexed Online Redo Log Group
 
 
If the online redo log of a database is multiplexed, and if at least one member of each online redo log group is not affected by the media failure, then the database continues functioning as normal, but error messages are written to the log writer trace file and the alert_SID.log of the database.
 
ACTION PLAN
 
If the hardware problem is temporary, then correct it. The log writer process accesses the previously unavailable online redo log files as if the problem never existed.
 
If the hardware problem is permanent, then drop the damaged member and add a new member by using the following procedure.
 
To replace a damaged member of a redo log group:
 
Locate the filename of the damaged member in V$LOGFILE. The status is INVALID if the file is inaccessible:
 
 
SQL> SELECT GROUP#, STATUS, MEMBER FROM V$LOGFILE WHERE STATUS='INVALID';
 
GROUP#    STATUS       MEMBER
-------   -----------  ---------------------
0002      INVALID      /<redo log path>/<redo log name>
 
+ Drop the damaged member.
  For example, to drop a redo log member from group 2, issue:
 
 
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/<redo log path>/<redo log name>';
 
+ Add a new member to the group.
  For example, to add a new redo log member to group 2, issue:
 
 
SQL> ALTER DATABASE ADD LOGFILE MEMBER '/<redo log path>/<redo log name>' TO GROUP 2;
 + If the file you want to add already exists, then it must be the same size as the other group members, and you must specify REUSE. 
 
  For example:
 
SQL> ALTER DATABASE ADD LOGFILE MEMBER '/<redo log path>/<redo log name>' REUSE TO GROUP 2;
2) Losing an Inactive Online Redo Log Group
 
 
If all members of an online redo log group with INACTIVE status are damaged, then the procedure depends on whether you can fix the media problem that damaged the inactive redo log group.
 
If the failure is temporary: fix the problem. LGWR can reuse the redo log group when required.
If the failure is permanent: the damaged inactive online redo log group eventually halts normal database operation.
 
ACTION PLAN
 
Reinitialize the damaged group manually by issuing the ALTER DATABASE CLEAR LOGFILE statement.
You can clear an inactive redo log group when the database is open or closed.
The procedure depends on whether the damaged group has been archived.
 
To clear an inactive, online redo log group that has been archived:
 
If the database is shut down, then start a new instance and mount the database:
STARTUP MOUNT
 
Reinitialize the damaged log group.
For example, to clear redo log group 2, issue the following statement:
 
ALTER DATABASE CLEAR LOGFILE GROUP 2;
 
Clearing Inactive, Not-Yet-Archived Redo
 
Clearing a not-yet-archived redo log allows it to be reused without archiving it. This action makes backups unusable if they were started before the last change in the log, unless the file was taken
offline prior to the first change in the log. Hence, if you need the cleared log file for recovery of a backup, then you cannot recover that backup. It also prevents complete recovery from backups because of the missing log.
 
To clear an inactive, online redo log group that has not been archived:
 
If the database is shut down, then start a new instance and mount the database:
 
STARTUP MOUNT
 
Clear the log using the UNARCHIVED keyword. For example, to clear log group 2,
issue:
 
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 2;
 
If there is an offline datafile that requires the cleared log to bring it online, then the keywords UNRECOVERABLE DATAFILE are required.   The datafile and its entire tablespace have to be dropped because the redo necessary to bring it online is being cleared, and there is no copy of it.
For example enter:
 
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 2 UNRECOVERABLE DATAFILE;
Note: If this is performed on an ACTIVE (current) logfile, an error will occur.
 
Immediately back up the whole database, including the controlfile, so that you have a backup you can use for complete recovery without relying on the cleared log group.
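 
For example, a minimal RMAN backup that satisfies this (one illustrative option, not the only valid approach):
 
RMAN> BACKUP DATABASE INCLUDE CURRENT CONTROLFILE PLUS ARCHIVELOG;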
 
 
 
Failure of CLEAR LOGFILE Operation
 
The ALTER DATABASE CLEAR LOGFILE statement can fail with an I/O error due to media failure when it is not possible to:
 
* Relocate the redo log file onto alternative media by re-creating it under the currently configured redo log filename
* Reuse the currently configured log filename to re-create the redo log file because the name itself is invalid or unusable (for example, due to media failure)
 
In these cases, the ALTER DATABASE CLEAR LOGFILE statement (before receiving the I/O error) would have successfully informed the control file that the log was being cleared and did not require archiving.
 
The I/O error occurred at the step where the CLEAR LOGFILE statement attempts to create the new redo log file and write zeros to it. This fact is reflected in V$LOG (STATUS = CLEARING_CURRENT).
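 
To see which group, if any, is in this state, V$LOG can be queried directly (a simple illustrative check):
 
SQL> SELECT GROUP#, THREAD#, STATUS FROM V$LOG WHERE STATUS LIKE 'CLEARING%';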
 
3) Loss of online logs after normal shutdown 
 
You have a database in archivelog mode; the database was shut down with SHUTDOWN IMMEDIATE and one of the online redo logs was deleted. In this case there are only 2 groups with 1 log member in each. When you try to open the database you receive the following errors:
 
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '<filename>'
It is not possible to recover the missing log, so the following needs to be performed:
 
Mount the database and check v$log to see if the deleted log is current.
 
- If the missing log is not current, simply drop the log group (alter database drop logfile group N).
If there are only 2 log groups, then it will be necessary to add another group before dropping this one (see the illustrative sketch after the note below).
 
- If the missing log is current, simply perform a fake recovery and then open with RESETLOGS:
 
sql> connect <username>/<password> as sysdba
sql> startup mount
sql> recover database until cancel;
(cancel immediately)
sql> alter database open resetlogs;
 
Be sure the location (directory) for the online log files exists before trying to open the database. If it is not available, create it and rerun the RESETLOGS; otherwise this will give an error.
 
NOTE: If the current online log, needed for instance recovery, is lost, the database must be restored and recovered through the last available archivelog file (PITR).
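 
For the not-current case above, an illustrative sketch of adding a third group before dropping the damaged one (the group numbers, path, and size are placeholders):
 
sql> alter database add logfile group 3 ('/<redo log path>/<redo log name>') size <size>M;
sql> alter database drop logfile group 2;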

How Recover an Oracle Database Backup in Windows When Everything is Lost

GOAL
Which steps must be followed in order to recover a backup of a database on a Windows platform when everything is lost?
 
SOLUTION
NOTE: In the images and/or the document content below, the user information and environment data used represents fictitious data from the
Oracle sample schema(s), Public Documentation delivered with an Oracle database product or other training material. Any similarity to actual
environments, actual persons, living or dead, is purely coincidental and not intended in any manner.
For the purposes of this document, the following fictitious environment is used as an example to describe the procedure:
 
Database and SID Name:  YOURDB
 
************
 
First of all we need to install the Oracle Database software; it must be the same database release and the same patch set level. But we must not create any database: in the installer window you
must select the option to install software only.
 
Once the Oracle Database software is installed, to recover from an OS backup of your database on a Windows platform, it is necessary to:
 
 
    1: Create an Oracle Password File
    -------------------------------------------------------------
    For full details on how to create a password file please refer to Oracle9i Database
    Administrator's Guide.
   
      For example:  orapwd file=oraYOURDB.pwd password=<password> entries=10
   
   
   
    2: Create an Initialization Parameter File
    ----------------------------------------------------------------------
    Restore the init.ora file from the backup. If you don't have the init.ora,
    you can use an init.ora from another database and make the necessary changes.
    You need to set up the required parameters, e.g. DB_NAME, CONTROL_FILES, and
    the directories for bdump, udump, cdump, etc.
   
      Parameter file '<ORACLE_HOME>\DATABASE\initYOURDB.ORA'
   
    3: Restore all the database files
    ----------------------------------------------------------------------
    Restore all the database files to the same location that they were in production database
   
    You must restore:
    -> controlfiles     <to  the location indicated in control_files parameter  in the init...ora>
    -> database files  
    -> Archivelog files <to the log_archive_dest directory in the init...ora>
   
    Make sure that you have the necessary backups of database and archived redo logs
   
   
   
    4: Create the Oracle services
    --------------------------------
    Create a new NT service for the duplicate database YOURDB using oradim.
   
     C:\>oradim -new -sid YOURDB -intpwd <password> -maxusers 10 -startmode auto -pfile  '<your pfile location>'
   
    If you don't have at least one control file, you will need to recreate the control file.
    But be careful: you need to be sure that all the data files are included in
    the CREATE CONTROLFILE command and that they are all in the right location. Also make
    sure that the redo log files can be created in the indicated location.
       
    To recreate the control file 
   
      C:\> set ORACLE_SID=YOURDB
      C:\> sqlplus "sys/<password> as sysdba"
      SQL> startup nomount
      SQL> CREATE CONTROLFILE REUSE DATABASE YOURDB RESETLOGS  ARCHIVELOG
           MAXLOGFILES 16
           MAXLOGMEMBERS 3
           MAXDATAFILES 100
           MAXINSTANCES 8
           MAXLOGHISTORY 454
       LOGFILE
         GROUP 1 '<log_file_name_and_location>'  SIZE <size>M,
         GROUP 2 '<log_file_name_and_location>'  SIZE <size>M,
         GROUP 3 '<log_file_name_and_location>'  SIZE <size>M
      DATAFILE
      '<datafile_1_name_and_location>',
        .....
        '<datafile_1_name_and_location>'
      CHARACTER SET <your_db_charset>;
  
     You can change the CREATE control file options if you want:
    
     * CREATE CONTROLFILE SYNTAX:           
     This  information is fully documented in the Oracle SQL Reference Manual. 
     
                                               
       CREATE CONTROLFILE [REUSE]              
          DATABASE name                        
          [LOGFILE filespec [, filespec] ...]  
           RESETLOGS | NORESETLOGS             
          [MAXLOGFILES integer]                
          [DATAFILE filespec [, filespec] ...] 
          [MAXDATAFILES integer]               
          [MAXINSTANCES integer]               
          [ARCHIVELOG | NOARCHIVELOG]          
          [SHARED | EXCLUSIVE]                 
                                               
     
   
    5: Recover and Open database
    -------------------------------------
      C:\> set ORACLE_SID=YOURDB
      C:\> sqlplus "/ as sysdba"
      SQL> startup mount
      SQL> recover database until cancel using backup control file;
                    ===> apply all the available archivelogs; when there are no
                         more available, type CANCEL
   
      SQL> alter database open resetlogs;
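 
Optionally, once the database is open, a quick illustrative sanity check confirms the open mode and the new incarnation:
 
      SQL> SELECT name, open_mode, resetlogs_time FROM v$database;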