Boot disk failure (SVM)

BACKGROUND:
The system runs Solaris 10 with SVM, using RAID-1 with two submirrors on c1t0d0 and c1t1d0. It could not restart because the submirror on c1t1d0 was in an undefined state.
OPERATION:
1. Boot from CD-ROM and mount the slice c1t0d0s0.
ok boot cdrom -s
# mount /dev/dsk/c1t0d0s0 /tmp
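2. Back up and then modify /etc/vfstab and /etc/system on c1t0d0s0 (now mounted under /tmp).
/etc/vfstab: Change every metadevice (*md*) entry to the corresponding /dev/dsk/c1t0d0s* slice, and to /dev/rdsk/c1t0d0s* in the raw device column.
/etc/system: Delete all the records related to SVM. The changes are as follows: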
set md:mirrored_root_flag=1
* rootdev:/pseudo/md@0:0,0,blk
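The vfstab edit in step 2 can be scripted. The following is only a minimal sketch: the function name and the metadevice-to-slice mapping (d0 to c1t0d0s0, d1 to c1t0d0s1) are assumptions for illustration, so check the mapping against your own layout before using anything like it.

```python
import re

# Hypothetical mapping from SVM metadevice to the slice on the surviving
# disk. The real mapping depends on how the mirrors were originally built.
MD_TO_SLICE = {"d0": "c1t0d0s0", "d1": "c1t0d0s1"}

# Matches both the block (/dev/md/dsk/dN) and raw (/dev/md/rdsk/dN) paths.
_MD_PATH = re.compile(r"/dev/md/(r?dsk)/(d\d+)\b")


def demirror_vfstab(text, md_to_slice):
    """Return vfstab text with known metadevice paths replaced by slices."""
    def swap(match):
        kind, md = match.group(1), match.group(2)
        if md not in md_to_slice:
            return match.group(0)  # leave unknown metadevices untouched
        return "/dev/%s/%s" % (kind, md_to_slice[md])
    return _MD_PATH.sub(swap, text)


if __name__ == "__main__":
    sample = "/dev/md/dsk/d0\t/dev/md/rdsk/d0\t/\tufs\t1\tno\t-\n"
    print(demirror_vfstab(sample, MD_TO_SLICE), end="")
```

Run it against a copy of the file first and compare the result with the original before replacing /tmp/etc/vfstab.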
3. Restart the system from the submirror c1t0d0.
# halt
ok boot
4. The system can only come up in single-user mode (run level S), and the / file system may be read-only.
# metadb -i
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c1t0d0s7
      M     p           unknown         unknown         /dev/dsk/c1t1d0s7
      M     p           unknown         unknown         /dev/dsk/c1t1d0s7
# metadb -d /dev/dsk/c1t1d0s7
# metadb -i
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c1t0d0s7
But both the “metadb -f -a” and “metareplace” commands failed with the following error:
Assertion failed: nsm == mm->un_nsm, file ../common/meta_mirror.c, line 138
On investigation, we found that the service svc:/system/mdmonitor:default was in the maintenance state. Its earlier log messages contained this record:
[ Mar 4 13:15:15 Executing start method ("/lib/svc/method/svc-mdmonitor") ]
No 'mddb_bootlist' entry in /kernel/drv/md.conf.
Delete the remaining state database replicas and prepare to recreate the mirror.
# metadb -d /dev/dsk/c1t0d0s7
# halt
ok boot
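The replica check in step 4 can also be done by a small script. In the metadb flag columns, lowercase letters mean an OK status and uppercase letters (such as M) mean a problem. The sketch below relies only on that rule; the function name and the idea of feeding it captured "metadb -i" text are my own assumptions.

```python
def failed_replicas(metadb_output):
    """Return device paths whose replica flags contain an uppercase letter.

    Expects the text printed by "metadb -i". Lines that do not end in a
    /dev/ path (the header and the flag legend) are ignored.
    """
    failed = []
    for line in metadb_output.splitlines():
        fields = line.split()
        if not fields or not fields[-1].startswith("/dev/"):
            continue
        device = fields[-1]
        # Everything before the device path: flags, first blk, block count.
        # Numbers and the word "unknown" contain no uppercase letters.
        has_error = any(ch.isupper() for tok in fields[:-1] for ch in tok)
        if has_error and device not in failed:
            failed.append(device)
    return failed


if __name__ == "__main__":
    sample = """\
flags first blk block count
a m p luo 16 8192 /dev/dsk/c1t0d0s7
a p luo 8208 8192 /dev/dsk/c1t0d0s7
M p unknown unknown /dev/dsk/c1t1d0s7
M p unknown unknown /dev/dsk/c1t1d0s7
"""
    print(failed_replicas(sample))
```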
5. Recreate the mirror.
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
# metadb -afc 2 c1t0d0s7 c1t1d0s7
# metainit -f d10 1 1 c1t0d0s0
# metainit d20 1 1 c1t1d0s0
# metainit d0 -m d10
# metaroot d0
# init 6
# metattach d0 d20
The resync process then starts. Once it completes, the operating system's redundancy is restored.
REFERENCE:
Solstice DiskSuite[TM] 4.x & Solaris[TM] Volume Manager 1.0: Command line procedures [ID 1011732.1]

Posted in Solaris

Updating the Drive Firmware

BACKGROUND:
There are many warning messages in the media servers' /var/adm/messages, such as the following:
Dec 28 16:43:06 mes12pdb1 bptm[19273]: [ID 266331 daemon.warning] TapeAlert Code: 0x27, Type: Warning, Flag: DIAGNOSTICS REQ., from drive IBM.ULTRIUM-TD3.002 (index 2), Media Id 000078
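To see which drives and tapes are affected, the TapeAlert lines can be pulled out of the log and counted per drive. This is a rough sketch that assumes every warning has the same layout as the sample line above; the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern modelled on the single sample line above; other bptm message
# layouts are not covered.
TAPEALERT = re.compile(
    r"TapeAlert Code: (?P<code>0x[0-9A-Fa-f]+), Type: (?P<type>\w+), "
    r"Flag: (?P<flag>.+?), from drive (?P<drive>\S+) "
    r"\(index (?P<index>\d+)\), Media Id (?P<media>\S+)"
)


def parse_tapealerts(log_text):
    """Return one dict per TapeAlert message found in the log text."""
    return [m.groupdict() for m in TAPEALERT.finditer(log_text)]


def alerts_per_drive(log_text):
    """Count TapeAlert messages per drive name."""
    return Counter(alert["drive"] for alert in parse_tapealerts(log_text))


if __name__ == "__main__":
    sample = (
        "Dec 28 16:43:06 mes12pdb1 bptm[19273]: [ID 266331 daemon.warning] "
        "TapeAlert Code: 0x27, Type: Warning, Flag: DIAGNOSTICS REQ., "
        "from drive IBM.ULTRIUM-TD3.002 (index 2), Media Id 000078\n"
    )
    for drive, count in alerts_per_drive(sample).items():
        print(drive, count)
```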
ROOT CAUSE:
The drives have different firmware versions: one is at “93G0” and the others are at “73P5”. When the drive with the higher firmware version writes data to a tape that has been used by the other drives, the warning appears or, in the worse case, the tape is FROZEN.
OPERATION:
Downgrade the drive's firmware version from “93G0” to “73P5”.
1. Create an FMR tape using a tape drive with firmware version 73P5.
Attention: Function Code 3 copies the drive's field microcode replacement (FMR) data to a blank data cartridge. For this function, insert only a blank data cartridge or a cartridge that may be overwritten.
(1) Place the drive in maintenance mode.
The drive must be in maintenance mode to run drive diagnostics or maintenance functions. To place the unit in maintenance mode:
a. Make sure that no cartridge is in the drive.
b. Press the Unload Button three times within two seconds. 0 appears in the Single-character Display (SCD), and the Status Light turns amber.
Note: If a cartridge is in the tape drive, it will eject the first time that you press the Unload Button and the drive will not be placed in maintenance mode. To continue placing the drive in maintenance mode, perform the preceding step again. Maintenance functions cannot be performed concurrently with read or write operations. While in maintenance mode, the drive does not receive SCSI commands from the server.
(2) Press the Unload Button once per second until 3 appears in the SCD.
(3) Press and hold the Unload Button for three or more seconds, then release it to select the function. The SCD changes to a flashing C.
(4) Insert a blank data cartridge that is not write protected (or the tape drive exits maintenance mode). The SCD changes to a flashing 3. The tape drive copies the FMR data to the blank data cartridge.
Note: If you inserted an invalid or write-protected tape cartridge, error code 7 appears in the SCD. The tape drive unloads the cartridge and exits maintenance mode. If the tape drive creates the FMR tape successfully, it rewinds and unloads the new tape, exits maintenance mode, and the tape is ready to use. If the tape drive fails to create the FMR tape, it displays an error code.
Tip: The firmware of multiple drives can be updated with the same FMR tape.
2. Update the drive's firmware from an FMR tape cartridge.
(1) Ensure that a cartridge is not loaded in the drive.
(2) Place the drive in maintenance mode by pressing the Unload Button three times within two seconds. The Status Light becomes solid amber, which means that the drive is in maintenance mode.
(3) Press the Unload Button once per second until 2 displays, then press and hold the button for three seconds. When C flashes, the drive is waiting for a cartridge.
(4) Insert the FMR tape cartridge. 2 flashes, the drive loads the updated firmware from the cartridge, and the Status Light flashes amber. When the update completes successfully, 0 displays and the cartridge automatically ejects. The drive resets itself and automatically activates the new firmware. If the update fails, an error code displays.
3. Unmake the FMR tape.
Attention: Function Code 8 erases the field microcode replacement (FMR) data and rewrites the cartridge memory on the tape. This converts the cartridge into a valid blank data cartridge.
(1) Place the drive in maintenance mode.
(2) Press the Unload Button once per second until 8 appears in the SCD.
(3) Press and hold the Unload Button for three or more seconds, then release it to select function 8. The SCD changes to a flashing C.
(4) Insert the FMR data cartridge (or the tape drive exits maintenance mode). The SCD changes to a flashing 8. The tape drive erases the firmware on the tape and rewrites the header in the cartridge memory to change the cartridge to a valid blank data cartridge. If the operation is successful, the tape drive displays function code 0, rewinds and unloads the newly converted scratch data cartridge, and exits maintenance mode. If the operation is not successful, an error code displays.

Posted in Daily work

Recovering a failed VxVM disk

BACKGROUND: VCS 5.0 MP3
A standby DB crashed because of a failed VxVM disk.
# df -h
df: cannot statvfs /ddms_db: I/O error
OPERATION:
1. Use the vxdisk list command to see which disks have failed.
# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c4t21d0s2    auto:cdsdisk    -            -            online
c4t22d0s2    auto:cdsdisk    iscsiodydg02 iscsiodydg   online
c4t23d0s2    auto:cdsdisk    iscsiodydg03 iscsiodydg   online
-            -               iscsiodydg01 iscsiodydg   failed was:c4t21d0s2
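Spotting the failed entry by eye is easy with three disks, but with many disks a small filter helps. The sketch below assumes the five-column layout shown above, with the status "failed" followed by a "was:" device; the function name is hypothetical.

```python
def failed_disks(vxdisk_output):
    """Return a list of dicts describing disks reported as failed.

    Expects text in the column layout of "vxdisk list":
    DEVICE TYPE DISK GROUP STATUS
    """
    failed = []
    for line in vxdisk_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[4] == "failed":
            was = ""
            for token in fields[5:]:
                if token.startswith("was:"):
                    was = token[len("was:"):]
            failed.append({"disk": fields[2], "group": fields[3], "was": was})
    return failed


if __name__ == "__main__":
    sample = """\
DEVICE TYPE DISK GROUP STATUS
c4t21d0s2 auto:cdsdisk - - online
c4t22d0s2 auto:cdsdisk iscsiodydg02 iscsiodydg online
c4t23d0s2 auto:cdsdisk iscsiodydg03 iscsiodydg online
- - iscsiodydg01 iscsiodydg failed was:c4t21d0s2
"""
    for entry in failed_disks(sample):
        print(entry["disk"], entry["group"], entry["was"])
```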
2. Once the fault has been corrected, rescan the device list with the following command so that the disks can be reattached:
# /usr/sbin/vxdctl enable
3. Use the vxreattach command to reattach the disks:
# /etc/vx/bin/vxreattach
# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c4t21d0s2    auto:cdsdisk    iscsiodydg01 iscsiodydg   online
c4t22d0s2    auto:cdsdisk    iscsiodydg02 iscsiodydg   online
c4t23d0s2    auto:cdsdisk    iscsiodydg03 iscsiodydg   RECOVER
4. Remount the failed file system. The mount failed again:
# umount -f /ddms_db
# mkdir /ddms_db
# mount -F vxfs /dev/vx/dsk/iscsiodydg/iscsi_ddms /ddms_db
UX:vxfs mount: ERROR: V-3-20003: Cannot open /dev/vx/dsk/iscsiodydg/iscsi_ddms: No such device or address
UX:vxfs mount: ERROR: V-3-24996: Unable to get disk layout version
5. Display the volume and plex states.
# vxinfo -g iscsiodydg
iscsi_ddms fsgen Unstartable
Use the following form of the vxprint command to display detailed information about the configuration of the volume including its state and the states of its plexes.
# vxprint -g iscsiodydg -hvt iscsi_ddms
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
SC NAME PLEX CACHE DISKOFFS LENGTH [COL/]OFF DEVICE MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO
EX NAME ASSOC VC PERMS MODE STATE
v  iscsi_ddms       -             DISABLED     ACTIVE    901775360 SELECT    -        fsgen
pl iscsi_ddms-01    iscsi_ddms    DISABLED     RECOVER   901775360 CONCAT    -        RW
sd iscsiodydg01-01  iscsi_ddms-01 iscsiodydg01 0         629047040 0         c4t21d0  ENA
sd iscsiodydg02-01  iscsi_ddms-01 iscsiodydg02 0         209813760 629047040 c4t22d0  ENA
sd iscsiodydg02-03  iscsi_ddms-01 iscsiodydg02 419627520 62914560  838860800 c4t22d0  ENA
6. Recover an unstartable volume with a DISABLED plex in the RECOVER state.
(1) Force the plex into the OFFLINE state.
# vxmend -g iscsiodydg -o force off iscsi_ddms-01
# vxprint -g iscsiodydg -hvt iscsi_ddms
……
v  iscsi_ddms       -             DISABLED     ACTIVE    901775360 SELECT    -        fsgen
pl iscsi_ddms-01    iscsi_ddms    DISABLED     OFFLINE   901775360 CONCAT    -        RW
……
(2) Place the plex into the STALE state.
# vxmend -g iscsiodydg on iscsi_ddms-01
# vxprint -g iscsiodydg -hvt iscsi_ddms
……
v  iscsi_ddms       -             DISABLED     ACTIVE    901775360 SELECT    -        fsgen
pl iscsi_ddms-01    iscsi_ddms    DISABLED     STALE     901775360 CONCAT    -        RW
……
(3) In this scenario there are no other clean plexes in the volume, so use the following command to make the plex DISABLED and CLEAN:
# vxmend -g iscsiodydg fix clean iscsi_ddms-01
# vxprint -g iscsiodydg -hvt iscsi_ddms
……
v  iscsi_ddms       -             DISABLED     ACTIVE    901775360 SELECT    -        fsgen
pl iscsi_ddms-01    iscsi_ddms    DISABLED     CLEAN     901775360 CONCAT    -        RW
……
Note: If there are other ACTIVE or CLEAN plexes in the volume, use the following command to reattach the plex to the volume:
# vxplex [-g diskgroup] att plex volume
If the volume is already ENABLED, resynchronization of the plex is started immediately.
(4) Use the following command to start the DISABLED volume.
# vxvol -g iscsiodydg start iscsi_ddms
If you want any resynchronization of the plexes to run in the background, use the -o bg option.
# vxprint -g iscsiodydg -hvt iscsi_ddms
……
v  iscsi_ddms       -             ENABLED      ACTIVE    901775360 SELECT    -        fsgen
pl iscsi_ddms-01    iscsi_ddms    ENABLED      ACTIVE    901775360 CONCAT    -        RW
……
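The four commands of step 6 always follow the same pattern, so they can be generated for any disk group, volume, and plex. This is only a sketch for the case described here (a single plex in the RECOVER state with no other clean plex); the function name is hypothetical, and it only builds the command strings without running them.

```python
def recover_plex_commands(diskgroup, volume, plex, background=False):
    """Build the command sequence used above for a sole RECOVER plex.

    The commands are returned as strings so they can be reviewed before
    anything is executed. Not suitable when other ACTIVE or CLEAN plexes
    exist; in that case the plex should be reattached instead.
    """
    bg_option = "-o bg " if background else ""
    return [
        "vxmend -g %s -o force off %s" % (diskgroup, plex),  # -> OFFLINE
        "vxmend -g %s on %s" % (diskgroup, plex),            # -> STALE
        "vxmend -g %s fix clean %s" % (diskgroup, plex),     # -> CLEAN
        "vxvol -g %s %sstart %s" % (diskgroup, bg_option, volume),
    ]


if __name__ == "__main__":
    for cmd in recover_plex_commands("iscsiodydg", "iscsi_ddms", "iscsi_ddms-01"):
        print("# " + cmd)
```

Checking the plex state with vxprint after each command, as done above, is still worthwhile before moving on to the next one.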
7. Remount the file system.
# mount -F vxfs /dev/vx/dsk/iscsiodydg/iscsi_ddms /ddms_db
UX:vxfs mount: ERROR: V-3-21268: /dev/vx/dsk/iscsiodydg/iscsi_ddms is corrupted. needs checking
# fsck -F vxfs /dev/vx/dsk/iscsiodydg/iscsi_ddms
log replay in progress
replay complete - marking super-block as CLEAN
# mount -F vxfs /dev/vx/dsk/iscsiodydg/iscsi_ddms /ddms_db
8. Use DBV to verify the standby DB datafiles.
The database must be at least in the MOUNT state to perform this operation.
Use the following SQL statement to generate the DBV commands:
select 'dbv file='||name||' blocksize='||block_size||' logfile='||substr(name,instr(name,'/',-1,1)+1)||'.'||file#||'.log' from v$datafile;
Then run the generated commands and check the logs for failing records. In my case, each file needed about four minutes to verify.
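For reference, the string that the SQL statement builds for each datafile can be reproduced outside the database. The sketch below mirrors the same concatenation; the function name and the sample path are hypothetical.

```python
def dbv_command(name, block_size, file_no):
    """Build the same dbv command line as the SQL statement above.

    The log file name is the datafile's base name (everything after the
    last "/"), followed by the file number and ".log".
    """
    base = name[name.rfind("/") + 1:]  # whole name if there is no "/"
    return "dbv file=%s blocksize=%d logfile=%s.%d.log" % (
        name, block_size, base, file_no)


if __name__ == "__main__":
    # Hypothetical datafile; real values come from v$datafile.
    print(dbv_command("/u01/oradata/system01.dbf", 8192, 1))
```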
REFERENCE:
Script To Run DBV On All Datafiles Of the Database [ID 352907.1]
Veritas Volume Manager Troubleshooting Guide (5.0)

Posted in VCS

RMAN-08137

BACKGROUND:
Several days ago, I stopped a standby database and then set log_archive_dest_state to “DEFER”. However, an error occurred when the archive logs were being deleted after an otherwise successful backup. The database version is 10.2.0.4. The backup script is as follows:
$RMAN target $TARGET_CONNECT_STR catalog rman10g/rman10g msglog $RMAN_LOG_FILE append << EOF
RUN {
SQL 'alter system archive log current';
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
BACKUP
filesperset 20
FORMAT 'al_%s_%p_%t'
ARCHIVELOG ALL DELETE INPUT;
RELEASE CHANNEL ch00;
resync catalog;
}
EOF
The error was reported as follows:
RMAN-08137: WARNING: archive log not deleted as it is still needed
archive log filename=/yms_arc/YMS_1_10710_713119317.arc thread=1 sequence=10710
OPERATION:
$ rman target / catalog rman10g/rman10g
RMAN> crosscheck archivelog all;
RMAN> resync catalog;
RMAN> delete noprompt archivelog until time = 'sysdate-1' backed up 1 times to device type sbt;
RMAN> resync catalog;
REFERENCE:
Rman-08137: Warning: Archive Log Not Deleted As It Is Still Needed Rman-08137 [ID 374421.1]
RMAN-08137 when Deleting Archivelog Files [ID 373066.1]

Posted in Oracle

Streams test

BACKGROUND:
Source DB: sourcedb, archivelog mode, Oracle 10.2.0.4
Destination DB: destdb, Oracle 10.2.0.4
OPERATION:
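1. Check the initialization parameters in both databases.
(1) GLOBAL_NAMES = FALSE (Oracle's Streams documentation recommends TRUE, in which case each database link must be named after the global name of the database it connects to)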
(2) JOB_QUEUE_PROCESSES >=2
(3) STREAMS_POOL_SIZE >0 or SGA_TARGET >0
(4) PARALLEL_MAX_SERVERS >=8
(5) OPEN_LINKS >=4
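The numeric checks in the list above are easy to automate once the values have been fetched from each database. This is a minimal sketch: the function name is hypothetical, it assumes the parameters arrive as a plain dict of name to value (for example, copied from v$parameter), and it deliberately leaves out the GLOBAL_NAMES setting.

```python
# Minimum values taken from the list above.
MINIMUMS = {
    "job_queue_processes": 2,
    "parallel_max_servers": 8,
    "open_links": 4,
}


def check_streams_parameters(params):
    """Return a list of human-readable problems; empty means all checks pass.

    params maps lowercase parameter names to values (int or numeric string).
    Missing parameters are treated as 0.
    """
    problems = []
    for name, minimum in MINIMUMS.items():
        value = int(params.get(name, 0))
        if value < minimum:
            problems.append("%s is %d, expected >= %d" % (name, value, minimum))
    pool = int(params.get("streams_pool_size", 0))
    sga = int(params.get("sga_target", 0))
    if pool <= 0 and sga <= 0:
        problems.append("streams_pool_size or sga_target must be greater than 0")
    return problems


if __name__ == "__main__":
    example = {"job_queue_processes": "10", "parallel_max_servers": "4",
               "open_links": "4", "streams_pool_size": "0", "sga_target": "0"}
    for problem in check_streams_parameters(example):
        print(problem)
```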
2. Create the Streams administrator schema and grant the proper privileges in both the source DB and the destination DB.
The administrator account can differ between the two databases. For example, it is named “TESTS” in the source DB and “TESTD” in the destination DB (“TESTS/TESTD” below stands for whichever account belongs to that database). Grant the following privileges:
SQL> grant connect,resource,dba to TESTS/TESTD;
SQL> grant read,write on directory strm_dmp to TESTS/TESTD; -- needed to instantiate database objects with Data Pump
SQL> begin
dbms_streams_auth.grant_admin_privilege(grantee=>'TESTS/TESTD',grant_privileges=>true);
end;
/
3. Create the DB links in both databases.
Connect as TESTS/TESTD.
In source DB: SQL> create database link destdb connect to testd identified by testdpw using 'destdb';
In destination DB: SQL> create database link sourcedb connect to tests identified by testspw using 'sourcedb';
4. Create the queues in both databases.
In source DB:
SQL> conn tests/testspw;
SQL> begin
dbms_streams_adm.set_up_queue(queue_table=>'tests.sourcedb_queue_table', queue_name=>'tests.sourcedb_queue');
end;
/
In destination DB:
SQL> conn testd/testdpw;
SQL> begin
dbms_streams_adm.set_up_queue(queue_table=>'testd.destdb_queue_table', queue_name=>'testd.destdb_queue');
end;
/
5. Add supplemental logging for the replicated table in the source DB.
SQL> alter table oraaud.lot_info add supplemental log data (all) columns;
6. Create the capture process and the propagation process in the source DB.
SQL> begin
dbms_streams_adm.add_table_rules(table_name=>'ORAAUD.LOT_INFO', streams_type=>'capture', streams_name=>'capture_info', queue_name=>'tests.sourcedb_queue', include_dml=>true, include_ddl=>true, inclusion_rule=>true);
end;
/
SQL> begin
dbms_streams_adm.add_table_propagation_rules(table_name=>'ORAAUD.LOT_INFO', streams_name=>'sourcedb_to_destdb', source_queue_name=>'tests.sourcedb_queue', destination_queue_name=>'testd.destdb_queue@destdb', include_dml=>true, include_ddl=>true, inclusion_rule=>true, queue_to_queue=>true);
end;
/
7. Create the apply process in the destination DB.
SQL> begin
dbms_streams_adm.add_table_rules(table_name=>'ORAAUD.LOT_INFO', streams_type=>'apply', streams_name=>'apply_info', queue_name=>'testd.destdb_queue', include_dml=>true, include_ddl=>true, inclusion_rule=>true);
end;
/
8. Instantiate the database objects using Data Pump.
In source DB:
SQL> col get_system_change_number form 9999999999999999999
SQL> select dbms_flashback.get_system_change_number from dual;
GET_SYSTEM_CHANGE_NUMBER

Posted in Oracle

Send messages on Linux via AT

Connect the modem to a serial (COM) port, then use the “chat” command to send messages.
$answer = system "/usr/sbin/chat -v -f $message_file < /dev/ttyS0 > /dev/ttyS0";
It returns 0 on success.
The format of $message_file is as follows:
CDMA modem:
TIMEOUT 20
'' 'AT+CMGF=1'
OK 'AT+CMGS=1,189********,,0,messages'
OK ''

New CDMA modem:
TIMEOUT 20
'' 'AT+CMGS=189********'
'' 'messages^Z'
'' ''

GSM modem:
TIMEOUT 20
OK 'AT+CMGS=189********'
'>' 'messages^Z'
OK ''
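When the number and text change with every message, the chat script has to be generated each time. Below is a small sketch that writes out the GSM variant exactly as listed above; the function name and the sample number are hypothetical, and single quotes are rejected because they would break the quoting in the script.

```python
def gsm_chat_script(number, message, timeout=20):
    """Return the text of a chat script in the GSM layout shown above.

    "^Z" is written literally; chat turns it into Ctrl-Z, which ends the
    message text in AT text mode.
    """
    if "'" in number or "'" in message:
        raise ValueError("single quotes are not supported in this sketch")
    lines = [
        "TIMEOUT %d" % timeout,
        "OK 'AT+CMGS=%s'" % number,
        "'>' '%s^Z'" % message,
        "OK ''",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical recipient number and message text.
    print(gsm_chat_script("18900000000", "disk failed on host1"), end="")
```

The returned text can be written to the file passed to chat with -f.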

Posted in linux

Sendmail configuration error in Linux

I have been very busy lately because my partner's knee was injured while playing basketball.
When I configured sendmail on a Linux system, I ran into two kinds of errors.
The configuration files /etc/mail/mailertable, /etc/hosts, and /etc/mail/local-host-names were correctly configured.
(1) There were two domains. Mail to one domain was delivered successfully, but mail to the other failed because the mail gateway server rejected it.
I tested this with the following commands:
# telnet 192.168.*.* 25
helo domain_name
mail from:addresser
OPTION: Add the server's name to the DNS server for those domains.
(2) I tested the connection between the server and the mail host with the following command:
# sendmail -v recipients < /dev/null
domain_name: Name server timeout
recipients... Transient parse error -- message queued for future delivery
recipients... queued
OPTION: Check the file /etc/resolv.conf and add the DNS servers to it, for example:
nameserver 192.168.11.100
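A quick way to confirm that the resolver configuration actually lists a DNS server is to read the nameserver lines back. The following sketch takes the file contents as text; the function name is hypothetical.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from "nameserver" lines, in file order.

    Lines starting with "#" or ";" are treated as comments and skipped.
    """
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    sample = "# resolver configuration\nnameserver 192.168.11.100\n"
    print(nameservers(sample))
```

An empty result matches the "Name server timeout" situation described in (2): sendmail has no DNS server to ask.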

Posted in linux

ORA-01192 when creating a controlfile

SCENARIO:
I used the command “backup as copy database” to duplicate a database, but when I recreated the controlfile the following errors occurred:
CREATE CONTROLFILE REUSE DATABASE "MES12DB" NORESETLOGS ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01192: must have at least one enabled thread
SOLUTION:
Change “NORESETLOGS” to “RESETLOGS”.

Posted in Oracle

Invalid triggers

I found that some triggers had become invalid, but after I recompiled them their status changed back to valid. Some general rules for the invalidation of schema objects are as follows:
1. If you change the definition of a schema object, dependent objects are cascade invalidated.
2. If you revoke privileges from a schema object, dependent objects are cascade invalidated.
3. There are a small number of situations where altering the definition of a schema object does not invalidate dependent objects.
Oracle recompiles an invalid trigger the next time it is accessed, and the trigger becomes valid again if there are no compile errors.
The invalid triggers looked like this:
OWNER OBJECT_NAME STATUS LAST_DDL_TIME

Posted in Oracle

festival

During the Mid-Autumn Festival, I went to Huangshan with several university classmates. Unfortunately, I caught a cold when I got home. I am healthy now.

Posted in Gossip