Disk Mirroring using Solaris Volume Manager

According to Wikipedia, Solaris Volume Manager (SVM; formerly known as Online: DiskSuite, and later Solstice DiskSuite) is a software package for creating, modifying, and controlling RAID-0 (concatenation and stripe) volumes, RAID-1 (mirror) volumes, RAID 0+1 volumes, RAID 1+0 volumes, RAID-5 volumes, and soft partitions.

Version 1.0 of Online: DiskSuite was released as an add-on product for SunOS in late 1991; the product has undergone significant enhancements over the years. SVM has been included as a standard part of the Solaris Operating System since Solaris 8 was released in February 2000.

SVM is similar in functionality to later software volume managers such as FreeBSD vinum, allowing metadevices (virtual disks) to be concatenated, striped or mirrored together from physical ones. It also supports soft partitioning, dynamic hot spares, and growing metadevices. The mirrors support dirty region logging (DRL, called resync regions in DiskSuite) and logging support for RAID-5.

The ZFS file system, added in the Solaris 10 6/06 release, has its own integrated volume management capabilities, but SVM continues to be included with Solaris for use with other file systems.

Example disk mirroring using SVM:

DISK:
c0t0d0
c0t1d0

# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2

{If you get this error:

fmthard: Partition 2 specifies the full disk and is not equal full size of disk

then first run format on the second disk so that it has a Solaris label:

bash-3.00# format
Searching for disks…done

select your 2nd disk

format> p
WARNING - This disk may be in use by an application that has
modified the fdisk table. Ensure that this disk is
not currently in use before proceeding to use fdisk.
format> fdisk
No fdisk table exists. The default partition for the disk is:

a 100% "SOLARIS System" partition

Type "y" to accept the default partition, otherwise type "n" to edit the
partition table.
y
format> label
Ready to label disk, continue? yes

{Run the metadb command to create replicas of the metadevice state database (-c 3 puts three replicas on each slice):

# metadb -a -f -c 3 c0t0d0s7 c0t1d0s7

{Then run metainit to create a metadevice for each slice:

# metainit -f d11 1 1 c0t0d0s0
# metainit d12 1 1 c0t1d0s0
# metainit d10 -m d11
# metaroot d10

# metainit -f d21 1 1 c0t0d0s1
# metainit d22 1 1 c0t1d0s1
# metainit d20 -m d21
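
The metainit steps above follow the same pattern for every slice. A minimal sketch, assuming the naming scheme used in this example (mirror dN0, submirrors dN1 and dN2) and a hypothetical helper name print_mirror_cmds, prints the commands for review instead of running them:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the SVM commands for mirroring one slice.
# Arguments: mirror index N, slice on the first disk, slice on the second disk.
# Naming assumption taken from the example above: mirror dN0, submirrors dN1/dN2.
print_mirror_cmds() {
    n=$1; primary=$2; secondary=$3
    echo "metainit -f d${n}1 1 1 ${primary}"   # -f: the slice is in use (mounted)
    echo "metainit d${n}2 1 1 ${secondary}"
    echo "metainit d${n}0 -m d${n}1"           # one-way mirror, second half attached later
}

print_mirror_cmds 1 c0t0d0s0 c0t1d0s0
print_mirror_cmds 2 c0t0d0s1 c0t1d0s1
```

Printing the commands first makes it easy to check the device names before anything is changed on disk. The metaroot step for the root mirror is not generated here and still has to be run by hand.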

{Edit /etc/vfstab:
/dev/md/dsk/d20 -       -       swap    -       no      -
/dev/md/dsk/d10 /dev/md/rdsk/d10        /       ufs     1       no      logging

# reboot (for x86)

# init 0 (for SPARC), then from the ok prompt:
{0} ok setenv boot-device disk0 disk1
{0} ok boot

{After Solaris has booted, attach the second submirrors:

# metattach d10 d12
# metattach d20 d22

{Check the synchronization progress:
# metastat | grep %

{To monitor the metastat result continuously, run this command:

# while true; do metastat | grep %; sleep 20; done
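
The loop above never exits on its own. As a rough sketch, the same idea can be written so that it stops once no percentage line is left. This assumes, as the grep above does, that metastat prints a % sign only while a submirror is resyncing. STATUS_CMD, INTERVAL and wait_for_resync are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# Sketch: poll until no line of the status output contains a '%' sign.
# Assumption (same as the grep above): a percentage appears only during a resync.
# STATUS_CMD and INTERVAL are hypothetical knobs; defaults match the loop above.
STATUS_CMD=${STATUS_CMD:-metastat}
INTERVAL=${INTERVAL:-20}

wait_for_resync() {
    # grep prints the progress lines and returns non-zero once none are left
    while $STATUS_CMD | grep '%'; do
        sleep "$INTERVAL"
    done
    echo "no resync in progress"
}
```

Call wait_for_resync after the metattach commands. Note that the loop also ends if the status command itself fails, so the final message is only meaningful when metastat runs successfully.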

As the last step, make the second disk bootable (installgrub on x86, installboot on SPARC). Otherwise you will not be able to boot from the second disk once the first disk has failed.

{For x86 machines, set the active partition for the disks:
bash-3.00# fdisk -b /usr/lib/fs/ufs/mboot /dev/rdsk/c?t?d?p?

{If the root partition is mirrored, make the second disk bootable:
===> For x86 machines
bash-3.00# /sbin/installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c?t?d?s?

===> And for SPARC machines
bash-3.00# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c?t?d?s?

——

bash-3.00# installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0
Updating master boot sector destroys existing boot managers (if any).
continue (y/n)?y
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 265 sectors starting at 50 (abs 16115)
stage1 written to master boot sector
bash-3.00#

Reference:
Solaris Volume Manager Administration Guide

How to Configure Tape Drive on Solaris for Veritas Netbackup

Understanding the SCSI Passthru Drivers

NetBackup Media Manager provides its own driver for communicating with SCSI-controlled robotic peripherals.

This driver is called the SCSA (Generic SCSI passthru driver), also referred to as the sg driver.

To manage the sg driver

Perform the following steps as the root user.

1. Determine if an sg driver is loaded by using the following command:

/usr/sbin/modinfo | grep sg

141 fc580000 2d8c 116 1 sg (SCSA Generic Revision: 3.4d)

153 fc7fa000 1684 49 1 msgsys (System V message facility)

2. Remove the existing driver:

/usr/sbin/rem_drv sg

/usr/bin/rm -f /kernel/drv/sg.conf
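
As the sample output in step 1 shows, grep sg also matches unrelated modules such as msgsys. A small sketch of a stricter check follows. It assumes the module name is the sixth whitespace-separated field, as in that sample, and sg_loaded is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report only the module whose name is exactly "sg".
# Assumption: the module name is the sixth whitespace-separated field,
# as in the sample modinfo output in step 1.
sg_loaded() {
    # Print matching lines; exit status 0 only if at least one was found.
    awk '$6 == "sg" { found = 1; print } END { exit !found }'
}

# Intended use on a Solaris host:
#   /usr/sbin/modinfo | sg_loaded && echo "sg driver is loaded"
```

On older Solaris releases the default awk is limited, so nawk may be the safer choice there.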

Install SG Driver

To install the driver run the following command:

/usr/openv/volmgr/bin/driver/sg.install

Once the driver has been installed, it is not necessary to reboot the system or run the sg.install command during or after each system boot.

Configuring SG and ST Drivers

This procedure contains instructions for configuring the sg driver for SCSI targets 0 thru 6 and 8 thru 15 for fast or wide adapter cards.

In this procedure, you execute sg.build to add these targets to the st.conf, sg.conf and sg.links files. Adjust the -mt and -ml parameters to create the range of targets and LUNs required by your configuration.

To configure drivers

Execute the sg.build script to add target IDs 0-6, 8-15, and LUNs 0-1 to the following files:

/usr/openv/volmgr/bin/driver/st.conf

/usr/openv/volmgr/bin/driver/sg.conf

/usr/openv/volmgr/bin/driver/sg.links

/usr/openv/volmgr/bin/sg.build all -mt 15 -ml 1

The -mt 15 parameter specifies the maximum target ID that is in use on any SCSI bus (or bound to a fibre channel device). The -ml 1 parameter specifies the maximum target LUN that is in use on any SCSI bus (or by a fibre channel device).

The file /usr/openv/volmgr/bin/driver/st.conf is used to replace the following seven entries in the /kernel/drv/st.conf file:

name="st" class="scsi" target=0 lun=0;

name="st" class="scsi" target=1 lun=0;

name="st" class="scsi" target=2 lun=0;

name="st" class="scsi" target=3 lun=0;

name="st" class="scsi" target=4 lun=0;

name="st" class="scsi" target=5 lun=0;

name="st" class="scsi" target=6 lun=0;

Edit the /kernel/drv/st.conf file.

Place a # in column one of each line of the seven default entries.

The temporary file ./st.conf contains the entries that you need to insert into /kernel/drv/st.conf.
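
Commenting out the seven default entries can also be scripted. The sketch below assumes the entries appear exactly as listed above (same spacing, plain double quotes). It works as a filter, so the real file is never edited in place, and comment_default_st is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: prefix the default st entries for targets 0-6, lun 0 with '#'.
# Reads st.conf text on stdin and writes the edited text to stdout, so
# /kernel/drv/st.conf itself is not modified by this function.
# Assumption: the entries use exactly the spacing and quoting shown above.
comment_default_st() {
    sed 's/^name="st" class="scsi" target=[0-6] lun=0;/#&/'
}

# Intended use:
#   comment_default_st < /kernel/drv/st.conf > /tmp/st.conf.new
```

Review the generated copy, and keep a backup of the original, before moving the new file into place.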

Reboot the system with the reconfigure option (boot -r or reboot -- -r).

Verify that the system created device nodes for all the tape devices using the following command: ls -l /dev/rmt/*cbn

Install the new sg driver configuration.

/usr/bin/rm -f /kernel/drv/sg.conf

/usr/openv/volmgr/bin/driver/sg.install

When you run the sg.install command, sg.conf is copied to /kernel/drv/sg.conf and the entries in sg.links are added to /etc/devlink.tab.

Verify that the sg driver found all the robots and tape drives.

The script /usr/openv/volmgr/bin/sg.build adds the proper entries to the sg.links and sg.conf files. Before running the script, make sure that all devices are powered on and connected to the HBA.

An example of the additional entries in /usr/openv/volmgr/bin/driver/sg.conf follows:

name="sg" parent="fp" target=0 lun=0 fc-port-wwn="22000090a50001c8";

name="sg" parent="fp" target=0 lun=1 fc-port-wwn="22000090a50001c8";

An example of the additional entries in /usr/openv/volmgr/bin/driver/sg.links follows:

type=ddi_pseudo;name=sg;addr=w22000090a50001c8,0; sg/c\N0t\A1l0

type=ddi_pseudo;name=sg;addr=w22000090a50001c8,1; sg/c\N0t\A1l1

Preventing Possible System Problems

VERITAS recommends adding the following forceload statements to the /etc/system file. These statements prevent the st and sg drivers from being unloaded from memory:

forceload: drv/st

forceload: drv/sg

 

Other statements may be necessary for various fibre channel drivers, such as the following example for JNI drivers. This statement prevents the named driver from being unloaded from memory.

forceload: drv/fcaw

SSO Configurations With More Than 16 Tape Drives

When the number of tape devices that are configured approaches 16, changes in tape device status may not be visible to all media servers in a Shared Storage Option (SSO) configuration. This is because the default maximum size of IPC message queues may not be large enough.

VERITAS recommends adding the following statements to the /etc/system file. These statements increase the maximum number of messages that can be created, and the number of bytes per queue. A reboot is necessary for the changes to take effect.

set msgsys:msginfo_msgtql=512

set msgsys:msginfo_msgmnb=65536
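
When these /etc/system statements are added from a script, it helps to avoid writing the same line twice. A minimal sketch follows. ensure_line is a hypothetical helper name, and it assumes a grep that supports -F and -x (on Solaris that may mean /usr/xpg4/bin/grep):

```shell
#!/bin/sh
# Sketch: append a line to a file only if that exact line is not already there.
# ensure_line is a hypothetical helper; the target file is passed explicitly.
# Assumption: grep supports -F (fixed string) and -x (whole-line match).
ensure_line() {
    file=$1; line=$2
    grep -F -x -- "$line" "$file" >/dev/null 2>&1 || printf '%s\n' "$line" >> "$file"
}

# Intended use (as root, after backing up /etc/system):
#   ensure_line /etc/system 'forceload: drv/st'
#   ensure_line /etc/system 'set msgsys:msginfo_msgtql=512'
```

Because the check is an exact whole-line match, a commented-out or differently spaced copy of the statement does not count as present.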

Checking and Repairing File system with fsck

fsck is a Unix utility for checking and repairing file system inconsistencies. A file system can become inconsistent for several reasons; the most common is an abnormal shutdown caused by a hardware failure, a power failure, or switching off the system without a proper shutdown. In these cases the superblock is not updated and holds mismatched information about the file system's data blocks, free blocks, and inodes.

Modes of operation:

fsck operates in two modes, interactive and non-interactive:

interactive: fsck examines the file system and stops at each error it finds, describes the problem, and asks for a user response, usually whether to correct the problem or to continue without changing the file system.

non-interactive: fsck tries to repair all the problems it finds without stopping for a user response. This is useful when a file system has a large number of inconsistencies, but it has the disadvantage that some useful files which are detected as corrupt may be removed.

If a file system is found to have problems at boot time, fsck is run non-interactively and all errors that are considered safe to correct are corrected. If the file system still has problems, the system boots into single-user mode and asks the user to run fsck manually to correct them.

Running fsck:

fsck should always be run in single-user mode, which ensures a proper repair of the file system. If it is run on a busy system where the file system is changing constantly, fsck may see the changes as inconsistencies and may corrupt the file system.

If the system cannot be brought into single-user mode, fsck should be run on the partitions other than root and usr after unmounting them. The root and usr partitions cannot be unmounted. If the system fails to come up because of root or usr file system corruption, the system can be booted from CD and the root and usr partitions can be repaired using fsck.

Command syntax:

fsck [ -F fstype] [-V] [-yY] [-o options] special

-F fstype: type of the file system to be repaired (ufs, vxfs, etc.)

-V: verify the command line syntax but do not run the command

-y or -Y: run the command in non-interactive mode and repair all errors encountered without waiting for a user response

-o options: three options can be specified with the -o flag:

b=n: use block n as the superblock, where n is the number of an alternate superblock; used if the primary superblock of the file system is corrupted

p: make only the safe repairs, as is done during the boot process

f: force the file system check regardless of the clean flag

special: block or character device name of the file system to be checked or repaired, for example /dev/rdsk/c0t3d0s4. The character device should be used for consistency checks and repairs.
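
Since fsck should be given the character (raw) device, a small helper can derive it from a block device path. This is a sketch only. It assumes the standard Solaris naming in which /dev/dsk corresponds to /dev/rdsk (and /dev/md/dsk to /dev/md/rdsk), and to_raw_device is a hypothetical name:

```shell
#!/bin/sh
# Sketch: map a block device path to its character (raw) counterpart.
# Assumption: standard Solaris naming, /dev/dsk/... <-> /dev/rdsk/... and
# /dev/md/dsk/... <-> /dev/md/rdsk/...; any other path is passed through as-is.
to_raw_device() {
    printf '%s\n' "$1" |
        sed -e 's|^/dev/dsk/|/dev/rdsk/|' -e 's|^/dev/md/dsk/|/dev/md/rdsk/|'
}

to_raw_device /dev/dsk/c0t3d0s4
```

A path that is already a raw device is returned unchanged, so the helper is safe to apply to either form.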

Phases:

fsck checks the file system in a series of 5 phases and checks a specific aspect of the file system in each phase.

** Phase 1 - Check Blocks and Sizes

** Phase 2 - Check Pathnames

** Phase 3 - Check Connectivity

** Phase 4 - Check Reference Counts

** Phase 5 - Check Cylinder Groups

Error messages and corrective action:

1. Corrupted superblock: fsck fails to run

If the superblock is corrupted, the file system can still be repaired using an alternate superblock. Alternate superblocks are created when a new file system is made.

The first alternate superblock is at block 32; the other alternate superblock numbers can be found with the following command:

newfs -N /dev/rdsk/c0t0d0s6

For example, to run fsck using the first alternate superblock, use the following command:

fsck -F ufs -o b=32 /dev/rdsk/c0t0d0s6
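
If the first alternate superblock is also damaged, the remaining candidates can be pulled out of the newfs -N output. The sketch below assumes the backup locations are printed as a comma-separated list after a line containing "super-block backups"; the exact output may differ between releases. list_backup_superblocks is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: list alternate superblock locations, one per line, from the text
# that newfs -N prints on stdin.
# Assumption: the numbers follow a line containing "super-block backups"
# as a comma-separated list, possibly spread over several lines.
list_backup_superblocks() {
    awk '
        /super-block backups/ { in_list = 1; next }
        in_list {
            gsub(/,/, " ")
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }'
}

# Intended use:
#   newfs -N /dev/rdsk/c0t0d0s6 | list_backup_superblocks
```

Each number printed can be tried in turn as the n in fsck -F ufs -o b=n. On older Solaris releases, nawk may be needed instead of awk.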

2. Link count adjustment: fsck finds a mismatch between the link count in a directory inode and the actual number of directory links, and in interactive operation prompts for an adjustment. Link count adjustments are considered a safe operation and should be made by answering 'y' to the adjust? prompt during fsck.

3. Free block count salvage: during fsck, the number of free blocks listed in the superblock does not match the actual count of unallocated blocks. fsck reports the mismatch and asks whether to salvage the free block count to synchronize the superblock count. This error can be corrected without any potential problem to the file system or its files.

4. Unreferenced file reconnection: while checking connectivity, fsck finds inodes that are allocated but not referenced, that is, not attached to any directory. Answering y to the reconnect prompt links these files into the lost+found directory with their inode numbers as their names.

To learn more about the files in lost+found, use the file command to see the type of each file; the files can then be opened in their applications or in a text editor to inspect their contents. If a file is found to be intact, it can be used after copying it to another directory and renaming it.

Next steps:

This article briefly covered some general aspects of fsck. A detailed document covering most of the error messages and their explanations is coming up, so watch for it if you are looking for more details.
