How To replace M4000/M5000 XSCF board

The XSCF (eXtended System Control Facility) unit is the service processor for M-Series servers.
The XSCF unit is a cold-replacement component: the entire server must be powered off and the power cords disconnected before the XSCF unit can be replaced. Run the “showhardconf” or “showstatus” command to confirm that the XSCF is faulted.

XSCF> showhardconf


*   XSCFU Status:Degraded,Active; Ver:0101h; Serial:BFxxxxxxx  ;
+ FRU-Part-Number:CF00541-0481 04   /541-0481-04       ;

Some people have asked me how to back up the XSCF configuration before replacing the XSCF board. They assume a backup is needed first because an M4000/M5000 server has only one XSCF board. In fact, no backup is needed, because a copy of the XSCF configuration is kept on the Operator Panel. The XSCF and the Operator Panel synchronize their data each time the XSCF boots or the XSCF configuration changes. That is why the XSCF and the Operator Panel must not be replaced at the same time.

Okay, if you are ready to replace the XSCF board, here are the instructions:

[Shut down the OS, power off the server, and unplug the power cords and the XSCF Ethernet cables.

[Using proper ESD grounding technique and an anti-static mat, replace the XSCF board:

*M4000 XSCF board location:

*M5000 XSCF board location:

# Plug in all cables, then power on the server and wait until the new XSCF board starts up. It will reboot about 2-3 times. During startup you will see messages showing the XSCF and OPNL synchronizing their data:

…..
initialize XSCF common database (OWN)  --  complete
synchronize setup data (XSCF -> OPNL)  --  complete
initialize XSCF common database (ACTIVE)  --  complete
wait for database synchronization  --  complete
execute S00clis_all  --  complete
…..

[When the boot process has finished, try to log in. If you see the error message below:

XCP version of Panel EEPROM and XSCF FMEM mismatched,
Panel EEPROM=1090, XSCF FMEM=1100

Then you need to upgrade the XSCF firmware. Download the latest firmware from MOS (My Oracle Support), then perform the firmware upgrade.

[XSCF FIRMWARE UPGRADE:

*VIA FTP:

XSCF> getflashimage -l        >CHECK CURRENT FIRMWARE
XSCF> getflashimage -u AZIZ ftp://10.32.17.61/FFXCP1112.tar.gz    >> aziz is username, 10.32.17.61 is ftp server on my laptop
Password: *******
0MB received
1MB received
2MB received
3MB received
4MB received
5MB received
6MB received
7MB received
8MB received

Download successful: 42660 Kbytes in 50 secs (987.298 Kbytes/sec)
Checking file…
MD5: 73ca6370dc6c636f2e3845b66caa203a
XSCF> getflashimage -l

XSCF> flashupdate -c check -m xcp -s 1112
XCP update is possible with domains up

XSCF> flashupdate -c update -m xcp -s 1112
The XSCF will be reset. Continue? [y|n] :y
Checking the XCP image file, please wait a minute
XCP update is started (XCP version=1112:last version=1081)
OpenBoot PROM update is started (OpenBoot PROM version=02180000)

*VIA USB:
– check the version, then update the firmware
XSCF> version -c xcp -v

XSCF> getflashimage file:///media/usb_msd/FFXCP1112.tar.gz

Note that the firmware file name differs between M-Series models:
getflashimage file:///media/usb_msd/IKXCP1112.tar.gz    >>for M3000
getflashimage file:///media/usb_msd/FFXCP1112.tar.gz    >>for M4000/5000
getflashimage file:///media/usb_msd/DCXCP1112.tar.gz    >>for M8000/M9000
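
The prefix-per-model rule above can be captured in a small helper. This is only a sketch meant for an admin workstation (not the XSCF shell); the function name xcp_image_name is made up for illustration, and the prefixes are simply the three listed above.

```shell
# Map an M-Series model to its XCP image file name, using the
# prefixes from the list above (IK, FF, DC). Hypothetical helper.
xcp_image_name() {
    model=$1
    version=$2
    case "$model" in
        M3000)        prefix=IK ;;
        M4000|M5000)  prefix=FF ;;
        M8000|M9000)  prefix=DC ;;
        *) echo "unknown model: $model" >&2; return 1 ;;
    esac
    echo "${prefix}XCP${version}.tar.gz"
}

# Example: file name to pass to getflashimage for an M4000 and XCP 1112
xcp_image_name M4000 1112
```

Checking the name this way before copying the file to the USB stick avoids loading an image meant for a different model.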

XSCF> flashupdate -c check -m xcp -s 1112
XCP update is possible with domains up

XSCF> flashupdate -c update -m xcp -s 1112

XSCF> version -c xcp -v
XCP0 (Reserve): 1110 <<XCP0 will take few minutes to finish update
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0000
XCP1 (Current): 1112 <<updated already
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0002
OpenBoot PROM BACKUP

XSCF> version -c cmu -v

[Once the firmware upgrade is finished, or if there was no firmware issue, check the device status again with the “showhardconf” and “showstatus” commands.

[Continue by powering on the domain:

XSCF> poweron -d0
DomainIDs to power on:00
Continue? [y|n] :y
Poweron canceled due to invalid system date and time.
XSCF>

Wait, did you see the error message above? Yes, the domain cannot be powered on because the system date and time are invalid.

#Set the new date and time. Example for 24 Oct 2012 at 10:23:

XSCF> setdate -u -s 102410232012.00
Wed Oct 24 10:23:00 UTC 2012
The XSCF will be reset. Continue? [y|n] :y
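
Judging from the example above, the argument to “setdate -u -s” is laid out as MMDDhhmmCCYY.ss (102410232012.00 is Oct 24, 10:23:00, 2012). As a rough sketch, a workstation with a correct clock can produce that string for you; the function name xscf_setdate_arg is hypothetical, and the layout is inferred from the example rather than taken from the manual.

```shell
# Print the current UTC time in the MMDDhhmmCCYY.ss layout seen in the
# setdate example above. Run this on a workstation, not on the XSCF.
xscf_setdate_arg() {
    date -u '+%m%d%H%M%Y.%S'
}

# Print the command line to paste at the XSCF> prompt
echo "setdate -u -s $(xscf_setdate_arg)"
```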

#If you want to change the timezone, run the settimezone command. Example:

XSCF> settimezone -c settz -s Asia/Jakarta

#DONE. Now power on the domain again.


How to reset RSC password

If you have forgotten the RSC password on a V480, V880, V490, V890, or another legacy Sun machine, here is the procedure to reset it. Requirement: the SUNWrsc package.

If you don't have the package, download the latest version from My Oracle Support:

  1. login to support.oracle.com
  2. click on “Patches & Updates” in the top menu
  3. in the search window (located on the right) click “Product or Family (Advanced)”
  4. in the updated search window type “Sun Remote” in the “Product” box, then select “Sun Remote System Control”
  5. Click the “Release” box (which says “Select up to 10”), click “Sun Remote System Control” in that box, and then select the version “Sun Remote System Control 2.2.3”.
  6. In the new window you can now download RSC 2.2.3 (called p10264451_223.zip) by marking it and clicking “download”.

[Reset the RSC password:

Log in with root privileges, install the package, then run the rscadm command.

Syntax >> # /usr/platform/<platform>/rsc/rscadm userpassword <username>

[Example for V890:
# /usr/platform/SUNW,Sun-Fire-V890/rsc/rscadm userpassword admin
Password:
Re-enter Password:

You can also reset the whole configuration by running “rsc-config” command.

Update:

If rscadm is not available, download the RSC software from MOS:

RSC Software Download (steps to download the latest RSC software):
1. Login to MOS and select “Patches and Updates Tab”
2. In “Patch Search” on the Top right panel, Click on “Product or Family (Advanced Search)”
3. In the “Product Is” pull-down select “Sun Remote System Control”
4. In the next pull down “Release is” select the RSC version (2.2.2 or 2.2.3).
5. Select OS and click “Search” (will get a list with RSC releases & patches)
6. Select the desired RSC Release (packages) or patch
7. Click Download on the Right

The packages for Solaris 8 and 9 (and later) are both in the zip file. The zip file comes in two variants, 32-bit and 64-bit, but both have the same checksum, so there is no difference: p10264452_223_SOLARIS64.zip (p10264451_223_SOLARIS.zip)

Install the software as you would any package, with pkgadd.

The command syntax is the same:
# /usr/platform/`uname -i`/rsc/rscadm userpassword admin
[To reconfigure the card, run:
# /usr/platform/`uname -i`/rsc/rsc-config

[If you installed the software before and believe the card is configured, check the setup:
# /usr/platform/`uname -i`/rsc/rscadm show
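
Since every RSC tool lives under the same platform directory, the path can be built once instead of retyped. A minimal sketch, assuming the directory layout shown above; the helper name rsc_tool is made up for illustration.

```shell
# Build the absolute path of an RSC tool (rscadm, rsc-config).
# The platform defaults to the output of "uname -i" on the local host.
rsc_tool() {
    tool=$1
    platform=${2:-$(uname -i)}
    echo "/usr/platform/${platform}/rsc/${tool}"
}

# Example with an explicit platform name (the V890 from the example above)
rsc_tool rscadm "SUNW,Sun-Fire-V890"
```

On the server itself you would typically omit the second argument, for example: `"$(rsc_tool rscadm)" show`. Note the leading slash in the result; a relative path only works when the current directory is /.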

How to clear fmadm log or FMA faults log

Here is a step-by-step procedure for clearing FMA faults on most Oracle/Sun servers. It works well on Solaris 10:

Clear the fmadm log. Example:
----------------------------------
For each fault listed by ‘fmadm faulty’, run:
# fmadm repair <uuid>   (for example:)
# fmadm repair 568a9180-7308-4535-92e6-a7c17ef1bfef
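
When many faults are listed, the UUIDs can be pulled out of the ‘fmadm faulty’ output instead of being copied by hand. The following is only a sketch: it assumes the UUIDs appear as whitespace-separated tokens in the usual 8-4-4-4-12 hex form, and it needs a grep that supports ERE intervals (on Solaris that is probably /usr/xpg4/bin/grep). The function name extract_uuids and the sample input line are made up.

```shell
# Read text on stdin and print every unique UUID-shaped token, one per line.
extract_uuids() {
    tr -s ' \t' '\n\n' |
        grep -E '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$' |
        sort -u
}

# Dry run on a made-up input line: print the repair commands, do not run them.
# On a live system the input would be:  fmadm faulty | extract_uuids
echo "sample event 568a9180-7308-4535-92e6-a7c17ef1bfef reported" |
    extract_uuids |
    while read -r uuid; do
        echo "fmadm repair $uuid"
    done
```

Printing the commands first, and only then piping them to a shell, leaves a chance to review which faults are about to be marked as repaired.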

[Clear ereports and resource cache:
# cd /var/fm/fmd
# rm e* f* c*/eft/* r*/*

[Clearing out FMA files with no reboot needed:
svcadm disable -s svc:/system/fmd:default
cd /var/fm/fmd
find /var/fm/fmd -type f -exec ls {} \;
find /var/fm/fmd -type f -exec rm {} \;
svcadm enable svc:/system/fmd:default

[Reset the fmd serd modules:
# fmadm reset cpumem-diagnosis
# fmadm reset cpumem-retire
# fmadm reset eft
# fmadm reset io-retire

Configure Persistent Binding for Tape Devices

If you have many tape drives connected to your host/server, it is not always easy to match the drive order in the tape library with the drive order on the host.

Let's say the drive numbering on both the tape library and the host starts from 0 (zero). You mount a cartridge in tape drive number 6 on the tape library using a utility (for example SL Console or the move command). After the mount succeeds, you might expect the cartridge to be in drive number 6 on your host/server as well. You run “mt -f /dev/rmt/6cbn status” to check, but guess what? The cartridge is not there.

This is because the number in the cbn device name is picked automatically by devfsadm during enumeration of new devices. Every new tape logical unit number (LUN) found by devfsadm gets the next available number in “/dev/rmt”. So the cbn numbering does not necessarily match the drive numbering in the tape library.

Since the /dev/rmt name depends on the order in which devices appear in the device tree, it changes from host to host. For a given tape drive that is seen by two or more different hosts, the /dev/rmt link can be different on each of these hosts. Also, if the drive is replaced the links change unless the vendor provides a way to retain the port World-Wide Name (PWWN) of the drive.

So all we need to do is configure persistent binding for all tape devices. On Solaris, we only need to edit the “/etc/devlink.tab” file. First of all, list all tape drives in the current configuration:

Example:

# ls -ltr /dev/rmt/*cbn

lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/3cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81c,0:cbn
lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/2cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81b,0:cbn
lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/1cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81a,0:cbn
lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/0cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81d,0:cbn

So there are 4 tape drives, and the current configuration looks like this:
drive 0cbn:  w500108f00056a81d,0
drive 1cbn:  w500108f00056a81a,0
drive 2cbn:  w500108f00056a81b,0
drive 3cbn:  w500108f00056a81c,0
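
The summary above can be derived from the ls output mechanically. This is a sketch that assumes the link targets look like the ones in this example (an st@w<WWN>,<LUN>:cbn component at the end); the function name rmt_to_wwn is made up for illustration.

```shell
# Turn "ls -l /dev/rmt/*cbn" lines into "drive Ncbn:  wWWN,LUN" lines,
# sorted by drive number.
rmt_to_wwn() {
    sed -n 's|.*/dev/rmt/\([0-9][0-9]*\)cbn -> .*/st@\(w[0-9a-f]*,[0-9a-f]*\):cbn$|drive \1cbn:  \2|p' |
        sort -k2,2n
}

# Two of the sample lines from the listing above. On a live host you would
# run instead:  ls -l /dev/rmt/*cbn | rmt_to_wwn
printf '%s\n' \
  'lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/3cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81c,0:cbn' \
  'lrwxrwxrwx 1 root root 75 Mar 1 15:50 /dev/rmt/0cbn -> ../../devices/pci@8,600000/SUNW,emlxs@1,1/fp@0,0/st@w500108f00056a81d,0:cbn' |
    rmt_to_wwn
```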

On the tape library, the drive order is:

drive 0:  w500108f00056a81a,0
drive 1:  w500108f00056a81b,0
drive 2:  w500108f00056a81c,0
drive 3:  w500108f00056a81d,0

Now, how do we match the drive order between the tape library and the host? Okay, here we go:

1. Edit the “/etc/devlink.tab” file.

2. Add these lines (the two fields must be separated by a tab character, not spaces):

type=ddi_byte:tape;addr=w500108f00056a81a,0;    rmt/0\M0
type=ddi_byte:tape;addr=w500108f00056a81b,0;    rmt/1\M0
type=ddi_byte:tape;addr=w500108f00056a81c,0;    rmt/2\M0
type=ddi_byte:tape;addr=w500108f00056a81d,0;    rmt/3\M0
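
With more than a handful of drives, typing these entries by hand gets error-prone. As a sketch, the lines can be generated from the WWNs given in tape library order; the function name devlink_entries is hypothetical, and LUN 0 is assumed for every drive, as in the entries above.

```shell
# Print one devlink.tab entry per WWN, numbering the drives from 0 in the
# order the WWNs are given. The two fields are separated by a real tab.
devlink_entries() {
    n=0
    for wwn in "$@"; do
        printf 'type=ddi_byte:tape;addr=%s,0;\trmt/%d\\M0\n' "$wwn" "$n"
        n=$((n + 1))
    done
}

# WWNs in tape library order (drive 0 first), taken from the example above
devlink_entries w500108f00056a81a w500108f00056a81b \
                w500108f00056a81c w500108f00056a81d
```

Review the output before appending it to /etc/devlink.tab, and keep a copy of the original file.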

3. Remove the existing links from /dev/rmt by running the “rm /dev/rmt/*” command.

4. Run the “devfsadm” command. Run “reboot -- -r” if needed.

5. Done.

See the result by manually mounting cartridges and checking with the “mt -f /dev/rmt/Xcbn status” command.

Type “man devlinks” for more information.

Tape Logical Device Files

LTO Ultrium

As we may already know, tape devices live in the “/dev/rmt” directory. More precisely, the tapes command creates symbolic links in “/dev/rmt” that point to the actual tape device special files under the “/devices” directory tree. tapes searches the kernel device tree to see which tape devices are attached to the system.

Each tape LUN seen by the system is represented by 24 minor nodes in the form of /dev/rmt/N, /dev/rmt/Nb, and /dev/rmt/Nbn, where N is an integer counter starting from 0. This number is picked by devfsadm during enumeration of new devices. Every new tape logical unit number (LUN) found by devfsadm gets the next available number in /dev/rmt.


Run Levels for Various Unices

According to Wikipedia, the term runlevel refers to a mode of operation in one of the computer operating systems that implement Unix System V-style initialization. Conventionally, seven runlevels exist, numbered from zero to six, though up to ten, from zero to nine, may be used. S is sometimes used as a synonym for one of the levels.

In standard practice, when a computer enters runlevel zero, it halts, and when it enters runlevel six, it reboots. The intermediate runlevels (1-5) differ in terms of which drives are mounted, and which network services are started. Lower run levels are useful for maintenance or emergency repairs, since they usually don’t offer any network services at all. The particular details of runlevel configuration differ widely among operating systems, and slightly among system administrators.

The runlevel system replaced the traditional /etc/rc script used in Version 7 Unix.

Run Levels in Solaris
S, s
Single-user mode. Doesn’t require a properly formatted /etc/inittab. Filesystems required for basic system operation are mounted.

0
Go into firmware (sparc)

1
System Administrator mode. All local filesystems are mounted. Small set of essential system processes are running. Also a single user mode.

2
Put the system in multi-user mode. All multi-user environment terminal processes and daemons are spawned.

3
Extend multi-user mode by making local resources available over the network.

4
Is available to be defined as an alternative multi-user environment configuration. It is not necessary for system operation and is usually not used.

5
Shut the machine down so that it is safe to remove the power. Have the machine remove power, if possible.

6
Reboot

a, b, c
Process only those /etc/inittab entries having the a, b, or c run level set. These are pseudo-states, which may be defined to run certain commands, but which do not cause the current run level to change.

Q, q
Re-examine /etc/inittab.
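
For scripts, the Solaris table above can be turned into a lookup. This is just a sketch: the function name solaris_runlevel_desc and the shortened descriptions are mine, condensed from the list above.

```shell
# Return a short description for a Solaris run level, per the table above.
solaris_runlevel_desc() {
    case "$1" in
        S|s) echo "single-user mode" ;;
        0)   echo "firmware (SPARC)" ;;
        1)   echo "system administrator mode" ;;
        2)   echo "multi-user mode" ;;
        3)   echo "multi-user mode with resources shared over the network" ;;
        4)   echo "alternative multi-user mode (usually unused)" ;;
        5)   echo "shut down and power off" ;;
        6)   echo "reboot" ;;
        *)   echo "unknown run level: $1" >&2; return 1 ;;
    esac
}

# On Solaris, "who -r" reports the current level as the third field,
# so a live check could be:  solaris_runlevel_desc "$(who -r | awk '{print $3}')"
solaris_runlevel_desc 3
```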

Run Levels in HP-UX
0
System is completely shut down. All processes are terminated and all file systems are unmounted.

1,s,S
Single-user mode. All system services and daemons are terminated and all file systems are unmounted.

2
Multi-user mode, except NFS is not enabled.

3
Multi-user mode. This is the normal operational default state. NFS is enabled.

4
Multi-user mode with NFS and VUE. (VUE is HP’s desktop, kinda like CDE)

6
Reboot.

Run Levels in OpenBSD
-1
Permanently insecure mode – always run system in level 0 mode.

0
Insecure mode – immutable and append-only flags may be changed. All devices may be read or written subject to their permissions.

1
Secure mode – system immutable and append-only flags may not be turned off; disks for mounted filesystems, /dev/mem, and /dev/kmem are read-only.

2
Highly secure mode – same as secure mode, plus disks are always read-only whether mounted or not and the settimeofday(2) system call can only advance the time.

Run Levels in ULTRIX, Digital UNIX / Tru64
0
System is completely shut down. All processes are terminated and all file systems are unmounted.

1
Single-user mode. All system services and daemons are terminated and all file systems are unmounted.

2
Multi-user mode, except NFS is not enabled.

3
Multi-user mode. This is the normal operational default state. NFS is enabled.

4
Not Used

5
Not Used

6
Reboot

Run Levels in Irix
0
Shut the machine down so it is safe to remove the power. Have the machine remove power if it can.

1
Put the system into system administrator mode. All filesystems are mounted. Only a small set of essential kernel processes run. This mode is for administrative tasks such as installing optional utilities packages. All files are accessible and no users are logged in on the system.

2
Put the system into multi-user state. All multi-user environment terminal processes and daemons are spawned. Default.

3
Start the remote file sharing processes and daemons. Mount and advertise remote resources. Run level 3 extends multi-user mode and is known as the remote-file-sharing state.

4
Define a configuration for an alternative multi-user environment. This state is not necessary for normal system operations; it is usually not used.

5
Stop the IRIX system and enter firmware mode.

6
Stop the IRIX system and reboot to the state defined by the initdefault entry in inittab.

a,b,c
Process only those inittab entries for which the run level is set to a, b, or c. These are pseudo-states that can be defined to run certain commands but do not cause the current run level to change.

Q,q
Re-examine inittab.

S,s
Enter single-user mode. When the system changes to this state as the result of a command, the terminal from which the command was executed becomes the system console.

Run Levels in SYSV
The following is from a SYSV textbook; these are the run levels generally used on SYSV systems.

0
Power-down state. Shuts machine down gracefully so that it can be turned off. Some models turn off automatically.

s
Single user state. This run level should be used when installing or removing software utilities, checking file systems, or using Maintenance (/install) file system. It is similar to run level 1; however, in run level s, multi-user file systems are unmounted and daemons are stopped. The terminal issuing the init s becomes the console.

1
Administrative state. In run level 1, file systems required for multi-user operations are mounted, and logins requiring access to multi-user file systems can be used.

2
Multi-user state. File systems are mounted and normal user services are started.

3
Network File System (NFS) state. Prepares your system to use NFS.

4
User-defined

5
Virtually the same as System State 6. See /sbin/rc0 script for details. Early versions of UNIX used this as an entry to a firmware interface.

6
Power-down and reboot to the state defined by the initdefault entry in the /etc/inittab file.

Run Levels in Linux
0
Halt the system.

1
Single-user mode.

2-4
Multi-user modes. Usually identical. Level 2 or 3 is default (dependent on distro).

5
Multi-user with graphical environment. This applies to most (but not all) distros.

6
Reboot the system and return to default run level.

Setting up NFS (Network File System) share

NFS (Network File System) is a protocol used by UNIX/Linux computers to share disks across a network. It is comparable to the Common Internet File System (CIFS) protocol used by Windows, but NFS is older and more lightweight, and it performs much more efficiently on UNIX and Linux systems.

Setting up an NFS share

As an example, we’ll be sharing the /home directory with all clients on a network. Sharing /home is a good idea if you’re running the Network Information Service (NIS) server that I covered in the May issue, as it allows you to use the same desktop and configuration settings on every computer attached to your network.

First, open /etc/exports as root using your favourite text editor. If this file doesn’t exist you will need to create it. Add the following to the file:

/home 192.168.1.0/255.255.255.0(rw)

This line shares the /home directory with all machines on the 192.168.1.0 network and allows each machine to have both read and write access to the share. Change this network address to one that is appropriate for your network. Read only access can be specified by changing (rw) to (ro).

You can individually specify a list of machines that will have access to the share, and tailor the access each machine has to the share, using a line such as:

/home 192.168.1.2(rw) 192.168.1.3(ro)

In this example, 192.168.1.2 has both read and write access to the share while 192.168.1.3 has only read access. Any other machine on your network will be unable to mount the share.
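
If you maintain several shares, the per-host syntax can be generated rather than typed. A small sketch follows; the function name exports_line and its host:mode argument notation are invented for this example, and it only handles IPv4 addresses or host names (anything containing a colon would be split incorrectly).

```shell
# Build one /etc/exports line from a directory and a list of host:mode pairs.
exports_line() {
    dir=$1
    shift
    line=$dir
    for spec in "$@"; do
        host=${spec%%:*}    # text before the first colon
        mode=${spec##*:}    # text after the last colon (rw or ro)
        line="$line ${host}(${mode})"
    done
    echo "$line"
}

# Reproduces the per-host example above
exports_line /home 192.168.1.2:rw 192.168.1.3:ro

# After editing /etc/exports on a Linux server, the export table is
# usually re-read with:  exportfs -ra
```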
