Hardware Diagnostics for Oracle Sun systems, A Toolkit for System Administrators

The easiest way to diagnose hardware-related problems on an Oracle Sun server is to use the OBP ok prompt commands, the Power On Self Test (POST), and the status LEDs on the system boards.

You can diagnose hardware-related problems on Oracle Sun server and desktop products. With these low-level diagnostics, you can establish the state of the system and attached devices. For example, you can determine whether a device is recognized by the system and working properly, and you can obtain useful system configuration information.

OBP DIAGNOSTIC COMMANDS AND TOOLS
OBP is a powerful, low-level interface to the system and devices attached to the system (OBP is also known as the ok prompt). By entering simple OBP commands, you can learn system configuration details such as the ethernet address, the CPU and bus speeds, installed memory, and so on. Using OBP, you can also query and set system parameter values such as the default boot device, run tests on devices such as the network interface, and display the SCSI and SBUS devices attached to the system.

Below are the commands available at the OBP ok prompt:
—————————-
banner
Displays the power on banner. The banner includes information such as CPU speed, OBP revision, total system memory, ethernet address and hostid.

devalias alias path
Defines a new device alias, where alias is the new alias name and path is the physical path of the device. If devalias is used without arguments, it displays all system device aliases.

.enet-addr
Displays the ethernet address

led-off/led-on
Turns the system led off or on.

nvalias name path
Creates a new alias for a device, where name is the name of the alias and path is the physical path of the device. Note – Run the reset-all or the nvstore command to save the new alias in non-volatile memory (NVRAM).

nvunalias name
Deletes a user-created alias (see nvalias), where name is the name of the alias. Note – Run the reset-all or nvstore command to save changes in NVRAM.

nvstore
Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer.

power-off/power-on
Powers the system off or on.

printenv
Displays all parameters, settings, and values

probe-fcal-all
Identifies Fibre Channel Arbitrated Loop (FCAL) devices on a system.

probe-sbus
Identifies devices attached to all SBUS slots. Note – This command works only on systems with SBUS slots.

probe-scsi
Identifies devices attached to the onboard SCSI bus.

probe-scsi-all
Identifies devices attached to all SCSI buses.

set-default parameter
Resets the value of parameter to the default setting.

set-defaults
Resets the value of all parameters to the default settings. Tip – You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults.

setenv parameter value
Sets parameter to specified value. Note – Run the reset-all command to save changes in NVRAM.

show-devs
Displays all the devices recognized by the system.

show-disks
Displays the physical device path for disk controllers.

show-displays
Displays the physical device path for frame buffers.

show-nets
Displays the physical device path for network interfaces

show-post-results
If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format.

show-sbus
Displays devices attached to all SBUS slots. Similar to probe-sbus.

show-tapes
Displays the physical device path for tape controllers.

sifting string
Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on.

speed
Displays CPU and bus speeds

test device-specifier
Executes the selftest method for device-specifier. For example, the test net command tests the network connection.

test-all
Tests all devices that have a built-in test method.

version
Displays OBP and POST version information.

watch-clock
Tests a clock function.

watch-net
Monitors the network connection for the primary interface.

watch-net-all
Monitors all the network connections.

words
Displays all OBP commands and methods

—————————-

OBDIAG
OBDiag (OpenBoot Diagnostics) runs from the ok prompt and displays diagnostic and error messages on the system console.

How To Run OBDiag
To run OBDiag, type obdiag at the OpenBoot ok prompt.
You can also set up OBDiag to run automatically when the system is powered on, using any of the following methods:

– Set the OBP diagnostics variable: ok setenv diag-switch? true
– Press the Stop and D keys simultaneously while you power on the system.
– On Ultra Enterprise servers, turn the key switch to the diagnostics position and power on the system.

POWER ON SELF TEST (POST)
POST is a program that resides in the firmware of each board in a system, and it is used to initialize, configure, and test the system boards. POST output is sent to serial port A (on an Ultra Enterprise server, POST output is sent only to serial port A on the system and clock board). The status LEDs of each system board on Ultra Enterprise servers indicate the POST completion status. For example, if a system board fails the POST test, the amber LED stays lit.
You can watch POST output in real time by attaching a terminal device to serial port A. If none is available, you can use the OBP command show-post-results to view the results after POST completes.

How To Run POST
– Connect a terminal to serial port A.
– Set diag-switch? to ‘true’:
ok setenv diag-switch? true
– Set the desired testing level (min or max), for example:
ok setenv diag-level max
– Set the auto-boot? variable to ‘false’ so the system stops at the ok prompt after POST:
ok setenv auto-boot? false
– Run reset-all to save the changes:
ok reset-all
– Reboot or power cycle the system.
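
If the system is already running Solaris, the same NVRAM variables can usually be set with the eeprom command instead of at the ok prompt. The following is a minimal sketch under stated assumptions: post_diag_setup and DRY_RUN are hypothetical names made up for this example, the function only prints the commands unless DRY_RUN=0 is set, and actually running eeprom requires root. Note that variable names ending in ? must be quoted so the shell does not treat the ? as a wildcard.

```shell
#!/bin/sh
# Sketch: set the POST-related NVRAM variables from Solaris with eeprom.
# post_diag_setup and DRY_RUN are illustrative names, not part of any Oracle tool.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands (default), 0 = run them

post_diag_setup() {
    level=${1:-max}     # diag-level: min or max
    case "$level" in
        min|max) ;;
        *) echo "usage: post_diag_setup [min|max]" >&2; return 1 ;;
    esac
    for setting in 'diag-switch?=true' "diag-level=$level" 'auto-boot?=false'; do
        if [ "$DRY_RUN" = "1" ]; then
            # Dry run: show what would be executed.
            echo "eeprom '$setting'"
        else
            eeprom "$setting" || return 1
        fi
    done
}

post_diag_setup max
```

After the variables are set, POST runs at the configured level on the next reboot or power cycle. Remember to set auto-boot? back to true afterwards if the system should boot unattended.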

SOLARIS OPERATING ENVIRONMENT DIAGNOSTIC COMMANDS
The following table describes OS commands you can use to display the system configuration, such as failed Field Replaceable Units (FRU), hardware revision information, installed patches, and so on.

/usr/platform/sun4u/sbin/prtdiag -v
Displays system configuration and diagnostic information, and lists any failed Field Replaceable Units (FRU).

/usr/bin/showrev [-p]
Displays revision information for the current hardware and software. When used with the -p option, displays installed patches.

/usr/sbin/prtconf
Displays system configuration information.

/usr/sbin/psrinfo -v
Displays CPU information, including clock speed.
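
When opening a service request it is often handy to capture the output of all four commands in one file. The sketch below is one possible way to do that, not an Oracle-provided tool: run_if_present and collect_hw_info are hypothetical helper names, and the prtdiag path is built from `uname -m` on the assumption that the platform directory matches the machine class (sun4u, sun4v). Commands that are not installed are skipped rather than treated as errors.

```shell
#!/bin/sh
# Sketch: run each diagnostic command that exists and label its output.
# run_if_present and collect_hw_info are hypothetical helper names.

run_if_present() {
    cmd=$1
    echo "=== $* ==="
    if [ -x "$cmd" ]; then
        # Run the command with its arguments, merging stderr into the report.
        "$@" 2>&1
    else
        echo "(skipped: $cmd not found on this system)"
    fi
}

collect_hw_info() {
    run_if_present "/usr/platform/`uname -m`/sbin/prtdiag" -v
    run_if_present /usr/bin/showrev -p
    run_if_present /usr/sbin/prtconf
    run_if_present /usr/sbin/psrinfo -v
}

# Example use (the output path is only a suggestion):
#   collect_hw_info > /var/tmp/hwinfo.txt
```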

###########
ref# Doc ID 1005946.1


How to Upload Files to Oracle Support

Because the old Sun/Oracle file upload method has been discontinued, below are several methods for uploading files to Oracle Support, based on file size.


  • FTPS & HTTPS to MOS File Upload service – 200 GB max


  1.     Set “ftps://transport.oracle.com” as the Host
  2.     Supply the appropriate credentials (MOS Support Portal username and password)
  3.     Leave the Port setting blank
  4.     After connecting, double-click on the Issue directory in the right (Remote) pane
  5.     Double-click the SR number’s directory in the right (Remote) pane
  6.     Locate the file to be transferred in the left (Local) pane
  7.     Drag-and-drop the file into the relevant SR directory
  • Diagnostic Assistant (DA), using MOS file utilities – 200 GB max

Diagnostic Assistant (DA)

DA 2.2 (included with RDA/Explorer/STB 8.02) now supports uploads via HTTPS to the MOS File Upload Service. Use DA via the menus, Explorer, or the command line.

Menu

  1. Run the Diagnostic Assistant menu: /<linux/solaris rda home>/da/da.sh menu or <win rda home>\da\da.cmd menu
  2. Start with option 3: RDA, OCM, ADR, SR Creation / Packaging, and MOS Tools
  3. Next select option 4: Package, Upload Diagnostic Files
  4. Complete it with option 7: Upload File Package to SR
  5. You will be prompted for your SR, credentials and the file.

 

To use DA for a command-line upload:

da.sh upload -p sr=<SR Number>file=path=<path to file>

To use DA to upload with Explorer:

explorer -w default -T DA -SR <Service Request number>

NOTE: If SR Number is not specified, the file will be uploaded to transport.oracle.com/upload/proactive/

  • Secure File Transport (SFT), part of ASR Manager – 200 GB max

# /opt/SUNWsasm/bin/sasm transport -r
Enter “1” to select:
1) transport.oracle.com
Or, enter:
https://transport.oracle.com

  • FTP, including SFTP, is not supported

*Reference: Doc ID 1547088.2 and Doc ID 1596914.1

How to Configure SL24 / SL48 with Netbackup

The SL24 and SL48 are Oracle’s entry-level autoloader and tape library.

See the SL24/SL48 product documentation for complete details.

The SL24/SL48 library uses a single SCSI ID and two logical unit numbers (LUNs). LUN 0 controls the tape drive and LUN 1 controls the robot. It therefore requires an HBA that supports multiple LUNs. If multiple-LUN support is not enabled, the host server cannot scan beyond LUN 0 to discover the library; it sees only the tape drive.

To check the device and connectivity status from Solaris, use the show_FCP_dev option, “cfgadm -o show_FCP_dev -al”, instead of the plain “cfgadm -al” command. The robot (changer) is not shown if you use the standard “cfgadm -al” command.
If the changer is already detected by “cfgadm -o show_FCP_dev -al” but still not detected by the NBU sgscan command, check your NBU device configuration. You need to modify the st.conf file so that devices on both LUNs are detected.

Find the following line in the st.conf file:

name="st" target=0 lun=0;

Replace that line and the following lines through target 5 with the following. Doing so modifies the st.conf file to include searches on non-zero LUNs:

name="st" target=0 lun=0;
name="st" target=0 lun=1;
name="st" target=1 lun=0;
name="st" target=1 lun=1;
name="st" target=2 lun=0;
name="st" target=2 lun=1;
name="st" target=3 lun=0;
name="st" target=3 lun=1;
name="st" target=4 lun=0;
name="st" target=4 lun=1;
name="st" target=5 lun=0;
name="st" target=5 lun=1;
name="st" parent="fp" target=0;
name="st" parent="fp" target=1;
name="st" parent="fp" target=2;
name="st" parent="fp" target=3;
name="st" parent="fp" target=4;
name="st" parent="fp" target=5;
name="st" parent="fp" target=6;
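
Typing these entries by hand is error-prone, so they can be generated instead. The sketch below prints the same 19 lines to standard output; gen_st_entries is a hypothetical helper written for this example, and it does not touch st.conf itself. Review the output and paste it into the file manually.

```shell
#!/bin/sh
# Sketch: print st.conf entries for LUN 0 and LUN 1 on a range of SCSI targets,
# followed by the fp (Fibre Channel) parent entries.
# gen_st_entries is a hypothetical helper; it writes to stdout only.

gen_st_entries() {
    first=$1      # first target number
    last=$2       # last target number for the lun=0/lun=1 entries
    fp_last=$3    # last target number for the parent="fp" entries

    t=$first
    while [ "$t" -le "$last" ]; do
        for lun in 0 1; do
            echo "name=\"st\" target=$t lun=$lun;"
        done
        t=$((t + 1))
    done

    t=$first
    while [ "$t" -le "$fp_last" ]; do
        echo "name=\"st\" parent=\"fp\" target=$t;"
        t=$((t + 1))
    done
}

# Targets 0-5 with two LUNs each, then fp targets 0-6, as in the listing above.
gen_st_entries 0 5 6
```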

See the NetBackup device configuration documentation for complete information on how to configure tape drive and robotic devices for NetBackup.

If the SL24/SL48 has SAS tape drives and you are using an LSI SAS HBA, check and upgrade the SAS HBA driver.

There was an issue with the LSI SAS1 (3 Gb/s) HBA at firmware level 1.26.00 and below, where the HBA does not see any SAS devices connected to it. See the document below (MOS access required) for more detail.

HBA – LSI SAS HBA Firmware Issue, SAS Devices Not Being Seen by Server (Doc ID 1350564.1)

ERROR: Last Trap: Instruction Access Exception

{0} ok boot
Boot device: /pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/disk@0,0:a File and args:
Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
FCode UFS Reader 1.12 00/07/17 15:48:16.
Loading: /platform/SUNW,Sun-Fire-T200/ufsboot
Loading: /platform/sun4v/ufsboot
ERROR: Last Trap: Instruction Access Exception

If you get the above error messages when powering on a Sun server (T-series, T1000/T2000) and the boot process is stuck there, do not call Oracle support or open an SR via MOS until you have tried the simple troubleshooting step below:

Unplug all USB devices (USB keyboard and mouse, KVM, etc.), connect your laptop/PC to the server via the serial port, then reboot the server. If the error messages disappear, I believe the server will be able to boot as usual.

This issue is mostly related to a USB keyboard/mouse or other USB devices. The cause could be the USB device or the USB port of the server itself. Try plugging the USB device into another port, then reboot the server again. The T2000 has 4 USB ports on the back and 2 USB ports on the front.

How to reset/recover Integrated Lights Out Manager (ILOM) password

The default user and password of ILOM is “root/changeme”, but if you have already changed the password and for some reason forgot it, here are the steps to recover the ILOM password.

First of all, try to change the password with ipmitool. If that still doesn’t work, try the steps below:

Notes:
– You must be physically present at the server to perform this procedure.
– This procedure uses the default user account to enable you to recover a lost password or to re-create the root user account.
– You cannot change or delete the default user account.

1. Connect to ILOM via serial console and log in using the default user account.
SUNSP-xxxxxxxx login: default
Press and release the physical presence button.
Press return when this is completed…

2. Prove physical presence at your server.
Press and release the physical presence button.

The Physical Presence button on the Sun SPARC Enterprise T5xxx servers and X-Series is the Locator button.

For SPARC T3/T4 models, however, the physical presence button is on the rear side, except for the T3-1b/T4-1b:
SPARC T3-1/T4-1: (rear) pin-hole to the left of the USB ports
SPARC T3-2/T4-2: (rear) pin-hole to the left of NET0
SPARC T3-4/T4-4: (rear) to the right of OK LED, above the USB port
SPARC T3-1b/T4-1b: (front) Locate button/Physical Presence (White LED)

3. Return to your serial console and press Enter.

You will be prompted for a password.

4. Type the password for the default user account: defaultpassword

5. Reset the account password or re-create the root account.

-> set /SP/users/root password
Enter new password: ********
Enter new password again: ********

6. Try logging in with your new root password.

How to bypass and reset the ALOM password for Sun V-Series and Netra Series

How to bypass and reset the ALOM password on Sun Fire V125/V210/V215/V240/V245/V250/V440/V445 and Sun Netra 210/240/440 Servers.
Use the scadm utility to reset the admin password:
# cd /usr/platform/`uname -i`/sbin
# ./scadm userpassword admin

Return to the ALOM login prompt
Now, login into the “admin” account using the new password

Update:

If scadm is not available, download the RSC software from MOS:

RSC Software Download (steps to download the latest RSC software):
1. Login to MOS and select “Patches and Updates Tab”
2. In “Patch Search” on the Top right panel, Click on “Product or Family (Advanced Search)”
3. In the “Product Is” pull-down select “Sun Remote System Control”
4. In the next pull down “Release is” select the RSC version (2.2.2 or 2.2.3).
5. Select OS and click “Search” (will get a list with RSC releases & patches)
6. Select the desired RSC Release (packages) or patch
7. Click Download on the Right

The packages for Solaris 8 and 9 (and later) are both in the zip file. There are two versions of the zip file, 32-bit and 64-bit, but both have the same checksum, so there is no difference between them: p10264452_223_SOLARIS64.zip (p10264451_223_SOLARIS.zip)

Install the software as you would any Package with pkgadd.

The command syntax is the same:
# /usr/platform/`uname -i`/rsc/rscadm userpassword admin
To reconfigure the card, run the command:
# /usr/platform/`uname -i`/rsc/rsc-config

If you had installed the software before and believe the card is configured, check the setup:
# /usr/platform/`uname -i`/rsc/rscadm show

How to reset the ILOM root password back to the default ‘changeme’ using ipmitool

If the root password on the ILOM is currently unknown, but you have root access to the O/S installed, you can change the ILOM password back to the default “changeme”.

Follow the steps below:
# which ipmitool
/usr/sbin/ipmitool

# /usr/sbin/ipmitool -V
ipmitool version 1.8.8

# /usr/sbin/ipmitool user set password 0x02 changeme

Or you can use the raw format:
# /usr/sbin/ipmitool raw 0x06 0x47 0x02 0x02 0x63 0x68 0x61 0x6e 0x67 0x65 0x6d 0x65 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

There is no confirmation after running the ipmitool command; however, the ILOM root password will be changed to changeme.
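
In the raw form, the bytes after 0x06 0x47 are the user ID (0x02), the operation (0x02, set password), and then the password as 16 ASCII bytes padded with 0x00. The sketch below shows how that byte string can be derived for an arbitrary password. It is an illustration only: pw_to_raw_bytes is a hypothetical helper, it assumes a plain ASCII password of at most 16 characters, and it only prints the resulting command rather than running ipmitool.

```shell
#!/bin/sh
# Sketch: build the 16 password bytes used by the raw form of the command.
# pw_to_raw_bytes is a hypothetical helper; it only prints text.
# Assumes an ASCII password (one byte per character).

pw_to_raw_bytes() {
    pw=$1
    len=${#pw}
    if [ "$len" -gt 16 ]; then
        echo "password longer than 16 bytes" >&2
        return 1
    fi
    # Hex-dump the password, flatten to one line, and prefix each byte with 0x.
    bytes=`printf '%s' "$pw" | od -An -tx1 | tr -s ' \n' '  ' |
           sed -e 's/^ *//' -e 's/ *$//' -e 's/\([0-9a-f][0-9a-f]\)/0x\1/g'`
    # Pad with 0x00 up to 16 bytes.
    while [ "$len" -lt 16 ]; do
        bytes="$bytes 0x00"
        len=$((len + 1))
    done
    echo "${bytes# }"
}

# Print (not run) the raw command for the default password.
echo "ipmitool raw 0x06 0x47 0x02 0x02 `pw_to_raw_bytes changeme`"
```

For changeme this reproduces the byte sequence shown in the raw command above.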