How to Replace an M4000/M5000 XSCF Board

The XSCF (eXtended System Control Facility) unit is the service processor for M-Series servers.
The XSCF unit is a cold-replacement component: the entire server must be powered off and the power cords disconnected before the XSCF unit is replaced. Before doing so, run the “showhardconf” or “showstatus” command to confirm that the XSCF is faulted.

XSCF> showhardconf


*   XSCFU Status:Degraded,Active; Ver:0101h; Serial:BFxxxxxxx  ;
+ FRU-Part-Number:CF00541-0481 04   /541-0481-04       ;

Some people have asked me how to back up the XSCF configuration before replacing the XSCF board. They assume a backup is needed first because an M4000/M5000 server has only one XSCF board. In fact, no backup is needed, because a copy of the XSCF configuration is kept on the Operator Panel. The XSCF and the Operator Panel synchronize their data each time the XSCF boots and whenever the XSCF configuration changes. That is why the XSCF and the Operator Panel must never be replaced at the same time.

Okay, if you are ready to replace the XSCF board, here are the instructions:

# Shut down the OS, power off the server, and unplug the power cords and the XSCF Ethernet cables.

# Using proper ESD grounding technique and an anti-static mat, replace the XSCF board:

*M4000 XSCF board location:

*M5000 XSCF board location:

# Plug all the cables back in, then power on the server and wait for the new XSCF board to start up. It will reboot around 2-3 times. During startup you will see messages showing the XSCF and the OPNL (Operator Panel) synchronizing their data:

…..
initialize XSCF common database (OWN)  --  complete
synchronize setup data (XSCF -> OPNL)  --  complete
initialize XSCF common database (ACTIVE)  --  complete
wait for database synchronization  --  complete
execute S00clis_all  --  complete
…..

# Once the boot process has finished, try to log in. You may see the following error message:

XCP version of Panel EEPROM and XSCF FMEM mismatched,
Panel EEPROM=1090, XSCF FMEM=1100

If so, you need to upgrade the XSCF firmware. Download the latest firmware from MOS (My Oracle Support), then perform the firmware upgrade.
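
If you capture the console session to a file, the two version numbers can be pulled out of the mismatch message automatically. Below is a minimal Python sketch, assuming the message has exactly the two-line form shown above. The helper name `parse_xcp_mismatch` is my own invention and is not part of any XSCF tooling.

```python
import re

# Assumed layout: copied from the mismatch message shown above.
MISMATCH_RE = re.compile(r"Panel EEPROM=(\d+),\s*XSCF FMEM=(\d+)")


def parse_xcp_mismatch(text):
    """Return (panel_version, fmem_version) as ints, or None if no mismatch line is found."""
    match = MISMATCH_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "XCP version of Panel EEPROM and XSCF FMEM mismatched,\n"
    "Panel EEPROM=1090, XSCF FMEM=1100\n"
)

versions = parse_xcp_mismatch(sample)
if versions is not None:
    panel, fmem = versions
    print(f"Operator Panel records XCP {panel}, XSCF board carries XCP {fmem}")
```

Knowing both numbers up front helps when deciding which XCP file to download from MOS.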

# XSCF FIRMWARE UPGRADE:

*VIA FTP:

XSCF> getflashimage -l        >> list the firmware image files already on the XSCF (use “version -c xcp” for the running version)
XSCF> getflashimage -u AZIZ ftp://10.32.17.61/FFXCP1112.tar.gz    >> AZIZ is the FTP username, 10.32.17.61 is the FTP server on my laptop
Password: *******
0MB received
1MB received
2MB received
3MB received
4MB received
5MB received
6MB received
7MB received
8MB received

Download successful: 42660 Kbytes in 50 secs (987.298 Kbytes/sec)
Checking file…
MD5: 73ca6370dc6c636f2e3845b66caa203a
XSCF> getflashimage -l

XSCF> flashupdate -c check -m xcp -s 1112
XCP update is possible with domains up

XSCF> flashupdate -c update -m xcp -s 1112
The XSCF will be reset. Continue? [y|n] :y
Checking the XCP image file, please wait a minute
XCP update is started (XCP version=1112:last version=1081)
OpenBoot PROM update is started (OpenBoot PROM version=02180000)
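
Since getflashimage prints the MD5 of the file it received, you can compare it with a checksum computed on the machine that serves the file. Here is a minimal Python sketch meant to run on the laptop, not on the XSCF. The helper name `md5_of_file` is hypothetical, and the demo hashes a small temporary file instead of a real firmware image.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file; for real use, pass the path of the
# downloaded XCP tarball and compare against the getflashimage output.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_md5 = md5_of_file(demo_path)
os.remove(demo_path)
print(demo_md5)
```

If the two digests differ, the transfer was corrupted and the file should be fetched again before running flashupdate.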

*VIA USB:
- Check the current version, then update the firmware:
XSCF> version -c xcp -v

XSCF> getflashimage file:///media/usb_msd/FFXCP1112.tar.gz

Note that the firmware file name differs by M-Series model:
getflashimage file:///media/usb_msd/IKXCP1112.tar.gz    >> for M3000
getflashimage file:///media/usb_msd/FFXCP1112.tar.gz    >> for M4000/M5000
getflashimage file:///media/usb_msd/DCXCP1112.tar.gz    >> for M8000/M9000
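
To avoid loading the wrong file, the model-to-prefix mapping above can be written down once and reused. This is a small Python sketch, assuming only the three prefixes and the USB path listed above. The function names are hypothetical helpers of mine.

```python
# Prefixes taken from the file names listed above.
XCP_PREFIX_BY_MODEL = {
    "M3000": "IKXCP",
    "M4000": "FFXCP",
    "M5000": "FFXCP",
    "M8000": "DCXCP",
    "M9000": "DCXCP",
}


def xcp_file_name(model, xcp_version):
    """Build the firmware tarball name for a given model and XCP version."""
    prefix = XCP_PREFIX_BY_MODEL[model.upper()]
    return f"{prefix}{xcp_version}.tar.gz"


def getflashimage_usb_command(model, xcp_version):
    """Build the getflashimage command line for a USB stick mounted as usb_msd."""
    return f"getflashimage file:///media/usb_msd/{xcp_file_name(model, xcp_version)}"


print(getflashimage_usb_command("M5000", 1112))
```

An unknown model raises a KeyError, which is deliberate: better to stop than to guess a prefix.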

XSCF> flashupdate -c check -m xcp -s 1112
XCP update is possible with domains up

XSCF> flashupdate -c update -m xcp -s 1112

XSCF> version -c xcp -v
XCP0 (Reserve): 1110 << XCP0 will take a few minutes to finish updating
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0000
XCP1 (Current): 1112 <<updated already
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0002
OpenBoot PROM BACKUP

XSCF> version -c cmu -v
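
If you want to check from a saved session whether the reserve bank has caught up with the current bank, the “version -c xcp -v” output above can be parsed. This is a rough Python sketch, assuming the bank lines look exactly like the ones above. The name `parse_xcp_banks` is hypothetical.

```python
import re

# Matches lines such as "XCP0 (Reserve): 1110" from the output above.
BANK_RE = re.compile(r"^(XCP\d) \((Reserve|Current)\)\s*:\s*(\d+)", re.MULTILINE)


def parse_xcp_banks(text):
    """Map each bank role ('Reserve' or 'Current') to its XCP version number."""
    return {role: int(version) for _bank, role, version in BANK_RE.findall(text)}


sample = """XCP0 (Reserve): 1110
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0000
XCP1 (Current): 1112
OpenBoot PROM : 02.29.0000
XSCF          : 01.11.0002
"""

banks = parse_xcp_banks(sample)
if banks.get("Reserve") != banks.get("Current"):
    print("Reserve bank not yet at the current version:", banks)
else:
    print("Both banks are at XCP", banks["Current"])
```

Running the same check a few minutes later should show both banks on the same version once the update has completed.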

# Once the firmware upgrade is finished, or if there was no firmware issue, check the device status again with the “showhardconf” and “showstatus” commands.

# Continue by powering on the domain:

XSCF> poweron -d0
DomainIDs to power on:00
Continue? [y|n] :y
Poweron canceled due to invalid system date and time.
XSCF>

Wait, did you see the error message above? Yes, the domain cannot power on because the system date and time are invalid.

# Set the new date and time. Example for 24 Oct 2012 at 10:23 (the value is written as MMDDhhmmCCYY.ss):

XSCF> setdate -u -s 102410232012.00
Wed Oct 24 10:23:00 UTC 2012
The XSCF will be reset. Continue? [y|n] :y
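
The date string is easy to mistype, so here is a minimal Python sketch that builds it from a normal date. It assumes the MMDDhhmmCCYY.ss layout used in the example above. The helper name `setdate_argument` is hypothetical.

```python
from datetime import datetime


def setdate_argument(moment):
    """Format a datetime as MMDDhhmmCCYY.ss, matching the setdate example above."""
    return moment.strftime("%m%d%H%M%Y.%S")


# Same moment as the example: 24 Oct 2012, 10:23:00 (treated as UTC by -u).
example = datetime(2012, 10, 24, 10, 23, 0)
print(f"setdate -u -s {setdate_argument(example)}")
```

The printed line matches the command shown above, so the same helper can be fed the real current UTC time on the day of the replacement.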

# If you want to change the time zone, run the settimezone command. Example:

XSCF> settimezone -c settz -s Asia/Jakarta

#DONE. Now power on the domain again.

13 responses to “How To replace M4000/M5000 XSCF board”

  1. How did you clear the error log and quiesce the amber light on the front panel?

    This has been our experience.

    We’ve also had to change 2 x m3000 mainboards.
    In both cases our fault was indicated by a lit amber light on the front panel.
    When we investigated with “showstatus”, it showed that the CPU cache was reporting an error and performance was degraded. The CPU cache is on the MBU, so this requires a mainboard swap.
    Fortunately the first faulty M3000 was still under warranty, as a new mainboard is 8 or 9 grand (GBP). Oracle came out to site and, after the mainboard swap, the amber light was still lit. To clear this they had to request a passcode based on XCP version, date and serial number from Oracle Support. This took 24 hours to arrive, and they came back the next morning, when they could use the passcode to get to an elevated “service>” prompt. They could then use the command “clearfault /MBU_A”.
    Oh, before this, when we powered on after the mainboard swap we had an XCP version mismatch: the panel EEPROM was on 1093 and FMEM on 1091. Oracle were unsure which version to get a passcode for and requested 1093 first. Unfortunately this passcode didn’t work. They then requested the 1091 code, which worked. The passcodes they obtain expire in 48 hours.

    The second server that faulted was out of warranty and our hardware support had no access to the passcode. The mainboard was replaced and the error condition was still reported and needed to be cleared.
    The first option we tried, “restoredefaults -c xscfu”, didn’t clear the error log or switch off the amber light. So we had to do a full factory reset using the “restoredefaults -c factory” command. This zeros the error logs, which clears the fault condition.
    PLEASE BE AWARE that a full factory reset requires
    1. Physical presence at the server
    2. The SERIAL CONSOLE and CONNECTION that was used for the initial XSCF configuration because after this restore/reset you’re right back to the username “default” with no ethernet networking configured.
    3. THE KEY for the front panel. When you type username “default” and press return you have to wait 5 secs and turn the key to the service position, then go back to the console press return and then back to the front panel and turn the key back to the locked position.
    We had done a dumpconfig, but we were advised it was less risky to redo the config manually from our documentation rather than use “restoreconfig”.
    I assume everyone’s familiar with XSCF commands. This is the order we restored our config..
    XSCF> adduser (fe and we also created our own username)
    XSCF> password (for both users)
    XSCF> setprivileges (fieldeng for fe, platadm for our username)
    XSCF> setdscp
    XSCF> settimezone
    XSCF> setnetwork
    XSCF> setroute
    XSCF> sethostname
    XSCF> sethostname -d
    XSCF> setssh -c enable (enabled ssh)

    Hope this helps somebody.
    Cheers
    Red Steve

    • I’m speaking from no knowledge here but it’s not just a dead eeprom battery on the board is it? Low voltage error, clock fatal failure/

  2. Hi everybody, hi aziz,
    I need your help with our 2 Sun Storage 2500-M2 arrays, which report that their firmware is not up to date and needs to be updated. How can I perform the firmware update? Can you please give me the MOP?
    thank you for help

  3. Hi Aziz,

    The XSCF is not booting on M4000 servers due to long storage period, how can I install a new firmware?

    • A good XSCF board should be able to boot even with old firmware. Swap in another XSCF board until you are able to log in and run the flashupdate command. Download the latest firmware from the MOS (My Oracle Support) website.

  5. Hi Aziz

    After replacing the timekeeper on an M4000 XSCF board, it is no longer possible to login. When we try to login with the default user, we get an error that it can’t find the XSCF Firmware. Is there anything we can do to repair it? All the guides we have found on firmware installation, requires that it is possible to login. I have a dump of the xscf boot sequence. If you think you can help, I can mail it to you.

      • It’s the yellow chip labeled M4T32-BR12SH1. It contains a battery and a crystal. The battery had run out, so it had to be replaced. We have also replaced the lithium battery on the XSCF board, because it had run out. When the lithium battery runs out and is replaced, the XSCF board resets itself to default settings. But this time the Timekeeper chip’s battery had also run out and had to be replaced.

        I have attached a copy of the boot sequence.

        ----------------------------------------------------------------------

        XSCF uboot 01080001 (May 8 2009 – 15:09:36)

        SCF board boot factor = 4080
        memory test ..
        Memory compare test
        …………….finish
        DDR Real size: 256 MB
        DDR: 224 MB

        ## Booting image at ff800000 …
        Image Name: XSCF kernel 01090001 2.6.11.12-s
        Image Type: PowerPC Linux Kernel Image (gzip compressed)
        Data Size: 1456963 Bytes = 1.4 MB
        Load Address: 00000000
        Entry Point: 00000000
        Verifying Checksum … OK
        Uncompressing Kernel Image … OK
        ## Loading RAMDisk Image at ff980000 …
        Image Name: XSCF rootfs 01090001 ,2009/11/05
        Image Type: PowerPC Linux RAMDisk Image (gzip compressed)
        Data Size: 5461340 Bytes = 5.2 MB
        Load Address: 00000000
        Entry Point: 00000000
        Verifying Checksum … OK
        Loading Ramdisk to 0baca000, end 0bfff55c … OK
        Linux version 2.6.11.12-sec (gcc version 3.4.4) #1 Thu Nov 5 14:55:41 JST 2009
        new message buffer at 0f700000 size 1048576
        log_buf_len: 1048576
        mpc85xx_cds_setup_arch
        Built 1 zonelists
        Kernel command line: root=/dev/ram rw console=ttyS0,9600 init=/sbin/init_change_
        root panic=1 mem=240M
        OpenPIC Version 1.2 (1 CPUs and 44 IRQ sources) at fbe79000
        PID hash table entries: 1024 (order: 10, 16384 bytes)
        RX4574SG rtc not initialize.
        RX4574SG rtc initialize complete.
        Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
        Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
        Memory: 218752k available (2188k kernel code, 668k data, 316k init, 0k highmem)
        Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
        Freeing initrd memory: 5333k freed
        RAMDISK driver initialized: 16 RAM disks of 32768K size 1024 blocksize
        i2c-algo-cpm: CPM2 I2C algorithm module version 0.1 (Mar 22, 2005)
        FCC ENET Version 0.3
        TCP established hash table entries: 8192 (order: 4, 65536 bytes)
        TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
        ip_tables: (C) 2000-2002 Netfilter core team
        arp_tables: (C) 2002 David S. Miller
        VFS: Mounted root (ext2 filesystem).
        Freeing unused kernel memory: 316k init
        switching initrd filesystem, ramdisk to tmpfs
        SCF Linux Boot Script 2006/03/04 for ROM boot environment
        fsl-sec2 hardware crypt accelerator model3a ver 0.02 enabled

        XSCF initial process start (pid=106)

        load /scf/modules/lites_ldrv.ko -- complete
        load /scf/modules/drvscftrace.ko -- complete
        load /scf/modules/sec2_rsa.ko -- complete
        load /scf/modules/sec2_md5.ko -- complete
        load /scf/modules/sec2_des.ko -- complete
        load /scf/modules/sec2_arc4.ko -- complete
        load /scf/modules/sec2_aes.ko -- complete
        load /scf/modules/sec2_sha256.ko -- complete
        load /scf/modules/sec2_sha1.ko -- complete
        load /scf/modules/hw_random.ko -- complete
        load /scf/modules/scsi_mod.ko -- complete
        load /scf/modules/sd_mod.ko -- complete
        load /scf/modules/usbcore.ko -- complete
        load /scf/modules/ohci-hcd.ko -- complete
        load /scf/modules/usb-storage.ko -- complete
        load /scf/modules/drvbootfmem.ko -- complete
        load /scf/modules/drvmbc.ko -- complete

        ***** WARNING *****
        XSCF initialization terminate,
        because there is no XSCF-Firmware in this XSCF board.
        Please install XSCF-Firmware.

        *** SCF_INIT was set FACTORY mode automatically. ***

        login: default
        login: cannot run /scf/bin/rbash: No such file or directory

        login:

        ----------------------------------------------------------------------

        Do you have any suggestions on how to fix this? Or do you think that we have to order a new XSCF board?

        • I would suggest upgrading the firmware, but it looks like you are unable to log in, even as the default user. Try re-installing the original timekeeper chip or replacing it with another new one. If the problem still persists, then order a new XSCF board.
