This document describes a procedure to attempt to recover from an IPL hang at LED 553. It applies to AIX Version 4.
An LED value of 553 is a checkpoint code indicating that the system is transitioning to phase 3 of IPL. A halt or hang at LED 553 is often the result of a corrupted or missing /etc/inittab file. It can also be caused by a full / (root) or /tmp file system, or by inconsistencies in startup configuration files, Object Data Manager (ODM) object class databases, or system library files. Other issues involving file permissions, invalid hard links in the root file system, and so on have also been observed to cause a hang at LED 553.
Please refer to your system user's guide or installation and service guide for specific IPL procedures related to your type and model of hardware. For more information, you can also refer to the document titled "Booting in Service Mode", available at http://techsupport.services.ibm.com/rs6k/techbrowse.
Follow the screen prompts to the Welcome to Base OS menu.
The next screen displays a warning that you will not be able to return to the Base OS menu without rebooting.
The next screen displays information about all volume groups on the system.
If you get errors from the preceding option, do not continue with the rest of this procedure. Correct the problem causing the error. If you need assistance correcting the problem causing the error, contact one of the following:
If no errors occur, proceed with the following steps.
fsck /dev/hd4
fsck /dev/hd2
fsck /dev/hd3
fsck /dev/hd9var
fsck /dev/hd1
NOTE: The -y option gives the fsck command permission to repair file system corruption when necessary. This flag avoids having to answer multiple confirmation prompts manually; however, it can cause permanent data loss in some situations.
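The file system checks above can also be run as a loop that stops at the first failure. This is only a sketch: the function name check_filesystems and the FSCK variable are illustrative and not part of AIX. Setting FSCK=echo gives a dry run that only prints the device names that would be checked.

```shell
# Sketch only: check_filesystems and FSCK are hypothetical names.
# FSCK defaults to the real fsck command; set FSCK=echo for a dry run.
check_filesystems() {
    for lv in hd4 hd2 hd3 hd9var hd1; do
        # Stop at the first file system for which the check fails.
        ${FSCK:-fsck} "/dev/$lv" || {
            echo "check failed on /dev/$lv" >&2
            return 1
        }
    done
    return 0
}
```

The -y flag is deliberately left out of the sketch, so each repair still has to be confirmed by hand.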
/usr/sbin/logform /dev/hd8
Answer yes when asked if you want to destroy the log.
df /dev/hd3
df /dev/hd4
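Because a full / (root) or /tmp file system is one of the causes listed at the start of this document, the df output is worth scanning for a usage figure at or near 100%. The following is a minimal sketch of such a scan. The helper name full_filesystems and the default threshold of 100 are assumptions, and the helper simply looks at the first percentage column on each line rather than relying on a fixed column position.

```shell
# Sketch only: full_filesystems is a hypothetical helper, and the 100%
# default threshold is an assumption.
# Reads df output on standard input and prints every line whose first
# percentage column is at or above the threshold.
full_filesystems() {
    awk -v limit="${1:-100}" '
        NR > 1 {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print
                    break
                }
            }
        }'
}

# Example: df /dev/hd3 /dev/hd4 | full_filesystems 95
```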
TERM=xxx
export TERM
Now use an editor to create the /etc/inittab file. For an example, see the section "Sample /etc/inittab file" in this document. If your /etc/inittab file was corrupt and you recreated it, the following steps may not be necessary.
There are only three entries which must be in the /etc/inittab file to successfully boot the system. If your /etc/inittab file is missing or corrupted AND you are unable to use an editor while in Service mode, do the following to create a minimal inittab file to boot the machine into run level 2 (Normal mode).
mv /etc/inittab /etc/inittab.MMDDYY
touch /etc/inittab
chmod 500 /etc/inittab
chown root:system /etc/inittab
echo 'init:2:initdefault:' >> /etc/inittab
echo 'brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1' >> /etc/inittab
echo 'cons:0123456789:respawn:/etc/getty /dev/console' >> /etc/inittab
MMDDYY represents the current date as a two-digit month, day, and year, in that order.
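The same steps can be collected into one function so that they can be tried on a scratch file first. This is a sketch under stated assumptions: write_minimal_inittab is a hypothetical name, the shell is assumed to be a Korn or POSIX shell, and the chown root:system step is left out, so it should still be run on the real file.

```shell
# Sketch only: write_minimal_inittab is a hypothetical helper. The target
# path is an argument so the function can be tried on a scratch file
# before it is pointed at /etc/inittab.
write_minimal_inittab() {
    target=$1
    # Keep any existing file under a MMDDYY-stamped name.
    if [ -f "$target" ]; then
        mv "$target" "$target.$(date +%m%d%y)" || return 1
    fi
    # The three required entries, in the order given above.
    {
        echo 'init:2:initdefault:'
        echo 'brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1'
        echo 'cons:0123456789:respawn:/etc/getty /dev/console'
    } > "$target" || return 1
    chmod 500 "$target"
}

# Example: write_minimal_inittab /tmp/inittab.test
```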
NOTE: The /.kshrc and /.profile files are not necessary for the system to boot into run level 2 (Normal mode) and, in fact, may not exist on your system.
ls -al /.kshrc /.profile /etc/environment /etc/profile
Sample output:
-rw-r--r-- 1 root system   71 Dec 14 1993 /.kshrc
-rw-r--r-- 1 root system  158 Dec 14 1993 /.profile
-rw-rw-r-- 1 root system 1389 Oct 26 1993 /etc/environment
-rw-r--r-- 1 root system 1214 Jan 22 1993 /etc/profile
/etc/profile or /.profile may contain a command that is valid only in the Korn shell. Change the command to something that is also valid in the Bourne shell. For example, change the following:
export PATH=/bin:/usr/bin/:/etc:/usr/ucb:.
to the following:
PATH=/bin:/usr/bin/:/etc:/usr/ucb:.
export PATH
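To locate lines of this kind in a profile, a simple pattern search may be enough. The sketch below is illustrative only: find_combined_exports is a hypothetical name, and the pattern covers just the combined "export NAME=value" form shown above, not every Korn shell construct.

```shell
# Sketch only: find_combined_exports is a hypothetical helper.
# Prints, with line numbers, each line of the given file that assigns and
# exports a variable in one statement (export NAME=value).
find_combined_exports() {
    grep -n -E '^[[:space:]]*export[[:space:]]+[A-Za-z_][A-Za-z0-9_]*=' "$1"
}

# Example: find_combined_exports /etc/profile
```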
/etc/environment is a special case. The only commands it may contain are simple variable assignments, such as statements of the form [varname]=[value]. Check this file with an editor to verify the format. See the section "Sample /etc/environment file" at the end of this document.
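As a complement to checking the file in an editor, the format rule can be sketched as a pattern check. The helper name check_environment_file is hypothetical, and the check is an approximation: it accepts blank lines, comment lines beginning with #, and lines of the form NAME=value, and reports everything else.

```shell
# Sketch only: check_environment_file is a hypothetical helper.
# Prints, with line numbers, every line that is not blank, not a comment,
# and not of the form NAME=value. Returns 0 if no such line is found.
check_environment_file() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    if grep -n -v -E '^([A-Za-z_][A-Za-z0-9_]*=.*|#.*|[[:space:]]*)$' "$1"; then
        return 1
    fi
    return 0
}

# Example: check_environment_file /etc/environment
```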
ls -al /bin /bin/bsh /bin/sh /lib /unix /u
Sample output:
lrwxrwxrwx 1 root sys       8 Aug 5 1994 /bin -> /usr/bin
-r-xr-xr-x 3 bin  bin  256224 Jun 4 1993 /bin/bsh
-r-xr-xr-x 3 bin  bin  256224 Jun 4 1993 /bin/sh
lrwxrwxrwx 1 root sys       8 Aug 5 1994 /lib -> /usr/lib
lrwxrwxrwx 1 root sys       5 Aug 5 1994 /u -> /home
lrwxrwxrwx 1 root sys      18 Aug 5 1994 /unix -> /usr/lib/boot/unix
If any of these files are missing, the problem may be a missing symbolic link. Use the commands from the following list that correspond to the missing links.
ln -s /usr/bin /bin
ln -s /usr/lib/boot/unix /unix
ln -s /usr/lib /lib
ln -s /home /u
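The comparison between the listing and the expected links can be sketched as a small check that prints the ln command needed for each missing link. The names check_link and check_all_links are hypothetical, and the sketch assumes that ls -ld shows a symbolic link as "link -> target", as in the sample output above.

```shell
# Sketch only: check_link and check_all_links are hypothetical helpers.
# check_link reports whether a path is a symbolic link to the expected
# target and, if not, prints the ln command that would recreate it.
check_link() {
    link=$1
    target=$2
    if [ -h "$link" ]; then
        # ls -ld prints "... link -> target" for a symbolic link.
        actual=$(ls -ld "$link" | sed 's/.* -> //')
        if [ "$actual" = "$target" ]; then
            echo "ok: $link -> $target"
            return 0
        fi
    fi
    echo "missing or wrong: ln -s $target $link"
    return 1
}

# The four links named above.
check_all_links() {
    status=0
    check_link /bin /usr/bin || status=1
    check_link /unix /usr/lib/boot/unix || status=1
    check_link /lib /usr/lib || status=1
    check_link /u /home || status=1
    return $status
}
```

The sketch only prints the suggested commands; it does not create any links itself.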
ls -l /etc/fsck /sbin/rc.boot
Sample output:
lrwxrwxrwx 1 root system    14 Aug  5 1994 /etc/fsck -> /usr/sbin/fsck
-rwxrwxr-- 1 root system 33760 Aug 30 1993 /sbin/rc.boot
brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1
See the section "Sample /etc/inittab file" in this document for an example.
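Whether a given inittab file contains the rc.boot entry shown above can be tested with a pattern search. This is a sketch only: has_phase3_entry is a hypothetical name, and the pattern assumes that commented-out lines begin with a colon or a number sign, so they are not counted.

```shell
# Sketch only: has_phase3_entry is a hypothetical helper.
# Succeeds if the file has an uncommented sysinit entry that runs
# /sbin/rc.boot 3.
has_phase3_entry() {
    grep -E '^[^:#]+:[^:]*:sysinit:/sbin/rc\.boot[[:space:]]+3([[:space:]]|$)' "$1" > /dev/null
}

# Example:
#   has_phase3_entry /etc/inittab || echo "rc.boot 3 entry not found"
```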
cp /bin/bsh /bin/bsh.orig
cp /bin/ksh /bin/bsh
If you can then reboot successfully, this indicates that one of the profiles was causing problems for bsh. Check the profiles again by running the following:
/bin/bsh.orig /.profile
/bin/bsh.orig /etc/profile
/bin/bsh.orig /etc/environment
If you receive errors with any of the preceding commands, this indicates that there is a command in that profile that bsh cannot handle.
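A gentler first pass, which is a different technique from the commands above, is to have the shell parse each profile without executing it by using the -n flag. The sketch below uses a hypothetical helper named syntax_check_profiles. It finds syntax errors only, so a command that fails only when it is actually run may still go unreported, and the commands above remain the fuller test.

```shell
# Sketch only: syntax_check_profiles is a hypothetical helper. The first
# argument is the shell to test with; the rest are the files to check.
# The -n flag makes the shell read commands without executing them.
syntax_check_profiles() {
    sh_path=$1
    shift
    status=0
    for f in "$@"; do
        [ -f "$f" ] || continue    # a profile may legitimately be absent
        if "$sh_path" -n "$f" 2>/dev/null; then
            echo "ok: $f"
        else
            echo "syntax error: $f"
            status=1
        fi
    done
    return $status
}

# Example: syntax_check_profiles /bin/bsh.orig /.profile /etc/profile
```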
lppchk -c
lppchk -v
NOTE: These commands should not produce output. If they do, examine the messages to assess whether they point to a potential cause of the hang.
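Since any output from these commands is the signal to look for, a small wrapper can make the result explicit. This is an illustrative sketch: run_quiet_check is a hypothetical name, and it treats anything written to standard output or standard error as output.

```shell
# Sketch only: run_quiet_check is a hypothetical helper.
# Runs the given command, captures everything it prints, and reports
# whether it was silent.
run_quiet_check() {
    out=$("$@" 2>&1)
    if [ -n "$out" ]; then
        echo "output from: $*"
        echo "$out"
        return 1
    fi
    echo "no output from: $*"
    return 0
}

# Example:
#   run_quiet_check lppchk -c
#   run_quiet_check lppchk -v
```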
lslv -m hd5
Sample output:
hd5:N/A
LP    PP1  PV1       PP2  PV2       PP3  PV3
0001  0001 hdisk0
The disk number under the PV1 column is the disk name you should use to run the following two commands:
bosboot -ad /dev/hdisk0
bootlist -m normal hdisk0
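Reading the disk name from the PV1 column can be sketched with awk. The helper name boot_disk_from_lslv is hypothetical, and the sketch assumes the layout of the sample output: a header line beginning with LP, followed by one line per logical partition with the PV1 disk in the third column.

```shell
# Sketch only: boot_disk_from_lslv is a hypothetical helper. It assumes
# the layout shown in the sample output: a header line starting with LP,
# then one line per logical partition with the PV1 disk in column 3.
boot_disk_from_lslv() {
    awk '
        seen_header && NF >= 3 { print $3; exit }
        $1 == "LP" { seen_header = 1 }
    '
}

# Example (the bosboot and bootlist commands are printed, not run):
#   disk=$(lslv -m hd5 | boot_disk_from_lslv)
#   echo "bosboot -ad /dev/$disk"
#   echo "bootlist -m normal $disk"
```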
cfgmgr -vp 2
If the cfgmgr command hangs, this is likely the cause of the system hang. You may be able to stop the command by pressing Ctrl-C; however, a reboot is often required to get back into Service mode and continue troubleshooting the problem.
sync;sync;sync;reboot
If you followed all of the preceding steps and the system still stops at an LED 553 during a reboot in Normal mode, you may want to consider reinstalling your system from a recent backup. Isolating the cause of the hang could be excessively time-consuming and may not be cost-effective in your operating environment. It is possible, in the end, that isolation of the problem may indicate a restore or reinstall of AIX is necessary to correct it.
If you wish, you may pursue further system recovery assistance from one of the following:
Sample /etc/inittab file
: US Government Users Restricted Rights - Use, duplication or
: disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
:
: Note - initdefault and sysinit should be the first and second entry.
:
init:2:initdefault:
brc::sysinit:/sbin/rc.boot 3 >/dev/console 2>&1 # Phase 3 of system boot
powerfail::powerfail:/etc/rc.powerfail 2>&1 | alog -tboot > /dev/console # Power Failure Detection
load64bit:2:once:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs
rc:2:wait:/etc/rc 2>&1 | alog -tboot > /dev/console # Multi-User checks
fbcheck:2:wait:/usr/sbin/fbcheck 2>&1 | alog -tboot > /dev/console # run /etc/firstboot
srcmstr:2:respawn:/usr/sbin/srcmstr # System Resource Controller
rctcpip:2:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons
rcnfs:2:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
cron:2:respawn:/usr/sbin/cron
piobe:2:wait:/usr/lib/lpd/pio/etc/pioinit >/dev/null 2>&1 # pb cleanup
uprintfd:2:respawn:/usr/sbin/uprintfd
logsymp:2:once:/usr/lib/ras/logsymptom # for system dumps
pmd:2:wait:/usr/bin/pmd > /dev/console 2>&1 # Start PM daemon
diagd:2:once:/usr/lpp/diagnostics/bin/diagd >/dev/console 2>&1
dt:2:wait:/etc/rc.dt
cons:0123456789:respawn:/usr/sbin/getty /dev/console
Sample /etc/environment file
# @(#)18 1.21 src/bos/etc/environment/environment, cmdsh, bos430, ...
PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/local/netscape:/usr/local/bin
TZ=CST6CDT
LANG=en_US
LOCPATH=/usr/lib/nls/loc
MOZILLA_HOME=/local/netscape
export MOZILLA_HOME
NLSPATH=/usr/lib/nls/msg/%L/%N:/usr/lib/nls/msg/%L/%N.cat
LC__FASTMSG=true
PS1='MYSYSTEM $PWD=>'
set -o vi
# ODM routines use ODMDIR to determine which objects to operate on
# the default is /etc/objrepos - this is where the device objects
# reside, which are required for hardware configuration
ODMDIR=/etc/objrepos
[ Doc Ref: 90605189414750 Publish Date: Nov. 14, 2000 4FAX Ref: 4191 ]