DISCLAIMER: Please note that the blog owner takes no responsibility of any kind for any data loss or damage caused by trying any of the commands or methods mentioned in this blog. You use the commands, methods, and scripts at your own risk. If you find something useful, a comment would be appreciated, to let other viewers know that the solution or method worked for you.
To Do List - Before migration from AIX 5.3 to 6.1
- Copy sendmail.cf
- Tar the Perl modules if you use them in 5.3; Perl will be upgraded when going to 6.1 and you might want to fall back to the older version
- Make sure you note the number of allowed environment processes; this will fall back to the default, which is 128 allowed processes. **THIS IS CRITICAL**
- Copy the MOTD ("Message of the Day")
- Make sure RSH is turned on, since you use NIMSH/RSH to upgrade
- Make sure you note the NIC tuning (TCP send/receive); you will have to put it back afterwards
- Memory tuning (no -a): please review after the upgrade. Upgrading to 6.1 will always take the best tuning parameters, but review them to make sure they are good
- After the upgrade, run the following commands to make sure your 6.1 OS is consistent: "lppchk -v" and "lppchk -c"
- Copy "/etc/security/limits" (the ulimit settings) and make sure the values are the same after the upgrade
- Update the NMON script and cron job to reference the new AIX 6.1
Please note: If you copy or tar files, do it before your nimadm migration, so that the copies are there when you reboot to the 6.1 disk. Also, when you put the files back, take note of the permissions, owner, and group of the files.
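The copy step in the checklist above could be scripted roughly as follows. This is only a sketch: the function name backup_configs, the MANIFEST.txt file name, and the backup directory in the example call are assumptions made for illustration, not part of any AIX tooling.

```shell
#!/bin/sh
# Sketch: copy config files into a backup directory and record their
# permissions, owner and group so they can be compared after the migration.
# backup_configs and MANIFEST.txt are hypothetical names.

backup_configs() {
    backup_dir=$1
    shift
    mkdir -p "$backup_dir" || return 1
    for f in "$@"; do
        if [ -e "$f" ]; then
            # cp -p keeps the mode and modification time of the original
            cp -p "$f" "$backup_dir/" || return 1
            # ls -ld records permissions, owner and group in the manifest
            ls -ld "$f" >> "$backup_dir/MANIFEST.txt"
        else
            echo "skipped (not found): $f" >&2
        fi
    done
}

# Example call (the backup directory is an assumed location):
# backup_configs /backup/pre61 /etc/motd /etc/security/limits
```

After the migration, the manifest gives you the owner, group and mode to restore if a copied file has to be put back.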
Two ways to create mksysb images in AIX
1) Create the image from the NIM server with this command:
nim -o define -t mksysb -a server=master -a source=<server name> -a mk_image=yes -a location=<location of the store image> <mksysb image name>
This will create the mksysb image of the client server and define it on the NIM server.
Example:
nim -o define -t mksysb -a server=master -a source=edppbuslvd01 -a mk_image=yes -a location=/nim/mksysb/edppbuslvd01_6100-04-03-05112010 edppbuslvd01_6100-04-03-05112010
server=master: server to store image, in this case is master
source=edppbuslvd01: the source of the image, which is client
location: the location of the stored mksysb image
2) Create the image on the client machine, then either copy it to the NIM server or write it to a filesystem NFS-mounted from the NIM server, and define it on the NIM server.
Let's say you have successfully NFS-mounted the NIM server filesystem on the client machine as /mnt.
mksysb -ieX /mnt/edppbuslvd01_6100-04-03-05112010
-e: exclude the files/directories listed in /etc/exclude.rootvg
-i: call the mkszfile command to generate the /image.data file
The /image.data file contains information on volume groups, logical volumes, file systems, paging space, and physical volumes.
This information is included in the backup for future use by the installation process.
-X: automatically expand /tmp if necessary
After the mksysb image is created, you need to define it on the NIM server.
nim -o define -t mksysb -a server=master -a location=<image location> <image name>
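The example resource name above appears to combine the client host name, an OS level, and a date stamp. Assuming that convention, a small helper could build the name and print the matching NIM command for review before you run it. The function names are hypothetical, and /nim/mksysb is simply the directory used in the example.

```shell
#!/bin/sh
# Sketch: build "<host>_<level>-<date>" like the example resource name.
# Both function names are hypothetical helpers, not NIM commands.

mksysb_image_name() {
    # $1 = client host name, $2 = OS level string, $3 = date stamp
    printf '%s_%s-%s\n' "$1" "$2" "$3"
}

# Print (do not run) the NIM command that creates and defines the image.
print_nim_define() {
    name=$(mksysb_image_name "$1" "$2" "$3")
    printf 'nim -o define -t mksysb -a server=master -a source=%s -a mk_image=yes -a location=/nim/mksysb/%s %s\n' \
        "$1" "$name" "$name"
}

# Example:
# print_nim_define edppbuslvd01 6100-04-03 05112010
```

Printing the command first makes it easy to check the location and resource name before anything is created on the NIM master.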
Steps to remove PowerPath software, cleanup ODM and reinstall PowerPath
varyoff Volume Group (varyoffvg <VGNAME>)
/etc/rc.agent stop (if you have clariion devices)
Remove paths from Powerpath configuration
powermt remove hba=all
Delete all hdiskpower devices
lsdev -Cc disk -Fname | grep power | xargs -n1 rmdev -dl
Remove the PowerPath driver instance
rmdev -dl powerpath0
Delete all hdisk devices
- for Symmetrix devices, use this command:
lsdev -CtSYMM* -Fname | xargs -n1 rmdev -dl
- for CLARiiON devices, use this command:
lsdev -CtCLAR* -Fname | xargs -n1 rmdev -dl
Confirm with lsdev -Cc disk that there are no EMC hdisks or hdiskpowers
***If needed:
odmdelete -q name=powerpath0 -o CuDv
odmdelete -q name=powerpath0 -o CuAt
rm /dev/powerpath0
odmget CuDv |grep hdisk
odmdelete -q name=xxxxx -o CuDv (value you get from above NOT ROOTVG DISK)
odmget CuAt |grep hdisk
odmdelete -q name=xxxxx -o CuAt (value you get from above NOT ROOTVG DISK)
odmget CuDvDr|grep hdisk
odmdelete -q value3=xxxxxxxx -o CuDvDr (value you get from above NOT ROOTVG DISK)
odmget CuVPD|grep hdisk
odmdelete -q name=xxxxxxx -o CuVPD (value you get from above NOT ROOTVG DISK)
odmget CuDvDr|grep hdisk
odmdelete -q value3=xxxxxxx -o CuDvDr (value you get from above NOT ROOTVG DISK)
cd /dev
rm hdiskxxxx (NOT ROOTVG DISK)
rm rhdiskxxxx (NOT ROOTVG DISK)
rm hdiskpower*
rm rhdiskpower*
savebase -v
***
Remove all fibre channel driver instances:
rmdev -Rdl fscsiX ---> X being the driver instance number, i.e. 0, 1, 2, etc.
Verify through lsdev -Cc driver that there are no more fibre channel driver instances (fscsi)
Change the adapter instances to the Defined state:
rmdev -l fcsX ---> X being the adapter instance number, i.e. 0, 1, 2, etc.
Create the hdisk entries for all EMC devices
--> remove Clarrays definition here.
--> install ODM definition. (if you are reinstalling ODM)
emc_cfgmgr or cfgmgr -vl fcsX ---> X being each adapter instance which was rebuilt
Skip this part if no PowerPath.
Configure all EMC devices into PowerPath:
powermt config
Check the system to see if it now displays correctly:
powermt display
powermt display dev=all
lsdev -Cc disk
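Because several of the cleanup steps above must never touch a rootvg disk, it can help to list the candidate disks first. The sketch below filters lspv-style output (disk name, PVID, volume group, state) and prints only disks that are not in rootvg. The function name is a hypothetical helper, and the PVIDs in the comment are made-up sample data.

```shell
#!/bin/sh
# Sketch: read `lspv`-style lines on stdin and print every disk whose
# volume group column is not rootvg. non_rootvg_disks is a hypothetical name.

non_rootvg_disks() {
    awk '$3 != "rootvg" { print $1 }'
}

# Sample input (PVIDs are invented for illustration):
#   hdisk0       00c8b12ce3c7d496  rootvg  active
#   hdiskpower0  00c8b12c1e4b9a02  datavg  active
#   hdisk4       none              None
#
# On the AIX host you would run:
#   lspv | non_rootvg_disks
```

Review the list by eye before passing any name to rmdev or odmdelete; the filter only looks at the volume group column and knows nothing about which disks are EMC devices.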
Unix filesystems explained
A filesystem is a logical collection of files on a partition or disk. A
partition is a container for information and can span an entire hard
drive if desired.
Everything in Unix is considered to be a file, including physical devices such as DVD-ROMs, USB devices, floppy drives, and so forth.
Unix uses a hierarchical file system structure, much like an
upside-down tree, with root (/) at the base of the file system and all
other directories spreading from there.
A UNIX filesystem is a collection of files and directories that has the following properties:
- It has a root directory (/) that contains other files and directories.
- Each file or directory is uniquely identified by its name, the directory in which it resides, and a unique identifier, typically called an inode.
- By convention, the root directory has an inode number of 2 and the lost+found directory has an inode number of 3. Inode numbers 0 and 1 are not used. File inode numbers can be seen by specifying the -i option to ls command.
- It is self contained. There are no dependencies between one filesystem and any other.
Directory | Description |
---|---|
/ | This is the root directory which should contain only the directories needed at the top level of the file structure. |
/bin | This is where the executable files are located. They are available to all users. |
/dev | Contains device files, the interfaces to the device drivers. |
/etc | Supervisor directory commands, configuration files, disk configuration files, valid user lists, groups, ethernet, hosts, where to send critical messages. |
/lib | Contains shared library files and sometimes other kernel-related files. |
/boot | Contains files for booting the system. |
/home | Contains the home directory for users and other accounts. |
/mnt | Used to mount other temporary file systems, such as cdrom and floppy for the CD-ROM drive and floppy diskette drive, respectively |
/proc | Contains all processes marked as a file by process number or other information that is dynamic to the system. |
/tmp | Holds temporary files used between system boots |
/usr | Used for miscellaneous purposes, or can be used by many users. Includes administrative commands, shared files, library files, and others |
/var | Typically contains variable-length files such as log and print files and any other type of file that may contain a variable amount of data |
/sbin | Contains binary (executable) files, usually for system administration. For example, the fdisk and ifconfig utilities. |
/kernel | Contains kernel files |
Flavors of UNIX
The table below summarizes some of the common UNIX variants and clones. While the table lists about forty different variants, the UNIX world isn't nearly as diverse as it used to be. Some of them are defunct and are listed for historical purposes. Others are on their way out. In some cases, vendors have defected to Microsoft technology. In others, mergers and acquisitions have led to the consolidation of different UNIX implementations. A list of "dead" UNIX implementations would be substantial indeed, consisting of hundreds of variations on the letters "U," "I," and "X" (CLIX, CX/UX, MV/UX, SINIX, VENIX, etc.).
RAM disk in AIX
AIX provides the 'mkramdisk' command for creating a disk that resides in RAM, for very I/O-intensive applications such as databases.
Here is a simple set of commands to create a RAM disk and a filesystem on top of it:
1. Create a RAM disk, specifying the size
# mkramdisk 5G
The system will assign the next available RAM disk. Since this is the first one, it will be called ramdisk0.
If there isn't sufficient available memory, the mkramdisk command will warn about this during creation.
2. Check for the new disk
# ls -l /dev | grep -i ram
3. Create and mount a filesystem on top of the RAM disk (x is the RAM disk number; the mount point must already exist)
/sbin/helpers/jfs2/mkfs -V jfs2 /dev/ramdiskx
mount -V jfs2 -o log=NULL /dev/ramdiskx /ramdiskx
The new filesystem will now be available like any other filesystem.
To remove a RAM disk, unmount/remove the filesystem and use the 'rmramdisk' command to remove the RAM disk.
How to clear AIX NFS cache on a server
Do the following on a server that is having problems exporting NFS mounts
------------------------------------------------------------------------------------
1) Move the current exports file to another name
mv /etc/exports /etc/exports.old
2) Create a new exports file
touch /etc/exports
3) Unexport everything
exportfs -ua
4) Stop NFS
stopsrc -g nfs
5) Stop portmapper
stopsrc -s portmap
6) Change directory to /etc and remove or rename the following files if they exist.
cd /etc && rm -rf xtab state sm sm.bak rmtab
7) Change directory to /var/statmon and remove the status monitoring files.
cd /var/statmon && rm -rf state sm sm.bak
8) start the portmapper
startsrc -s portmap
9) start nfs
startsrc -g nfs
10) re-export what is left in /etc/exports
exportfs -va
11) refresh the inetd daemon subsystem
refresh -s inetd
12) Move the /etc/exports file that you backed up back in place.
mv /etc/exports.old /etc/exports
13) export all directories in /etc/exports
exportfs -a
------------------------------------------------------------------------------------
Procedure to mount and unmount NFS filesystems on AIX
1) Show what is being exported on the source server:
showmount -e
Note: If the command above does not show the correct mount points that need to be exported, you can run the following command to attempt to export the filesystems:
exportfs -a
2) To unmount a filesystem on the source server that is NFS-mounted on other systems:
a) Unmount the NFS mount points on the target servers.
umount (filesystems) (run on the target servers)
b) Unmount the filesystem on the source server once the target servers have unmounted it.
umount (filesystems)
3) Mount the NFS mount points on the target server:
a) mount (IP):(mount point) (mount point)
Replace failed mirrored internal disk in AIX
The following procedure should be used to replace a failed internal (boot) disk on AIX 5 or higher, with software mirroring.
(Note: in these examples, hdisk0 and hdisk1 are doubly-mirrored internal disks and members of rootvg; hdisk1 has failed)
1. Identify the failed disk by analyzing the errpt logs. Confirm the failure using lspv by checking if "PV State" is "Missing".
2. Break the mirror and remove the device from AIX:
# unmirrorvg rootvg hdisk1
# reducevg rootvg hdisk1
# rmdev -l hdisk1 -d
3. Confirm that the device is no longer present using lspv.
4. Replace the disk drive, letting the new device take the same device name (hdisk1).
5. Add the new device into rootvg:
# extendvg rootvg hdisk1
6. Re-mirror the volume group. No additional arguments are required to doubly-mirror the two internal disks.
# mirrorvg rootvg
7. Re-add the boot image to the new internal disk:
# bosboot -ad hdisk1
8. Re-add the new disk to the bootlist and confirm it is present:
# bootlist -m normal hdisk0 hdisk1
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk1 blv=hd5
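The final check in step 8 can also be done by a script instead of by eye. The sketch below reads output in the format shown above and succeeds only if every disk named as an argument is listed. The function name is a hypothetical helper, not an AIX command.

```shell
#!/bin/sh
# Sketch: succeed only if every disk given as an argument appears in
# `bootlist -m normal -o` style output read from stdin.
# bootlist_contains_all is a hypothetical helper name.

bootlist_contains_all() {
    # Keep only the first column (the device names)
    list=$(awk '{ print $1 }')
    for disk in "$@"; do
        echo "$list" | grep -qx "$disk" || return 1
    done
    return 0
}

# On the AIX host you would run something like:
#   bootlist -m normal -o | bootlist_contains_all hdisk0 hdisk1 && echo "bootlist OK"
```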
Linux boot process
In this topic we will discuss the Linux boot sequence in depth: how does a Linux system boot?
This will help Unix administrators troubleshoot boot-up problems.
Before discussing it, I will note down the major components responsible for the boot process.
1.BIOS(Basic Input/Output System)
2.MBR(Master Boot Record)
3.LILO or GRUB
LILO:-LInux LOader
GRUB:-GRand Unified Bootloader
4.Kernel
5.init
6.Run Levels
1.BIOS:
i.When we power on, the BIOS performs a Power-On Self-Test (POST) on the different hardware components in the system to make sure everything is working properly.
ii.It also checks whether the computer is being started from an off position (cold boot) or from a restart (warm boot).
iii.It retrieves information from CMOS (Complementary Metal-Oxide Semiconductor), a battery-operated memory chip on the motherboard that stores the time, date, and critical system information.
iv.Once the BIOS sees that everything is fine, it begins searching for a valid master boot sector on the available drives, such as hard disks, CD-ROM drives, etc.
v.Once the BIOS finds a valid MBR, it loads and executes that 512-byte boot sector, which is the first sector ("Sector 0") of a partitioned data storage device such as a hard disk or CD-ROM.
2.MBR
i.Normally we use a multi-level boot loader. Here, MBR refers to the DOS MBR.
ii.After the BIOS executes a valid DOS MBR, the DOS MBR searches for a valid primary partition marked as bootable on the hard disk.
iii.If the MBR finds a valid bootable primary partition, it executes the first 512 bytes of that partition, which is the second-level MBR.
iv.In Linux we have two types of the above-mentioned second-level MBR, known as LILO and GRUB.
3.LILO
i.LILO is a Linux boot loader which is too big to fit into a single 512-byte sector.
ii.So it is divided into two parts: an installer and a runtime module.
iii.The installer module places the runtime module on the MBR. The runtime module has the info about all operating systems installed.
iv.When the runtime module is executed, it selects the operating system to load and transfers control to the kernel.
v.LILO does not understand filesystems; it treats the boot images to be loaded as raw disk offsets.
GRUB
i.The GRUB MBR consists of 446 bytes of primary bootloader code and 64 bytes of partition table.
ii.GRUB locates all the operating systems installed and presents a menu for selecting the operating system to be loaded.
iii.Once the user selects the operating system, GRUB passes control to the kernel of that operating system.
See below for the differences between LILO and GRUB.
4.Kernel
i.Once GRUB or LILO transfers control to the kernel, the kernel does the following tasks:
- Initialises devices and loads the initrd module
- Mounts the root filesystem
5.init
i.The kernel, once it is loaded, finds init in /sbin (/sbin/init) and executes it.
ii.Hence the first process started in Linux is the init process.
iii.The init process reads the /etc/inittab file, sets the path, starts swapping, checks the file systems, and so on.
iv.It runs all the boot scripts (/etc/rc.d/*, /etc/rc.boot/*).
v.It starts the system at the run level specified in the file /etc/inittab.
6.Runlevel
i.There are 7 run levels in which the Linux OS runs, and different run levels serve different purposes. The descriptions are given below.
- 0 – halt
- 1 – Single user mode
- 2 – Multiuser, without NFS (The same as 3, if you don’t have networking)
- 3 – Full multiuser mode
- 4 – unused
- 5 – X11
- 6 – Reboot
Now, as per our setting in /etc/inittab, the operating system boots up and finishes the boot-up process.
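To see which run level a system using /etc/inittab will boot into, you can look for the initdefault entry. The sketch below assumes the usual id:runlevel:action:process layout of inittab lines; both function names are hypothetical helpers, and the descriptions are the ones from the list above.

```shell
#!/bin/sh
# Sketch: print the default run level from inittab-format text on stdin.
# default_runlevel and runlevel_name are hypothetical helper names.

default_runlevel() {
    # Skip comment lines; the initdefault record carries the level in field 2
    awk -F: '$1 !~ /^#/ && $3 == "initdefault" { print $2; exit }'
}

# Map a run level number to the description used in the list above.
runlevel_name() {
    case $1 in
        0) echo "halt" ;;
        1) echo "Single user mode" ;;
        2) echo "Multiuser, without NFS" ;;
        3) echo "Full multiuser mode" ;;
        4) echo "unused" ;;
        5) echo "X11" ;;
        6) echo "Reboot" ;;
        *) echo "unknown" ;;
    esac
}

# Example:
#   runlevel_name "$(default_runlevel < /etc/inittab)"
```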
Below are some important differences between LILO and GRUB:
LILO | GRUB |
---|---|
LILO has no interactive command interface | GRUB has interactive command interface |
LILO does not support booting from a network | GRUB does support booting from a network |
If you change your LILO config file, you have to rewrite the LILO stage one boot loader to the MBR | GRUB automatically detects any change in config file and auto loads the OS |
LILO supports only linux operating system | GRUB supports large number of OS |
AIX Boot Process
The boot process has three phases:
1. ROS kernel init phase
2. Base device configuration
3. System boot phase
1. ROS kernel init phase (PHASE 1)
A. POST (power-on self-test): basic hardware checking is done.
B. Then it goes to NVRAM and checks the boot list for the last boot device (hdisk0 or hdisk1).
C. Then it checks for the BLV (hd5) on the boot device.
D. Then it checks the boot image.
E. Then the boot image is moved into memory.
F. Then the kernel executes.
2. Base device configuration (PHASE 2)
A. Here cfgmgr runs for device configuration.
3. System boot phase (PHASE 3)
A. The kernel executes.
B. The paging space (hd6) is started.
C. Then the following file systems are mounted: /, /var, /usr, /home, /tmp
D. The kernel starts the init process, which reads the /etc/inittab file and executes the following:
/etc/rc.boot
srcmstr
/etc/rc.tcpip
/etc/rc.net
The network-related files above, /etc/rc.tcpip and /etc/rc.net, are used to configure the IP address and routing.
E. Then it starts the system at the default run level, 2.
NOTE:
Run level 2: It contains all of the terminal processes and daemons that run in the multi-user environment. This is the default run level.
The /etc/inittab file contains four colon-separated fields, in this order: 1. Identifier, 2. RunLevel, 3. Action, 4. Command
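An inittab record has the form identifier:runlevel:action:command, and the command part may itself contain colons. The sketch below splits one record on the first three colons only. The function name is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: split one inittab record into its four fields.
# Only the first three colons are separators; anything after them,
# including further colons, belongs to the command.
# split_inittab_record is a hypothetical helper name.

split_inittab_record() {
    rec=$1
    id=${rec%%:*};      rest=${rec#*:}
    level=${rest%%:*};  rest=${rest#*:}
    action=${rest%%:*}; cmd=${rest#*:}
    printf 'identifier=%s\nrunlevel=%s\naction=%s\ncommand=%s\n' \
        "$id" "$level" "$action" "$cmd"
}

# Example:
#   split_inittab_record 'xcmd:2:respawn:find / -type f > /dev/null 2>&1'
```

On AIX the record for an identifier can be obtained with lsitab, so `split_inittab_record "$(lsitab xcmd)"` would be one way to feed it.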
Modifying /etc/inittab entries without using vi
These are the steps for modifying /etc/inittab without using Vi editor in AIX.
Before editing, copy the inittab file to a file named inittab.old:
cp -p /etc/inittab /etc/inittab.old
mkitab ---> Adds records to the /etc/inittab file.
# mkitab "xcmd:2:respawn:find / -type f > /dev/null 2>&1"
lsitab ---> Lists records in the /etc/inittab file.
# lsitab xcmd
chitab ---> Changes records in the /etc/inittab file.
# chitab "xcmd:2:once:find / -type f > /dev/null 2>&1"
rmitab ---> Removes records from the /etc/inittab file.
# rmitab xcmd
# lsitab xcmd
Replacing Faulty Disk in ROOTVG
Analyzing Disk Fault
The first signs that a hard disk is going faulty are temporary error log messages in the Error Reporter. If you see random temporary errors, you don't have an immediate problem, but if you start to see a bundle of temporary errors, the disk will need replacing. The worst-case scenario is a permanent error against a hard disk, and stale partitions.
Check how many errors have been logged and whether they are permanent or temporary with:
errpt | more
1581762B 0727203502 T H hdisk0 DISK OPERATION ERROR
1581762B 0727203502 P H hdisk0 DISK OPERATION ERROR
The first error log message shows a temporary disk problem on hdisk0, whilst the second shows a permanent error, also on hdisk0. The procedures for replacing hdisk0 and hdisk1 <part of rootvg> are slightly different. See the steps below.
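Counting the two kinds of error for one disk can be scripted. The sketch below assumes the errpt summary columns are identifier, timestamp, type (T or P), class, resource name, and description, as in the sample lines above. The function name is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: count permanent (P) and temporary (T) errors for one resource
# in errpt summary output read from stdin.
# errpt_counts is a hypothetical helper name.

errpt_counts() {
    awk -v res="$1" '
        $5 == res && $3 == "P" { p++ }
        $5 == res && $3 == "T" { t++ }
        END { printf "permanent=%d temporary=%d\n", p, t }
    '
}

# On the AIX host you would run:
#   errpt | errpt_counts hdisk0
```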
To check for stale partitions, run the command: lsvg -l rootvg
rootvg:
LV NAME   TYPE     LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5       boot     1     2     2     closed/syncd   N/A
hd6       paging   64    128   2     open/syncd     N/A
hd8       jfslog   1     2     2     open/stale     N/A
hd4       jfs      4     8     2     open/stale     /
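With many logical volumes, picking out the stale ones by eye is error-prone. The sketch below filters lsvg -l style output and prints only the logical volumes whose state contains "stale". It assumes the state is the sixth column of each data row, as in the listing above, and the function name is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print the logical volumes whose LV STATE contains "stale",
# reading `lsvg -l <vg>` style output from stdin.
# stale_lvs is a hypothetical helper name.

stale_lvs() {
    awk '$6 ~ /stale/ { print $1 }'
}

# On the AIX host you would run:
#   lsvg -l rootvg | stale_lvs
```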
Steps for replacing faulty disks in other volume groups are much simpler than replacing disks in rootvg. I have written a procedure for this below also.
For procedures on replacing a faulty SSA disk, refer to the link
Replacing hdisk0 in rootvg
Change bootlist
bosboot -a -d hdisk1 ---> Make sure hdisk1 has a boot image
bootlist -m normal hdisk1 hdisk0 ---> Change the bootlist so the system will use hdisk1 before hdisk0
Removing Primary Dump Device
sysdumpdev -l ---> The primary dump device will always be on hdisk0; this will need to be changed
primary /dev/pdumplv
secondary /dev/sdumplv
copy directory /var/adm/dump
forced copy flag FALSE
always allow dump TRUE
dump compression ON
sysdumpdev -Pp /dev/hd6 ---> Changes the primary dump device
primary /dev/hd6
secondary /dev/sdumplv
copy directory /var/adm/dump
forced copy flag FALSE
always allow dump TRUE
dump compression ON
rmlv pdumplv ---> Remove the logical volume pdumplv, the primary dump device
Un-Mirroring Hard Disk from VG
Now you need to un-mirror the volume group so the disk can be removed. There are two ways you can do this: one runs at the disk level and the other at the logical partition level. The outcome is the same with both commands, but with the second you have more control.
Method One
unmirrorvg rootvg hdisk0 ---> Unmirrors the disk.
NB: Sometimes this is unstable, especially if you have stale partitions. I have also noticed that if pdumplv is mirrored <it shouldn't be by default>, this command will fail. In this instance, unmirror the logical volume and then run the unmirrorvg command; alternatively, follow the method below.
Method Two
lsvg -l rootvg ---> Lists all logical volumes in rootvg
rootvg:
LV NAME   TYPE     LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5       boot     1     2     2     closed/syncd   N/A
hd6       paging   64    128   2     open/syncd     N/A
hd8       jfslog   1     2     2     open/syncd     N/A
hd4       jfs      4     8     2     open/syncd     /
rmlvcopy LVNAME 1 hdisk0 ---> Run this command for each logical volume
e.g: rmlvcopy hd5 1 hdisk0
Check the disk has been unmirrored with: lsvg -l rootvg. For each LV, the PVs column will show 1
rootvg:
LV NAME   TYPE     LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5       boot     1     2     1     closed/syncd   N/A
hd6       paging   64    128   1     open/syncd     N/A
hd8       jfslog   1     2     1     open/syncd     N/A
hd4       jfs      4     8     1     open/syncd     /
Make a note of the SCSI id and serial number, which will make the CE's life easier when he has to remove the disk. In the example below, the SCSI id is <8> and the serial number is <4DFJY156>. The command you need to run is: lscfg -vl hdisk0
DEVICE LOCATION DESCRIPTION
hdisk0 10-88-00-8,0 16 Bit LVD SCSI Disk Drive <9100 MB>
Manufacturer............................IBM
Machine Type and Model......DDYS-T09170M
FRU Number...........................00P1517
ROS Level and ID...................53394841
Serial Number.........................4DFJY156
EC Level...................................F79924
Part Number............................07N3852
Device Specific.<Z0>...............000003029F00013A
Device Specific.<Z1>...............07N4925
Device Specific.<Z2>...............0933
Device Specific.<Z3>...............00315
Device Specific.<Z4>...............0001
Device Specific.<Z5>...............22
Device Specific.<Z6>...............F79924
Remove the Disk from VG
reducevg rootvg hdisk0 ---> Remove hdisk0 from the volume group
rmdev -l hdisk0 -d ---> Remove the definition of hdisk0 from the system
lsvg rootvg ---> Ensure the disk is removed
lspv hdisk0 ---> Ensure the disk is removed
Now remove the disk physically and add the new disk.
Add the New Disk to the System
cfgmgr ---> Now run the Configuration Manager to add the new disk to the system
diag ---> Then go into diagnostics to update the system log so the system is aware that hdisk0 has been replaced
Task Selection -> Log Repair Action -> hdisk0
Esc 0 ---> To exit diagnostics after Log Repair Action has completed.
errpt | more ---> Check that the Log Repair Action has taken place. You should see an entry like:
2F3E09A4 0819110902 I H hdisk2 REPAIR ACTION
diag ---> Go back into diagnostics and certify this disk. This will indicate whether the new disk is OK
Task Selection -> Certify the disk -> hdisk0 ---> Commit the changes and exit by pressing F3
Esc 0 ---> To exit diagnostics after certifying the new disk
Add disk into the Volume Group
extendvg rootvg hdisk0 ---> Add the disk into the volume group rootvg
Now you need to re-mirror the disk. Again, you can mirror at a disk level or at a logical level.
Re-Mirroring Hard Disk
Method One
mirrorvg rootvg hdisk0 ---> Mirrors the disk
syncvg -v rootvg ---> Synchronizes the volume group and the data contained within it
NB: This method will mirror the logical volume pdumplv. Unmirror the logical volume with:
rmlvcopy pdumplv 1 hdisk1
Method Two
lsvg -l rootvg ---> Lists all the logical volumes to re-mirror
mklvcopy -k LVNAME 2 hdisk0 ---> Run this command for each logical volume. This will also synchronize the data <-k>
e.g: mklvcopy -k hd5 2 hdisk0
NB: Do not mirror the logical volume pdumplv
syncvg -v rootvg ---> Synchronizes the volume group and the data contained within it
lsvg -l rootvg ---> Check rootvg has been mirrored and the status is open/syncd
Check the volume group has been completely re-mirrored with: lsvg -l rootvg. The PVs column should show 2 for each LVNAME apart from pdumplv & sdumplv
rootvg:
LV NAME   TYPE     LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5       boot     1     2     2     closed/syncd   N/A
hd6       paging   64    128   2     open/syncd     N/A
hd8       jfslog   1     2     2     open/syncd     N/A
hd4       jfs      4     8     2     open/syncd     /
mklv -y 'pdumplv' rootvg 40 hdisk0 ---> Re-create the logical volume for your primary dump device
sysdumpdev -Pp /dev/pdumplv ---> Re-allocate your primary dump device.
primary /dev/pdumplv
secondary /dev/sdumplv
copy directory /var/adm/dump
forced copy flag FALSE
always allow dump TRUE
dump compression ON
bosboot -a -d hdisk0 ---> Update the boot image on hdisk0
bootlist -m normal hdisk0 hdisk1 ---> Change your boot list back.