DISCLAIMER: Please note that the blog owner takes no responsibility of any kind for any data loss or damage caused by trying any of the commands or methods mentioned in this blog. You use the commands, methods and scripts at your own risk. If you find something useful, a comment would be appreciated so that other readers know the solution or method worked for you.
Replacing Faulty Disk in ROOTVG
Analyzing Disk Fault
The first signs that a hard disk is going faulty are temporary error log messages in the error report (errpt). If you see occasional temporary errors, you don't have an immediate problem, but if you start to see a run of temporary errors the disk will need replacing. The worst-case scenario is a permanent error against a hard disk together with stale partitions.
Check how many errors have been logged, and whether they are permanent or temporary, with:
errpt | more
1581762B 0727203502 T H hdisk0 DISK OPERATION ERROR
1581762B 0727203502 P H hdisk0 DISK OPERATION ERROR
The first error log message shows a temporary disk problem on hdisk0, whilst the second shows a permanent error, also on hdisk0. The procedures for replacing hdisk0 and hdisk1 (both part of rootvg) are slightly different. See the steps below.
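As a rough illustration of the check described above, the sketch below counts temporary (T) and permanent (P) entries per resource from errpt-style output read on standard input. The function name count_disk_errors is made up for this example, and it assumes the default summary layout shown above (error type in column 3, resource name in column 5).

```shell
# Hypothetical helper (not an AIX command): count T and P entries per resource.
# Assumes the default errpt summary layout:
#   identifier, timestamp, type, class, resource name, description
count_disk_errors() {
    awk '$1 != "IDENTIFIER" && ($3 == "T" || $3 == "P") { n[$5 " " $3]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# The two sample lines from the output above. On a live system you would
# pipe the real report instead:  errpt | count_disk_errors
sample_errpt='1581762B 0727203502 T H hdisk0 DISK OPERATION ERROR
1581762B 0727203502 P H hdisk0 DISK OPERATION ERROR'

printf '%s\n' "$sample_errpt" | count_disk_errors
```

A growing T count for one disk, or any P entry, is the signal discussed above that the disk should be planned for replacement.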
To check for stale partitions, run the command: lsvg -l rootvg
rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/stale     N/A
hd4        jfs       4     8     2     open/stale     /
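On a volume group with many logical volumes it can help to filter the listing. The following is a minimal sketch, assuming the usual lsvg -l layout in which the LV state is the sixth column; list_stale_lvs is a hypothetical name, and the sample data is the listing above.

```shell
# Hypothetical helper: print the names of logical volumes whose state
# contains "stale". Reads "lsvg -l <vg>" style output on standard input.
list_stale_lvs() {
    awk '$6 ~ /stale/ { print $1 }'
}

# Sample taken from the listing above; on a live system:
#   lsvg -l rootvg | list_stale_lvs
sample_stale='rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/stale     N/A
hd4        jfs       4     8     2     open/stale     /'

printf '%s\n' "$sample_stale" | list_stale_lvs
```

With the sample above this prints hd8 and hd4, the two logical volumes shown as open/stale.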
Steps for replacing faulty disks in other volume groups are much simpler than replacing disks in rootvg. I have written a procedure for this below also.
For procedures on replacing a faulty SSA disk, refer to the link
Replacing hdisk0 in rootvg
Change bootlist
bosboot -a -d hdisk1    Make sure hdisk1 has a boot image
bootlist -m normal hdisk1 hdisk0    Change the bootlist so the system will use hdisk1 before hdisk0
Removing Primary Dump Device
sysdumpdev -l    In this example the primary dump device (pdumplv) is on hdisk0, so it needs to be changed
primary              /dev/pdumplv
secondary            /dev/sdumplv
copy directory       /var/adm/dump
forced copy flag     FALSE
always allow dump    TRUE
dump compression     ON
sysdumpdev -Pp /dev/hd6    Changes the primary dump device
primary              /dev/hd6
secondary            /dev/sdumplv
copy directory       /var/adm/dump
forced copy flag     FALSE
always allow dump    TRUE
dump compression     ON
rmlv pdumplv    Remove the logical volume pdumplv, the old primary dump device
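Before removing pdumplv it may be worth confirming that the primary dump device really has moved. The sketch below reads sysdumpdev -l style output and prints the primary device; primary_dump_device is a hypothetical helper name and the sample is the second listing above.

```shell
# Hypothetical helper: print the primary dump device from "sysdumpdev -l" output.
primary_dump_device() {
    awk '$1 == "primary" { print $2 }'
}

# Sample taken from the listing above (after sysdumpdev -Pp /dev/hd6).
# On a live system:  sysdumpdev -l | primary_dump_device
sample_sysdumpdev='primary              /dev/hd6
secondary            /dev/sdumplv
copy directory       /var/adm/dump
forced copy flag     FALSE
always allow dump    TRUE
dump compression     ON'

current=$(printf '%s\n' "$sample_sysdumpdev" | primary_dump_device)
if [ "$current" = "/dev/pdumplv" ]; then
    echo "primary dump device is still /dev/pdumplv, do not remove it yet"
else
    echo "primary dump device is $current, pdumplv is no longer the primary"
fi
```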
Un-Mirroring Hard Disk from VG
Now you need to un-mirror the volume group so the disk can be removed. There are two ways to do this: at the disk level, or at the logical volume level. The outcome is the same with both, but the second gives you more control.
Method One
unmirrorvg rootvg hdisk0    Unmirrors the disk
NB: Sometimes this is unstable, especially if you have stale partitions. I have also noticed that if pdumplv is mirrored (it shouldn't be by default), this command will fail. In this instance, unmirror that logical volume first and then run the unmirrorvg command, or follow the method below.
Method Two
lsvg -l rootvg    Lists all logical volumes in rootvg
rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/syncd     N/A
hd4        jfs       4     8     2     open/syncd     /
rmlvcopy LVNAME 1 hdisk0    Run this command for each logical volume
e.g: rmlvcopy hd5 1 hdisk0
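Typing one rmlvcopy per logical volume gets tedious on a larger rootvg. Here is a dry-run sketch that only prints the commands so they can be reviewed before anything is run. The helper name print_rmlvcopy_cmds is hypothetical, and it assumes that the first two lines of lsvg -l output are the volume group name and the column header, and that an LV is mirrored when PPs is exactly twice LPs.

```shell
# Hypothetical dry-run helper: print (do not execute) one rmlvcopy command
# per mirrored logical volume. $1 is the disk whose copy should be removed.
print_rmlvcopy_cmds() {
    awk -v disk="$1" 'NR > 2 && $4 == 2 * $3 { print "rmlvcopy", $1, 1, disk }'
}

# Sample taken from the listing above; on a live system:
#   lsvg -l rootvg | print_rmlvcopy_cmds hdisk0
sample_mirrored='rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/syncd     N/A
hd4        jfs       4     8     2     open/syncd     /'

printf '%s\n' "$sample_mirrored" | print_rmlvcopy_cmds hdisk0
```

Printing rather than executing is a deliberate choice here: the list can be checked against the lsvg output before any copy is removed.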
Check the disk has been unmirrored with: lsvg -l rootvg. For each LV, the PVs column will show 1
rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     1     closed/syncd   N/A
hd6        paging    64    128   1     open/syncd     N/A
hd8        jfslog    1     2     1     open/syncd     N/A
hd4        jfs       4     8     1     open/syncd     /
Make a note of the SCSI id and serial number, which will make the CE's life easier when he has to remove the disk. In the example below, the SCSI id is 8 and the serial number is 4DFJY156. The command you need to run is: lscfg -vl hdisk0
DEVICE            LOCATION          DESCRIPTION
hdisk0            10-88-00-8,0      16 Bit LVD SCSI Disk Drive (9100 MB)
Manufacturer................IBM
Machine Type and Model......DDYS-T09170M
FRU Number..................00P1517
ROS Level and ID............53394841
Serial Number...............4DFJY156
EC Level....................F79924
Part Number.................07N3852
Device Specific.(Z0)........000003029F00013A
Device Specific.(Z1)........07N4925
Device Specific.(Z2)........0933
Device Specific.(Z3)........00315
Device Specific.(Z4)........0001
Device Specific.(Z5)........22
Device Specific.(Z6)........F79924
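To avoid copying the two values by hand, something like the following could pull them out of the lscfg output. This is only a sketch: disk_ident is a hypothetical name, and it assumes the SCSI id is the number just before the comma in the last part of the location code, as in the example above.

```shell
# Hypothetical helper: print the SCSI id and serial number found in
# "lscfg -vl hdiskN" style output read from standard input.
disk_ident() {
    awk '
        $1 ~ /^hdisk/ {
            n = split($2, loc, "-")    # last part of the location code, e.g. "8,0"
            split(loc[n], id, ",")
            print "scsi_id=" id[1]
        }
        /Serial Number/ {
            sub(/^.*\.\./, "")         # drop everything up to the last run of dots
            print "serial=" $0
        }
    '
}

# Sample lines taken from the output above; on a live system:
#   lscfg -vl hdisk0 | disk_ident
sample_lscfg='  hdisk0            10-88-00-8,0      16 Bit LVD SCSI Disk Drive (9100 MB)
        Manufacturer................IBM
        Serial Number...............4DFJY156'

printf '%s\n' "$sample_lscfg" | disk_ident
```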
Remove the Disk from VG
reducevg rootvg hdisk0    Remove hdisk0 from the volume group
rmdev -l hdisk0 -d    Remove the definition of hdisk0 from the system
lsvg rootvg    Ensure the disk is removed
lspv hdisk0    Ensure the disk is removed
Now remove the disk physically and add the new disk.
Add the New Disk to the System
cfgmgr    Run the configuration manager to add the new disk to the system
diag    Then go into diagnostics to update the system log so the system is aware that hdisk0 has been replaced
Task Selection ->
Log Repair Action ->
hdisk0
Esc 0    To exit diagnostics after Log Repair Action has completed
errpt | more    Check that the Log Repair Action has taken place. You should see an entry like:
2F3E09A4 0819110902 I H hdisk2 REPAIR ACTION
diag    Go back into diagnostics and certify the disk. This will indicate whether the new disk is OK
Task Selection ->
Certify the disk ->
hdisk0    Commit the changes and exit by pressing F3
Esc 0    To exit diagnostics after certifying the new disk
Add disk into the Volume Group
extendvg rootvg hdisk0    Add the disk into the volume group rootvg
Now you need to re-mirror the disk. Again, you can mirror at the disk level or at the logical volume level.
Re-Mirroring Hard Disk
Method One
mirrorvg rootvg hdisk0    Mirrors the disk
syncvg -v rootvg    Synchronizes the volume group and the data contained within it
NB: This method will also mirror the logical volume pdumplv. Unmirror that logical volume with:
rmlvcopy pdumplv 1 hdisk1
Method Two
lsvg -l rootvg    Lists all the logical volumes to re-mirror
mklvcopy -k LVNAME 2 hdisk0    Run this command for each logical volume. The -k flag also synchronizes the data
e.g: mklvcopy -k hd5 2 hdisk0
NB: Do not mirror the logical volume pdumplv
syncvg -v rootvg    Synchronizes the volume group and the data contained within it
lsvg -l rootvg    Check rootvg has been mirrored and the status is open/syncd
Check the volume group has been completely re-mirrored with: lsvg -l rootvg. The PVs column should show 2 for each LV apart from pdumplv and sdumplv
rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/syncd     N/A
hd4        jfs       4     8     2     open/syncd     /
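The check described above can be sketched as a small filter. It assumes the same lsvg -l layout as before (PVs in column 5, state in column 6, two header lines) and skips any LV whose name ends in dumplv; unmirrored_lvs is a hypothetical name.

```shell
# Hypothetical helper: list LVs that are not yet on 2 PVs or not yet synced.
# Exit status is 0 when nothing is listed, 1 otherwise.
unmirrored_lvs() {
    awk 'BEGIN { bad = 0 }
         NR > 2 && $1 !~ /dumplv$/ && ($5 != 2 || $6 !~ /syncd$/) { print $1; bad = 1 }
         END { exit bad }'
}

# Sample taken from the listing above; on a live system:
#   lsvg -l rootvg | unmirrored_lvs
sample_remirrored='rootvg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
hd5        boot      1     2     2     closed/syncd   N/A
hd6        paging    64    128   2     open/syncd     N/A
hd8        jfslog    1     2     2     open/syncd     N/A
hd4        jfs       4     8     2     open/syncd     /'

if printf '%s\n' "$sample_remirrored" | unmirrored_lvs; then
    echo "rootvg looks fully mirrored and synced"
else
    echo "the logical volumes listed above still need attention"
fi
```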
mklv -y 'pdumplv' rootvg 40 hdisk0    Re-create the logical volume for your primary dump device
sysdumpdev -Pp /dev/pdumplv    Re-allocate your primary dump device
primary              /dev/pdumplv
secondary            /dev/sdumplv
copy directory       /var/adm/dump
forced copy flag     FALSE
always allow dump    TRUE
dump compression     ON
bosboot -a -d hdisk0    Update the boot image on hdisk0
bootlist -m normal hdisk0 hdisk1    Change your boot list back
Resetting an unknown root password in AIX
The following procedure requires some system downtime.
a) Insert the product media for the same version and level as the current installation into the appropriate drive.
b) Power on the machine.
c) When the screen of icons appears, or when you hear a double beep, press the F1 key repeatedly until the System Management Services menu appears.
d) Select Boot options.
e) Select Install/boot device.
f) Select the device that holds the product media (CD/DVD) and then select Install.
g) Select the media type of the installation device (SCSI, IDE, etc.).
h) Select the device from the list shown.
i) Select Normal boot mode.
j) Exit SMS mode.
k) The system boots from the media.
l) Define your current system as the system console by pressing the F1 key and then Enter.
m) Select the number of your preferred language and press Enter.
n) Choose Start Maintenance Mode for System Recovery by typing 3 and pressing Enter.
o) Select Access a Root Volume Group. A message displays explaining that you will not be able to return to the Installation menus without rebooting if you change the root volume group at this point. Type 0 and press Enter.
p) Type the number of the appropriate volume group from the list and press Enter.
q) Select Access this Volume Group and start a shell by typing 1 and pressing Enter.
r) At the # (number sign) prompt, type the passwd command to reset the root password. For example:
# passwd
Changing password for "root"
root's New password:
Enter the new password again:
s) To write everything from the buffer to the hard disk and reboot the system, type the following:
sync;sync;sync;reboot
*****************************************************************************
Switching between 32-bit & 64-bit modes in AIX
SWITCHING FROM 32-BIT TO 64-BIT MODE
----------------------------------------------------------------------------------------------
To switch from 32-bit mode to 64-bit mode, run the following commands in the given order:
1. ln -sf /usr/lib/boot/unix_64 /unix
2. ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
3. smitty load64bit
4. Select Enable/Disable at System Restart
5. Choose Yes and press ENTER.
6. Quit smitty.
7. bosboot -ad /dev/ipldevice
8. shutdown -Fr
9. bootinfo -K (should now show 64)
===============================================================
SWITCHING FROM 64-BIT TO 32-BIT MODE
-----------------------------------------------------------------------------------------------
To switch from 64-bit mode to 32-bit mode run the following commands,
in the given order:
1. ln -sf /usr/lib/boot/unix_mp /unix
2. ln -sf /usr/lib/boot/unix_mp /usr/lib/boot/unix
3. smitty load64bit
4. Select Enable/Disable at System Restart
5. Choose No and press ENTER.
6. Quit smitty.
7. bosboot -ad /dev/ipldevice
8. shutdown -Fr
9. bootinfo -K (should now show 32)
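The two procedures differ only in the kernel file that is linked and the answer given in smitty. As a memory aid, the sketch below prints the command sequence for the requested mode without running anything. The function name is hypothetical, and the smitty load64bit step stays interactive, so it is printed as a reminder only.

```shell
# Hypothetical dry-run helper: print the commands from the steps above
# for the requested kernel mode (32 or 64). Nothing is executed.
print_kernel_switch_cmds() {
    case "$1" in
        64) kernel=/usr/lib/boot/unix_64; answer=Yes ;;
        32) kernel=/usr/lib/boot/unix_mp; answer=No ;;
        *)  echo "usage: print_kernel_switch_cmds 32|64" >&2; return 1 ;;
    esac
    echo "ln -sf $kernel /unix"
    echo "ln -sf $kernel /usr/lib/boot/unix"
    echo "smitty load64bit    # interactive: Enable/Disable at System Restart, choose $answer"
    echo "bosboot -ad /dev/ipldevice"
    echo "shutdown -Fr"
    echo "bootinfo -K         # after the reboot, should show $1"
}

print_kernel_switch_cmds 64
```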
Paging Space in AIX
System paging space
# lsps -a    Shows the current utilization of each of the paging spaces on a system
# lsps -s    Shows the total active paging space and its current utilization
swapon    Used to activate the initial paging-space device
swapoff    Dynamically deactivates a paging space
swapon -a    Activates all paging spaces listed in /etc/swapspaces
Creating a paging LV
a) smit mklv
eg: Name    paging03
    On      hdisk5
    VG      rootvg
b) swapon /dev/paging03    To activate the paging space
Increasing and decreasing a paging LV
Use the chps -d command to decrease the size of paging03 by 2 logical partitions:
# chps -d 2 paging03
To increase it by 2 logical partitions, use chps -s:
# chps -s 2 paging03
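Whether a paging space needs to grow usually comes down to the percentage reported by lsps -s. The sketch below extracts that number and compares it with a threshold. The sample output is illustrative only (it is not taken from this post), paging_used_percent is a hypothetical name, and the 70 percent threshold is an arbitrary example value.

```shell
# Hypothetical helper: print the "Percent Used" value from "lsps -s" output,
# assuming one header line followed by one data line.
paging_used_percent() {
    awk 'NR == 2 { sub(/%/, "", $2); print $2 }'
}

# Illustrative sample in the usual "lsps -s" layout; on a live system:
#   lsps -s | paging_used_percent
sample_lsps='Total Paging Space   Percent Used
      512MB               1%'

used=$(printf '%s\n' "$sample_lsps" | paging_used_percent)
threshold=70    # arbitrary example threshold, adjust to taste

if [ "$used" -ge "$threshold" ]; then
    echo "paging space usage is ${used}%, consider chps -s"
else
    echo "paging space usage is ${used}%, below the ${threshold}% example threshold"
fi
```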
Stopping and restarting the errdemon in AIX
Here are the commands related to the error daemon (errdemon) in AIX.
To stop logging, run the command below:
# /usr/lib/errstop
To get rid of the log:
# rm /var/adm/ras/errlog
To restart the daemon, thus creating a new error log:
# /usr/lib/errdemon
To determine the path to your system's error log file, run the following command:
# /usr/lib/errdemon -l
Error Log Attributes
---------------------------------------------
Log File /var/adm/ras/errlog
Log Size 1048576 bytes
Memory Buffer Size 32768 bytes
Duplicate Removal true
Duplicate Interval 10000 milliseconds
Duplicate Error Maximum 1000
# errpt -a    Displays a detailed report of all the errors
# errclear    Deletes entries from the error log
# errinstall    Installs messages in the error logging message sets
# errupdate    Updates the Error Record Template Repository
/usr/adm/ras/errlog
# errpt -A -j identifier    Checks the error log for a specific identifier
# errclear 0    Clears the error log
# /usr/lib/errstop    Stops the error daemon
# /usr/lib/errdemon    Starts the error daemon
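If the log path is needed in a script, for example to back the file up before removing it, it can be read from the errdemon -l output rather than hard-coded. A minimal sketch, with errlog_path as a hypothetical helper name and the sample taken from the output shown earlier:

```shell
# Hypothetical helper: print the error log path from "errdemon -l" output.
errlog_path() {
    awk '$1 == "Log" && $2 == "File" { print $3 }'
}

# Sample taken from the output above; on a live system:
#   /usr/lib/errdemon -l | errlog_path
sample_errdemon='Error Log Attributes
---------------------------------------------
Log File                /var/adm/ras/errlog
Log Size                1048576 bytes
Memory Buffer Size      32768 bytes'

printf '%s\n' "$sample_errdemon" | errlog_path
```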
How to mount CD-ROM in AIX
Steps to mount a CD-ROM in AIX
1. Insert the CD-ROM into the CD-ROM drive.
2. Log in as user root, or type su - root to log in using the root profile.
3. Create a /cdrom directory by entering mkdir /cdrom.
4. Enter smit to add a CD-ROM file system.
5. Select System Storage Management (Physical & Logical Storage) -> File Systems -> Add/Change/Show/Delete File Systems -> CDROM File Systems -> Add a CDROM File System.
6. Select a device name, such as cd0. CD-ROM file system device names must be unique.
7. Type /cdrom at the Mount Point prompt.
8. Select OK, or press Enter if using the smit ASCII interface, then return to the previous smit level, System Storage Management (Physical & Logical Storage).
9. Select File Systems -> Mount a File System.
10. For file system name, select /dev/cd0.
11. For directory over which to mount, select /cdrom.
12. For type of file system, select cdrfs.
13. For Mount as a READ-ONLY system, select Yes.
14. Select OK, or press Enter if using the smit ASCII interface.
15. Exit smit.
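For a one-off mount, the smit steps above can also be done from the command line. The sketch below only prints the commands so they can be checked first; the function name is hypothetical, and the defaults (cd0, /cdrom, file system type cdrfs, read-only) are the values used in the steps above.

```shell
# Hypothetical dry-run helper: print the commands for a read-only cdrfs mount.
# $1 = CD-ROM device name (default cd0), $2 = mount point (default /cdrom)
print_cdrom_mount_cmds() {
    dev=${1:-cd0}
    dir=${2:-/cdrom}
    echo "mkdir -p $dir"
    echo "mount -v cdrfs -o ro /dev/$dev $dir"
}

print_cdrom_mount_cmds cd0 /cdrom
```

Unlike the smit route, this does not define a permanent CD-ROM file system entry; it only covers the mount itself.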
Steps for FC Adapter Firmware update
The steps to follow for an FC adapter firmware update in AIX are shown below.
1) To check how many FC adapters there are:
pcmpath query adapter (for SDDPCM) or datapath query adapter (for SDD device drivers)
2) To check the firmware level of an FC adapter:
lscfg -vpl fcs0
Check the Z9 field in the output; in this example it is TS1.91A5.
3) To find out the latest firmware level for the FC adapter.
The URL for the firmware for the above part number is
http://support.bull.com/ols/product/platforms/escala/firmware/dev/fc/files/lp10000/df1000fa.191a5.html
Check the following information on that page:
Check whether the adapter's part number is listed as affected.
Check the firmware level. In this example the server already has the latest firmware level.
4) If the LPAR has an older firmware level and more than one FC adapter, follow the procedure below.
Download the latest firmware and upload it to the LPAR.
a) Unpack the file as follows:
# cd /tmp/ucode
# zcat df1000fa.191a5.tar.Z | tar xvf -
b) Move the microcode to the microcode directory.
# mv /tmp/ucode/df1000fa.191105 /etc/microcode
5) Updating firmware for FC adapters
pcmpath query adapter
Active Adapters :2
Adpt#   Name     State    Mode      Select   Errors   Paths   Active
    0   fscsi1   NORMAL   ACTIVE     17109        0      20       20
    1   fscsi0   NORMAL   ACTIVE     17238        0      20       20
To update the firmware on adapter fcs1, first take it offline (fscsi1 is adapter 0 in the listing above):
pcmpath set adapter 0 offline
After setting it offline, the output looks like this:
pcmpath query adapter
Active Adapters :2
Adpt#   Name     State    Mode      Select   Errors   Paths   Active
    0   fscsi1   FAILED   OFFLINE    17124        0      20        0
    1   fscsi0   NORMAL   ACTIVE     17536        0      20       20
pcmpath query device
DEV#: 2 DEVICE NAME: hdisk2 TYPE: 2107900 ALGORITHM: Load Balance
SERIAL: 75208013605
==========================================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi1/path0 OPEN OFFLINE 188 0
1 fscsi1/path1 OPEN OFFLINE 179 0
2 fscsi0/path2 OPEN NORMAL 181 0
3 fscsi0/path3 OPEN NORMAL 178 0
DEV#: 3 DEVICE NAME: hdisk3 TYPE: 2107900 ALGORITHM: Load Balance
SERIAL: 75208013704
==========================================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi1/path0 OPEN OFFLINE 375 0
1 fscsi1/path1 OPEN OFFLINE 444 0
2 fscsi0/path2 OPEN NORMAL 421 0
3 fscsi0/path3 OPEN NORMAL 440 0
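Taking an adapter offline is only safe while the other adapter still carries working paths. The sketch below checks that from pcmpath query adapter output before the offline command is issued. It is an illustration only: safe_to_offline is a hypothetical name, and "healthy" is assumed to mean State NORMAL, Mode ACTIVE and a non-zero Active count, as in the listings above.

```shell
# Hypothetical helper: succeed only if every adapter other than the one
# about to go offline is NORMAL/ACTIVE with at least one active path.
# $1 = Adpt# of the adapter to be taken offline.
safe_to_offline() {
    awk -v target="$1" '
        $1 ~ /^[0-9]+$/ && NF == 8 && $1 != target {
            others++
            if ($3 == "NORMAL" && $4 == "ACTIVE" && $8 > 0) healthy++
        }
        END { exit ((others > 0 && healthy == others) ? 0 : 1) }
    '
}

# Sample taken from the first listing above; on a live system:
#   pcmpath query adapter | safe_to_offline 0
sample_adapters='Active Adapters :2
Adpt# Name State Mode Select Errors Paths Active
0 fscsi1 NORMAL ACTIVE 17109 0 20 20
1 fscsi0 NORMAL ACTIVE 17238 0 20 20'

if printf '%s\n' "$sample_adapters" | safe_to_offline 0; then
    echo "other adapter is healthy, adapter 0 can be set offline"
else
    echo "do not set adapter 0 offline, the other adapter is not healthy"
fi
```

The same check applies before step 10: adapter 1 should only go offline once adapter 0 is back to NORMAL and ACTIVE.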
6) Now go to /etc/microcode, check that the microcode file is there, then run diag.
cd /etc/microcode
ls -l
-rwxr-xr-x 1 10384 10000 664748 Jul 27 2006 df1000fa.191105
diag    Run this at the command prompt, then:
Press Enter
Select Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.)
Select Microcode Tasks
Select Download Latest Available Microcode
Select fcs1, the adapter that was taken offline in step 5, and press Enter
When the installation of the new microcode for fcs1 has completed, press F10
7) After installation, bring the fcs1 adapter back online (the reverse of the offline command in step 5):
pcmpath set adapter 0 online
8) Check whether the paths are back to normal by running pcmpath query adapter and pcmpath query device, as in step 5.
9) Check the new firmware level of the FC adapter by running lscfg -vpl fcs1, as in step 2.
10) Follow the same procedure, from step 5 onwards, for adapter fcs0.
0 fscsi1 NORMAL ACTIVE 17109 0 20 20
1 fscsi0 NORMAL ACTIVE 17238 0 20 20
To update the firmware on adapter fcs0, take it offline (fscsi0 is adapter 1 in the listing):
pcmpath set adapter 1 offline
Working with Crontab
Crontab is the scheduler program used on Unix servers.
Each user can have their own crontab, and though these are files in /var/spool/cron , they are not intended to be edited directly.
If the cron.allow file exists, then you must be listed therein in order to be allowed to use this command.
If the cron.allow file does not exist but the cron.deny file does exist, then you must not be listed in the cron.deny file in order to use this command.
If neither of these files exists, only the super user will be allowed to use this command.
crontab -l    To view the crontab
crontab -e    To edit the crontab
The format of a crontab entry is:
MIN HOUR DOM MON DOW CMD (/script/location/full/path)
minute          0 through 59
hour            0 through 23
day_of_month    1 through 31
month           1 through 12
weekday         0 through 6 for Sunday through Saturday
Example :
30 08 10 06 * /home/xyz/full-backup
- 30 – 30th Minute
- 08 – 08 AM
- 10 – 10th Day
- 06 – 6th Month (June)
- * – Every day of the week
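To make the field order concrete, here is a small sketch that splits an entry into the five time fields and the command, using the example entry above. describe_cron_entry is a hypothetical helper, not part of cron.

```shell
# Hypothetical helper: label the six parts of a crontab entry.
# "read" splits on whitespace and does not expand the "*" fields.
describe_cron_entry() {
    printf '%s\n' "$1" | {
        read -r min hour dom mon dow cmd
        printf 'minute=%s hour=%s day_of_month=%s month=%s weekday=%s command=%s\n' \
            "$min" "$hour" "$dom" "$mon" "$dow" "$cmd"
    }
}

describe_cron_entry '30 08 10 06 * /home/xyz/full-backup'
```

Everything after the fifth field is treated as the command, so a command with arguments stays in one piece.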
Editing crontab
# crontab -l > crontab.current    Back up the current crontab to a file
# vi crontab.current    Make the changes in the backup file
# crontab crontab.current    Load the modified file as the new crontab
UNIX History
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, Michael Lesk and Joe Ossanna. The Unix operating system was first developed in assembly language, but by 1973 had been almost entirely recoded in C, greatly facilitating its further development and porting to other hardware. Today's Unix system evolution is split into various branches, developed over time by AT&T as well as various commercial vendors, universities such as University of California, Berkeley's BSD and non-profit organizations.
During the late 1970s and early 1980s, the influence of Unix in academic circles led to large-scale adoption of Unix by commercial vendors. The most notable resulting systems are Solaris, HP-UX and AIX, as well as Darwin, which forms the core set of components upon which Apple's OS X, Apple TV, and iOS are based. Today, in addition to certified Unix systems such as those already mentioned, Unix-like operating systems such as MINIX, Linux, Android, and BSD descendants (FreeBSD, NetBSD, OpenBSD, and DragonFly BSD) are commonly encountered. The term traditional Unix may be used to describe an operating system that has the characteristics of either Version 7 Unix or UNIX System V.
Several companies developed their own Unix flavor. The details are:
IBM - AIX
SUN - Solaris (now owned by Oracle)
HP - HP-UX
RedHat - Red Hat Enterprise Linux (RHEL)
These are the most widely used Unix flavors on servers.
They are built on different architectures.