Cloud DevOps Admin Guide

SRC (System Resource Controller) in AIX

One of the unique facilities available on AIX is System Resource Controller (SRC). The SRC gives a set of commands that make it very easy for the system administrator to maintain and manage the subsystems and subservers running on your AIX system.

What is a subsystem?
A subsystem is a set of related programs designed to perform one particular function. A subsystem can be subdivided into subservers (daemons).
SRC helps you to manage the whole subsystems and their respective subservers by creating subsystem groups. You can use SRC related commands to start/stop/refresh the subsystems and subservers.
For example, there is a subsystem group called tcpip. Under the tcpip group there is a subsystem called inetd, and under inetd there is a subserver called telnet.
So to work with either the group called tcpip or the subsystem called inetd or the subserver called telnet, you need to use the SRC set of commands.
In other words, the SRC (System Resource Controller) is a process manager that is used to spawn, monitor and control services. Many of the standard Unix daemons are managed via this interface on AIX.
SRC does not have a persistent "service profile" and therefore has no notion of persistence beyond the current boot. To change whether a service starts at boot, find where the service is started and add or remove the startsrc (service start) command there. The most common locations are /etc/rc.tcpip and /etc/inittab.

SRC-controlled processes must be started and stopped via the SRC interface. If an SRC process dies or is killed, the srcmstr daemon will re-spawn that process and log an error to the system error log.

The core process for SRC (srcmstr) is spawned from /etc/inittab. Services that run under SRC control do not leave their process group (i.e. they do not end up with a PPID of 1), but instead stay children of srcmstr.

List the status of the cdromd service
lssrc -s cdromd
List the status of inetd subservices
lssrc -l -s inetd
List the status of all members of the NFS group
lssrc -g nfs

Start the cdromd service
startsrc -s cdromd
››› There is no persistent flag for the startsrc command.
For this service to start automatically on the next boot, a change must be made to one of the system initialization files.
In this case, an entry must be made in /etc/inittab.

Stop the cdromd service
stopsrc -s cdromd
Send a refresh request to the syslogd service
refresh -s syslogd
››› This would typically be communicated via a HUP signal.
Not all SRC-controlled processes respond to a refresh request; some require a HUP signal instead.

AIX Evolution – Over Twenty years of Progress

I have attached here the picture showing the AIX evolution over 20 years of progress.

Basic Linux Commands

mkdir - make directories

Usage: mkdir [OPTION]... DIRECTORY...

Options

Create the DIRECTORY(ies), if they do not already exist.

Mandatory arguments to long options are mandatory for short options too.

-m, --mode=MODE set permission mode (as in chmod), not rwxrwxrwx - umask

-p, --parents no error if existing, make parent directories as needed

-v, --verbose print a message for each created directory

--help display this help and exit

--version output version information and exit
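
As a small illustration of combining -p and -m, the sketch below creates a nested path in a scratch directory and prints the permission bits of the last directory. The path projects/alpha/logs is a made-up example. With GNU mkdir the -m mode is applied to the final directory only, while the parent directories get the usual umask-based mode; other implementations may differ.

```shell
#!/bin/sh
# Sketch: create a nested directory tree in one step and inspect the result.
# "projects/alpha/logs" is a hypothetical path used only for illustration.
mkdir_demo() {
    base=$(mktemp -d) || return 1
    mkdir -p -m 750 "$base/projects/alpha/logs"
    # Because of -p, running the same command again is not an error.
    mkdir -p -m 750 "$base/projects/alpha/logs" || return 1
    # The first 10 characters of "ls -ld" are the type and permission bits.
    ls -ld "$base/projects/alpha/logs" | cut -c1-10
    rm -rf "$base"
}

mkdir_demo
```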

cd - change directories

Use cd to change directories. Type cd followed by the name of a directory to access that directory. Keep in mind that you are always in a directory and can navigate to directories hierarchically above or below.

mv - change the name of a directory

Type mv followed by the current name of a directory and the new name of the directory.

Ex: mv testdir newnamedir

pwd - print working directory

pwd shows the full path to the directory you are currently in. This is very handy, especially when performing some of the other commands on this page.

rmdir - Remove an existing empty directory

rm -r - Removes directories and files within the directories recursively.

rm -rf - Forcefully removes directories and files within the directories recursively

chown - change file owner and group

Usage

chown [OPTION] OWNER[:[GROUP]] FILE

chown [OPTION] :GROUP FILE

chown [OPTION] --reference=RFILE FILE

Options

Change the owner and/or group of each FILE to OWNER and/or GROUP. With --reference, change the owner and group of each FILE to those of RFILE.

-c, --changes like verbose but report only when a change is made

--dereference affect the referent of each symbolic link, rather than the symbolic link itself

-h, --no-dereference affect each symbolic link instead of any referenced file (useful only on systems that can change the ownership of a symlink)

--from=CURRENT_OWNER:CURRENT_GROUP

change the owner and/or group of each file only if its current owner and/or group match those specified here. Either may be omitted, in which case a match is not required for the omitted attribute.

--no-preserve-root do not treat `/' specially (the default)

--preserve-root fail to operate recursively on `/'

-f, --silent, --quiet suppress most error messages

--reference=RFILE use RFILE's owner and group rather than specifying OWNER:GROUP values

-R, --recursive operate on files and directories recursively

-v, --verbose output a diagnostic for every file processed

The following options modify how a hierarchy is traversed when the -R option is also specified. If more than one is specified, only the final one takes effect.

-H if a command line argument is a symbolic link to a directory, traverse it

-L traverse every symbolic link to a directory encountered

-P do not traverse any symbolic links (default)

chmod - change file access permissions

Usage

chmod [-R] permissions filenames

-R Change the permissions on files that are in the subdirectories of the directory that you are currently in.

permissions Specifies the rights that are being granted. Below are the different rights that you can grant, in alphanumeric format.

filenames File or directory that you are associating the rights with.

Permissions

u - User who owns the file.

g - Group that owns the file.

o - Other.

a - All.

r - Read the file.

w - Write or edit the file.

x - Execute or run the file as a program.

Numeric Permissions:

Permissions can also be given to chmod in numeric form:

400 read by owner

040 read by group

004 read by anybody (other)

200 write by owner

020 write by group

002 write by anybody

100 execute by owner

010 execute by group

001 execute by anybody
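
The numeric values in the table are added together to form one mode. The sketch below is a minimal illustration: it adds 400, 200 and 040 to get 640, applies that mode to a scratch file, and prints the resulting permission string. The file name report.txt is made up for the example.

```shell
#!/bin/sh
# Sketch: build a mode by adding the octal values from the table:
#   400 (read by owner) + 200 (write by owner) + 040 (read by group) = 640
mode_demo() {
    # Shell arithmetic treats a leading 0 as octal; %o prints the sum in octal.
    mode=$(printf '%o' $((0400 + 0200 + 0040)))
    dir=$(mktemp -d) || return 1
    touch "$dir/report.txt"              # hypothetical file name
    chmod "$mode" "$dir/report.txt"
    # The first 10 characters of a long listing are the type and permission bits.
    echo "$mode $(ls -l "$dir/report.txt" | cut -c1-10)"
    rm -rf "$dir"
}

mode_demo
```

The owner gets read and write, the group gets read, and everybody else gets nothing.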

ls - Short listing of directory contents

-a list hidden files

-d list a directory's own entry rather than its contents

-F show directories with a trailing '/'

executable files with a trailing '*'

-g show group ownership of file in long listing

-i print the inode number of each file

-l long listing giving details about files and directories

-R list all subdirectories encountered

-t sort by time modified instead of name

cp - Copy files

cp myfile yourfile
Copy the file "myfile" to the file "yourfile" in the current working directory. This command will create the file "yourfile" if it doesn't exist. It will normally overwrite it without warning if it exists.

cp -i myfile yourfile

With the "-i" option, if the file "yourfile" exists, you will be prompted before it is overwritten.

cp -i /data/myfile .

Copy the file "/data/myfile" to the current working directory and name it "myfile". Prompt before overwriting the file.

cp -dpr srcdir destdir

Copy all files from the directory "srcdir" to the directory "destdir", preserving links (-d option) and file attributes (-p option), and copying recursively (-r option). With these options, a directory and all its contents can be copied to another directory.

ln - Creates a link to a file; with -s, creates a symbolic link.

ln -s test symlink

Creates a symbolic link named symlink that points to the file test. Typing "ls -i test symlink" will show that the two files are different, with different inodes. Typing "ls -l test symlink" will show that symlink points to the file test.
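
The inode check described above can be tried safely in a scratch directory. The sketch below is only an illustration; the hard link named hardlink is an addition for contrast and is not part of the original example.

```shell
#!/bin/sh
# Sketch: compare the inode of a file with a symbolic link and a hard link to it.
# Runs in a subshell so the caller's working directory is left unchanged.
link_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    echo "hello" > test
    ln -s test symlink        # symbolic link, as in the example above
    ln test hardlink          # hard link, added here for contrast
    inode() { ls -i "$1" | awk '{print $1}'; }
    [ "$(inode test)" != "$(inode symlink)" ] && echo "symlink has its own inode"
    [ "$(inode test)" = "$(inode hardlink)" ] && echo "hard link shares the inode"
    # The last field of the long listing is the link target.
    echo "symlink points to: $(ls -l symlink | awk '{print $NF}')"
    cd / && rm -rf "$dir"
)

link_demo
```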

locate - A fast database driven file locator.

more - Allows file contents or piped output to be sent to the screen one page at a time

less - A pager like more, but it also allows moving backward through the file

cat - Sends file contents to standard output. This is a way to list the contents of short files to the screen. It works well with piping.

whereis - Report all known instances of a command

wc - Print byte, word, and line counts

bg

bg jobs - Places the current job (or, by using the alternative form, the specified jobs) in the background, where it continues to run while the user prompt remains available. Use the jobs command to discover the identities of background jobs.

cal month year - Prints a calendar for the specified month of the specified year.

cat files - Prints the contents of the specified files.

clear - Clears the terminal screen.

cmp file1 file2 - Compares two files byte by byte and reports the first difference. Similar to the diff command, though the output format differs.

diff file1 file2 - Compares two files, reporting all discrepancies. Similar to the cmp command, though the output format differs.

dmesg - Prints the messages resulting from the most recent system boot.

fg

fg jobs - Brings the current job (or the specified jobs) to the foreground.

file files - Determines and prints a description of the type of each specified file.

find path -name pattern -print

Searches the specified path for files with names matching the specified pattern (usually enclosed in single quotes) and prints their names. The find command has many other arguments and functions; see the online documentation.

finger users - Prints descriptions of the specified users.

free - Displays the amount of used and free system memory.

ftp hostname

Opens an FTP connection to the specified host, allowing files to be transferred. The FTP program provides subcommands for accomplishing file transfers; see the online documentation.

head files - Prints the first several lines of each specified file.

ispell files - Checks the spelling of the contents of the specified files.

kill process_ids

kill -signal process_ids

kill -l

Kills the specified processes, sends the specified processes the specified signal (given as a number or name), or prints a list of available signals.

killall program

killall -signal program

Kills all processes that are instances of the specified program or sends the specified signal to all processes that are instances of the specified program.

mail - Launches a simple mail client that permits sending and receiving email messages.

man title

man section title - Prints the specified man page.

ping host - Sends an ICMP echo request to the specified host. A response confirms that the host is operational.

reboot - Reboots the system (requires root privileges).

shutdown minutes

shutdown -r minutes

Shuts down the system after the specified number of minutes elapses (requires root privileges). The -r option causes the system to be rebooted once it has shut down.

sleep time - Causes the command interpreter to pause for the specified number of seconds.

sort files - Sorts the specified files. The command has many useful arguments; see the online documentation.

split file - Splits a file into several smaller files. The command has many arguments; see the online documentation.

sync - Completes all pending input/output operations (requires root privileges).

telnet host - Opens a login session on the specified host.

top - Prints a display of system processes that's continually updated until the user presses the q key.

traceroute host - Uses echo requests to determine and print a network path to the host.

uptime - Prints the system uptime.

w - Prints the current system users.

wall - Prints a message to each user except those who've disabled message reception. Type Ctrl-D to end the message.

Set backspace in Unix console

Sometimes the backspace key doesn't work in a Unix console.

You can set up the backspace key to work in the console by typing the following command and then pressing the backspace key:

stty erase <backspace>

It will be shown as
stty erase ^H

If the backspace key is not working, the key press is displayed as ^H, as in the output above.
If it is already working, the key just performs a backspace.

After that, press Enter. Now type anything and try backspace; the backspace key should work.

Sendmail Configuration in AIX

Daemon : sendmail

To start the daemon :

# startsrc -s sendmail -a "-bd -q30m"
where
-bd - Starts sendmail as an SMTP mail relay router (daemon mode)
-q - Sets the interval at which the sendmail daemon processes the saved messages (30 minutes in this example)

To start the daemon automatically after the system boot:

a. # vi /etc/rc.tcpip

b. Uncomment the below line
start /usr/lib/sendmail "$src_running" "-bd -q${qpi}"

To display the status of the daemon :

# lssrc -s sendmail
# ps -ef | grep sendmail

To stop the daemon :

# stopsrc -s sendmail

To make a running daemon restart and re-read its configuration, send it a HUP signal :

# kill -1 `cat /etc/sendmail.pid`

Configuration File:

/etc/sendmail.cf - Where the hostname, Relay server name,... are stored.

Alias File :

/etc/aliases - Where the group(alias) to member mapping is stored.

To Add the hostname in the sendmail configuration :

a. # vi /etc/sendmail.cf

b. Change "#DwYourHostName" to "Dw{hostname of local server}"

c. # refresh -s sendmail

To Add the mail (relay) server in the sendmail configuration :

a. # vi /etc/sendmail.cf

b. Change "#DSrelayhostname" to "DS{hostname of the Relay Server}"

c. # refresh -s sendmail

To send a test mail :

# echo "Test Message" | sendmail -v raja@server1.domain.com

If you add any alias in /etc/aliases file, then do the following

# sendmail -bi

This rebuilds the alias database so that sendmail picks up the changes in the aliases file.

To display the list of messages in the mail queue :

# mailq (or) # sendmail -bp

Directory containing log files and temp files associated with messages in the mail queue :

/var/spool/mqueue

To delete the first 1000 messages in the root's mail queue :

# mail -u root , then enter "d 1-1000"

Using find command

The find command searches a given directory tree for files that match the expression given on the command line. You can also act on the files it finds by piping the output to xargs.

Some important options:

    -xdev             Stay on the same file system (dev in fstab).
    -exec cmd {} \;   Execute the command and replace {} with the full path
    -iname            Like -name but is case insensitive
    -ls               Display information about the file (like ls -la)
    -size n           n is +-n (k M G T P)
    -cmin n           File's status was last changed n minutes ago.

find . -type f ! -perm -444   # Find files not readable by all
find . -type d ! -perm -111   # Find dirs not accessible by all
find /home/user/ -cmin -10 -print   # Files created or modified in the last 10 min.
find . -name '*.[ch]' | xargs grep -E 'expr'   # Search 'expr' in this dir and below.
find / -name "*.core" | xargs rm   # Find core dumps and delete them
find / -name "*.core" -print -exec rm {} \;   # Other syntax
find . \( -iname "*.png" -o -iname "*.jpg" \) -print   # -iname is not case sensitive
find . -type f -name "*.txt" ! -name README.txt -print   # Exclude README.txt files
find /var/ -size +1M -exec ls -lh {} \;   # Find in /var files above 1M and long-list them
find /var/ -size +1M -ls   # Same, using the built-in -ls
find . -size +10M -size -50M -print   # Find files between 10M and 50M
find /usr/ports/ -name work -type d -print -exec rm -rf {} \;   # Clean the ports

Find files with SUID; those files have to be kept secure.
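
The command that goes with the SUID note is not shown in this guide. A minimal sketch of one common way to do it is to match the set-user-ID bit with -perm -4000. The example below runs against a scratch directory with made-up file names; on a real system the search would typically start at / and be run as root.

```shell
#!/bin/sh
# Sketch: list files that have the set-user-ID bit (octal 4000) set.
# The file names are hypothetical and exist only inside a scratch directory.
suid_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/plain" "$dir/setuid_tool"
    chmod 0755 "$dir/plain"
    chmod 4755 "$dir/setuid_tool"
    # -perm -4000 matches any file whose mode includes the SUID bit.
    find "$dir" -type f -perm -4000 -exec basename {} \;
    rm -rf "$dir"
}

suid_demo
```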

Some more Examples:

1. To list all files in the file system with a given base file name, type:
find / -name .profile -print

This searches the entire file system and writes the complete path names of all files named .profile.
The / (slash) tells the find command to search the root directory and all of its subdirectories.
In order not to waste time, it is best to limit the search by specifying the directories where you think the files might be.

2. To list files having a specific permission code in the current directory tree, type:
find . -perm 0600 -print

This lists the names of the files that have only owner-read and owner-write permission. The . (dot) tells the find command to search the current directory and its subdirectories. See the chmod command for an explanation of permission codes.

3. To search several directories for files with certain permission codes, type:
find manual clients proposals -perm -0600 -print

This lists the names of the files that have owner-read and owner-write permission and possibly other permissions. The manual, clients, and proposals directories and their subdirectories are searched. In the previous example, -perm 0600 selects only files with permission codes that match 0600 exactly.
In this example, -perm -0600 selects files with permission codes that allow the accesses indicated by 0600 and other accesses above the 0600 level. This also matches the permission codes 0622 and 2744.
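
The difference between -perm 0600 and -perm -0600 is easy to see on a few scratch files. The sketch below is only an illustration; the file names exact, wider and readonly are made up.

```shell
#!/bin/sh
# Sketch: exact permission match versus "at least these bits" match.
perm_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/exact" "$dir/wider" "$dir/readonly"
    chmod 0600 "$dir/exact"      # exactly owner read/write
    chmod 0644 "$dir/wider"      # owner read/write plus read for others
    chmod 0400 "$dir/readonly"   # owner write is missing
    echo "exact (-perm 0600):"
    find "$dir" -type f -perm 0600 -exec basename {} \; | LC_ALL=C sort
    echo "at least (-perm -0600):"
    find "$dir" -type f -perm -0600 -exec basename {} \; | LC_ALL=C sort
    rm -rf "$dir"
}

perm_demo
```

Only the file named exact matches the first form; exact and wider both match the second, and readonly matches neither.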

4. To list all files in the current directory that have been changed during the current 24-hour period, type:
find . -ctime 1 -print

5. To search for regular files with multiple links, type:
find . -type f -links +1 -print

This lists the names of the ordinary files (-type f) that have more than one link (-links +1). Note: Every directory has at least two links: the entry in its parent directory and its own . (dot) entry. The ln command explains multiple file links.

6. To find all accessible files whose path name contains find, type:
find . -name '*find*' -print

7. To remove all files named a.out or *.o that have not been accessed for a week and that are not mounted using nfs, type:
find / \( -name a.out -o -name '*.o' \) -atime +7 ! -fstype nfs -exec rm {} \;

Note: The number used within the -atime expression is +7. This is the correct entry if you want the command to act on files not accessed for more than a week (seven 24-hour periods).

8. To print the path names of all files in or below the current directory, except the directories named SCCS or files in the SCCS directories, type:
find . -name SCCS -prune -o -print

To print the path names of all files in or below the current directory, including the names of SCCS directories, type:
find . -print -name SCCS -prune
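
The two -prune forms can be compared on a small scratch tree. The sketch below is an illustration only; the src directory and the file names are made up.

```shell
#!/bin/sh
# Sketch: the effect of where -print is placed relative to -prune.
# Runs in a subshell so the caller's working directory is left unchanged.
prune_demo() (
    dir=$(mktemp -d) || exit 1
    mkdir -p "$dir/src/SCCS"
    touch "$dir/src/main.c" "$dir/src/SCCS/s.main.c"
    cd "$dir" || exit 1
    echo "skip SCCS directories entirely:"
    find . -name SCCS -prune -o -print | LC_ALL=C sort
    echo "name SCCS directories but do not descend into them:"
    find . -print -name SCCS -prune | LC_ALL=C sort
    cd / && rm -rf "$dir"
)

prune_demo
```

In both forms the file inside SCCS is never listed; only the second form prints the SCCS directory itself.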

9. To search for all files that are exactly 414 bytes long, type:
find . -size 414c -print

10. To find and remove every file in your home directory with the .c suffix, type:
find /u/arnold -name "*.c" -exec rm {} \;

Every time the find command identifies a file with the .c suffix, the rm command deletes that file. The rm command is the only parameter specified for the -exec expression. The {} (braces) represent the current path name.

11. In this example, dirlink is a symbolic link to the directory dir. You can list the files in dir by referring to the symbolic link dirlink on the command line. To do this, type:
find -H dirlink -print

12. In this example, dirlink is a symbolic link to the directory dir. To list the files in dirlink, traversing the file hierarchy under dir including any symbolic links, type:
find -L dirlink -print

13. To determine whether the file dir1 referred to by the symbolic link dirlink is newer than dir2, type:
find -H dirlink -newer dir2
Note: Because the -H flag is used, time data is collected not from dirlink but instead from dir1, which is found by traversing the symbolic link.

14. To produce a listing of files in the current directory in ls format with expanded user and group name, type:
find . -ls -long

15. To list the files with ACL/EA set in current directory, type:
find . -ea

System dump devices - AIX

Traditionally the default dump device for system dumps was /dev/hd6 (paging space), and it still is on a lot of systems. If there is not enough space to copy the dump file after a crash, the system administrator is prompted upon restart to copy the dump file to removable media, such as a tape or DVD. This can be time-consuming, and sometimes you want to get your system back up quickly. I can sympathise with system administrators who just ignore the prompt to get the system back up due to business pressure, thus deleting the dump, so that one never learns why the system crashed in the first place.

If there is not enough space to copy the dump, the copydumpmenu utility is invoked during the start-up process to give the system administrator the opportunity to copy the dump to removable media, for example a tape device if present. The copydumpmenu utility can also be called from the command line when the system is up. The copy directory is /var/adm/ras by default, and the file name is vmcore.<X>.BZ, where X is a sequence number. The dump file is a BZ (BZIP) file, not a Z compressed file.

The snap command can be used to gather information about the dump file. Be sure to include the -D flag, which gathers the information from the primary dump device.

With systems now having more memory available, this has provided more flexibility as to where the primary dump device could be placed. Typically, for systems with over 4 GB of memory there is now a dedicated dump device, called: lg_dumplv

# lsvg -l rootvg |grep sysdump

lg_dumplv sysdump 8 8 open/syncd N/A

Using the sysdumpdev command, one can determine what devices are used for the system dumps.

The following output shows a system using AIX 7.1 having the lg_dumplv as its primary dump device:
# sysdumpdev -l

primary /dev/lg_dumplv

secondary /dev/sysdumpnull

copy directory /var/adm/ras

forced copy flag TRUE

always allow dump TRUE

dump compression ON

type of dump traditional

Looking more closely at the fields in the above output, notice that an extra field is present from AIX 6.1 onwards: type of dump. It is currently set to traditional; you can set it to fw-assisted (firmware-assisted) if your hardware supports it. For the secondary field, there is no dump device. This is denoted by the sysdumpnull device, which means that any system dump sent to that device is lost. The copy directory is /var/adm/ras; this is where the system dump is copied to, either for further examination or to be copied off and sent to IBM support. Note that 'always allow dump' is set to TRUE; this must be the case if a dump is to be successfully initiated. Dump compression is on by default.
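
When checking these fields in a script, the listing can be read field by field. The sketch below is a hypothetical helper, not part of AIX: dump_field reads "sysdumpdev -l" style text on standard input and prints the value for one label. It is fed the sample listing from above so that it runs anywhere; on a real system the input would come from sysdumpdev -l.

```shell
#!/bin/sh
# Sketch: print the value of one field from "sysdumpdev -l" style output on stdin.
# dump_field is a hypothetical helper name; $1 is the field label.
dump_field() {
    awk -v label="$1" '
        index($0, label) == 1 {
            value = substr($0, length(label) + 1)
            sub(/^[ \t]+/, "", value)    # drop the padding before the value
            print value
            exit
        }'
}

# Sample listing copied from the output shown above.
sample='primary /dev/lg_dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE
dump compression ON
type of dump traditional'

printf '%s\n' "$sample" | dump_field "primary"
printf '%s\n' "$sample" | dump_field "copy directory"

# Warn when the secondary device is the null device.
if [ "$(printf '%s\n' "$sample" | dump_field "secondary")" = "/dev/sysdumpnull" ]; then
    echo "secondary dump device is sysdumpnull: dumps sent there are lost"
fi
```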

Common settings using sysdumpdev are:
To change the primary device use: sysdumpdev -P -p <device_name>
To change the secondary device use: sysdumpdev -P -s <device_name>
To change the copy directory use: sysdumpdev -D <path_name>
To change the always dump condition use: sysdumpdev -k for false, sysdumpdev -K for true
To change the type of dump use: sysdumpdev -t <fw-assisted | traditional>

Few Commands:

1. To view the current dump configuration :

# sysdumpdev -l

primary /dev/hd6
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump FALSE
dump compression OFF

2. To change the primary dump device temporarily :

# sysdumpdev -p /dev/dumplv

3. To change the primary dump device permanently :

# sysdumpdev -P -p /dev/dumplv

4. To change the secondary dump device temporarily :

# sysdumpdev -s /dev/dumplv

5. To change the secondary dump device permanently :

# sysdumpdev -P -s /dev/dumplv

6. To set the always allow dump flag to TRUE :

# sysdumpdev -K

7. To set the always allow dump flag to FALSE :

# sysdumpdev -k

8. To estimate the dump size :

# sysdumpdev -e

9. To list the last dump information :

# sysdumpdev -L
Device name: /dev/lg_dumplv
Major device number: 12
Minor device number: 4
Size: 42123543 bytes
Date/Time: Wed Jan 01 12:03:00 CDT 2009
Dump status: 0
dump completed successfully
Dump copy filename: /var/adm/ras/vmcore.1

10. To copy the saved vmcore.n file to tape :

# snap -gfkD -o /dev/rmt0

11. To read the dump file :

# crash dump unix
>

12. To change the dump file location so that, if the copy fails, the system prompts for external media to copy the dump to :

# sysdumpdev -D /opt/dumpfiles

13. To change the dump file location so that, if the copy fails, the system dump is ignored :

# sysdumpdev -d /opt/dumpfiles

14. To specify that dumps should not be compressed :

# sysdumpdev -c

15. To specify that dumps should always be compressed :

# sysdumpdev -C

16. To find out whether a new system dump has occurred before the last reboot :

# sysdumpdev -z

The compressed dump is now on the LV lg_dumplv. The dump was not copied across to the copy directory when issuing a user-initiated dump. To copy the most recent system dump from a system dump device to a directory, use the savecore command. For example, to copy the dump to the directory /var/adm/ras, I could use:

# savecore -d /var/adm/ras
vmcore.0.BZ

If you need to uncompress the file use the dmpuncompress utility. The format of the command is:

dmpuncompress <filename>

After uncompressing, the dump file is now ready for further investigation using kdb or for transfer to IBM support.

# dmpuncompress vmcore.0.BZ
replaced with vmcore.0

Alternatively, you can use the smit dump menu and select Copy a system dump. The following screen displays:

                              Copy dump image to:

Type or select values in entry fields.
Press Enter after making all desired changes.

                                                        [Entry Fields]
* Copy dump image from:                              [/dev/lg_dumplv]         /
* Copy dump image to:                                [/var/adm/ras/dump_fil>
* Input and output file blocksize for copy           [4096]                   #
  Size in bytes of dump image                         63894528
  Date of last dump                                   Thu Oct 27 18-02-28 B>

The fields are populated with the details of the current dump on the primary dump device. These are the default settings. After the copy, the dump file is present in /var/adm/ras:

# ls -l dump_file_copy.BZ
-rw-r--r--    1 root     system     63894528 Oct 27 18:15 dump_file_copy.BZ

After a dump has occurred there may well be a minidump generated as well. The errorlog output then contains an entry such as:

F48137AC   1027180411 U O minidump       COMPRESSED MINIMAL DUMP

The minidump is a small compressed dump that will be present in /var/adm/ras. This file contains a snapshot of the system when the system was dumped or crashed. It can be used for diagnosis if the main dump is not present, because the dump was removed or not captured.

AIX ML/TL Upgrade Steps

1. Pre-installation checks

To check packages/file set consistency
# lppchk -v

If errors are found, get more information about the problem and resolve it before continuing with the installation.
# lppchk -v -m3

Check the current installed ML/TL
# instfix -i|grep ML
# oslevel -s
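
The level string can be split into its parts when it needs to be compared in a script. The sketch below assumes the usual four-part layout of release, technology level, service pack and build date; the helper name parse_oslevel and the sample value 6100-04-03-1009 are illustrative only.

```shell
#!/bin/sh
# Sketch: split an "oslevel -s" style string into its parts.
# Assumed layout: RELEASE-TL-SP-BUILDDATE. parse_oslevel is a hypothetical helper.
parse_oslevel() {
    release=$(printf '%s\n' "$1" | cut -d- -f1)
    tl=$(printf '%s\n' "$1" | cut -d- -f2)
    sp=$(printf '%s\n' "$1" | cut -d- -f3)
    build=$(printf '%s\n' "$1" | cut -d- -f4)
    echo "release=$release tl=$tl sp=$sp build=$build"
}

# Sample value for illustration; on AIX this would be: parse_oslevel "$(oslevel -s)"
parse_oslevel "6100-04-03-1009"
```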

Check Rootvg

Commit all package/fileset installed on the servers
# smit maintain_software

Check if rootvg is mirrored and all LVs are mirrored correctly (excluding dump and boot volumes). If your rootvg is not mirrored, you can skip the alt_disk_install part later in this document.
# lsvg -p rootvg
# lsvg rootvg
# lsvg -l rootvg

2. Preinstallation Task

Check for HACMP cluster

Check if cluster software is installed. Check whether HACMP is running on the server.

# lslpp -l | grep -i cluster
Check if the cluster processes are active
# lssrc -g cluster

If HACMP is used, a current fix pack for HACMP should be installed when a new AIX Technology Level is installed. Currently available HACMP fix packs can be downloaded via http://www14.software.ibm.com/webapp/set2/sas/f/hacmp/home.html

3. Check for IBM C/C++ compiler

Compiler updates need to be installed along with the TL upgrade. They can be downloaded from the link below.
http://www-1.ibm.com/support/docview.wss?rs=2239&uid=swg21110831

4. Check for Java Version

If Java is used, current software updates for the Java version(s) should be installed when a new AIX Technology Level is installed. If Java is being used in conjunction with other software, consult the vendor of that software for recommended Java levels

The Java version(s) installed on AIX can be identified with the commands
# lslpp -l | grep -i java

The default Java version can be identified with the command
# java -fullversion
Java fixes can be downloaded from the link below.
http://www14.software.ibm.com/webapp/set2/sas/f/hacmp/home.html

5. Check for recommended TL/SP for system

Get information about the latest TL/SP for the system using the Fix Level Recommendation Tool, available at the link below
http://www14.software.ibm.com/webapp/set2/flrt/home

Download the latest updates from the IBM Fix Central website and place them on the NIM server.

Create the resources on the NIM server.

Run a mksysb backup of the servers, to be on the safe side.

Check the compatibility of any running applications and confirm it with the application owner.

6. Free hdisk1 for alternate disk installation

Remove the secondary dump device, if present, from hdisk1. Then change the setting for the secondary dump device to /dev/sysdumpnull.
# sysdumpdev -P -s /dev/sysdumpnull

Unmirror rootvg
# unmirrorvg rootvg

Migrate any logical volumes that are not mirrored from hdisk1 to hdisk0.
# migratepv hdisk1 hdisk0

Clear the boot record from hdisk1
# chpv -c hdisk1

Add a new boot image to the first PV to have a "fresh" boot record, just to be on the safe side
# bosboot -ad /dev/hdisk0

Set the bootlist to hdisk0
# bootlist -m normal hdisk0 hdisk1 (after installation, hdisk1 will contain the upgraded OS)

Remove the second PV from rootvg
# reducevg rootvg hdisk1

7. Alternate disk migration

Carry out the alternate disk installation via NIM on hdisk1. First run a preview install; if it succeeds, go ahead and install the TL/SP in applied mode.
# smit nimadm

Reboot system. It will be booted from hdisk1 which contains upgraded OS.
# shutdown -Fr

8. Recreate the mirror of rootvg

Do this after a few days of stable operation and some testing by the application users.

Remove the alternate disk installation
# alt_disk_install -X

Add disk hdisk0 to rootvg
# extendvg rootvg hdisk0

Check the estimated dump size
# sysdumpdev -e

Re-create the secondary dump device
# sysdumpdev -P -s "dump_device"

Mirror rootvg onto hdisk0 in the background.
# nohup mirrorvg -S rootvg hdisk0 &

Create a boot image on hdisk0
# bosboot -ad /dev/hdisk0

Add both disks to the bootlist
# bootlist -m normal hdisk0 hdisk1

Synchronize rootvg
# nohup syncvg -v rootvg &

To Do List - Before migration from AIX 5.3 to 6.1

Copy sendmail.cf
Tar the Perl modules if you use them in 5.3; Perl will be upgraded when going to 6.1 and you might want to fall back to the older version
Make sure you note the number of allowed environment processes; this will fall back to the default, which is 128 allowed processes. **THIS IS CRITICAL**
Copy the MOTD ("Message of the Day")
Make sure you turn on RSH, since NIMSH/RSH is used for the upgrade
Make sure you note the NIC tuning (TCP send/receive); you will have to put it back afterwards
Memory tuning (no -a): please review it after the upgrade. Upgrading to 6.1 always applies the best tuning parameters, but review them to make sure they are good
After the upgrade, run "lppchk -v" and "lppchk -c" to make sure your AIX 6.1 installation is consistent
Copy "/etc/security/limits" and make sure the limits are the same after the upgrade
Update the NMON script and cron job to reference the new AIX 6.1

Please note: if you make copies or tars, do it before your nimadm migration, so that the copied files are there when you reboot to the 6.1 disk. Also, when you put the files back, take note of the permissions, owner and group of the files.

Two ways to create mksysb images in AIX

1) Create the image from the NIM server with the command:

nim -o define -t mksysb -a server=master -a source=<server name> -a mk_image=yes -a location=<location of the store image> <mksysb image name>

This will create the mksysb image of the client server and define it on the NIM server.

Example:
nim -o define -t mksysb -a server=master -a source=edppbuslvd01 -a mk_image=yes -a location=/nim/mksysb/edppbuslvd01_6100-04-03-05112010 edppbuslvd01_6100-04-03-05112010

server=master: the server that stores the image, in this case the master
source=edppbuslvd01: the source of the image, which is the client
location: the location of the stored mksysb image

2) Create the image on the client machine, then copy it to the NIM server and define it there; or NFS mount the filesystem from the NIM server on the client.

Let's say you have successfully NFS mounted the NIM server filesystem on the client machine as /mnt.

mksysb -ieX /mnt/edppbuslvd01_6100-04-03-05112010

-e: excludes the files and directories listed in /etc/exclude.rootvg
-i: calls the mkszfile command to generate the /image.data file
The /image.data file contains information on volume groups, logical volumes, file systems, paging space, and physical volumes.
This information is included in the backup for future use by the installation process.
-X: automatically expands /tmp if necessary

After the mksysb image is created, you need to define it on the NIM server.

nim -o define -t mksysb -a server=master -a location=<image location> <image name>
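
The first method can be wrapped so that the image name and location stay in step. A minimal sketch: the helper name nim_define_cmd is hypothetical, and reading the example name as client, OS level and a date stamp is an assumption based on edppbuslvd01_6100-04-03-05112010. The function only prints the nim command so it can be reviewed before it is run on the NIM master.

```shell
# nim_define_cmd CLIENT LEVEL STAMP DIR
# Hypothetical helper: build the image name as CLIENT_LEVEL-STAMP and print
# the "nim -o define" command from method 1. Nothing is executed.
nim_define_cmd() {
    client=$1
    level=$2
    stamp=$3
    dir=$4
    name="${client}_${level}-${stamp}"
    printf 'nim -o define -t mksysb -a server=master -a source=%s -a mk_image=yes -a location=%s/%s %s\n' \
        "$client" "$dir" "$name" "$name"
}

# Example:
#   nim_define_cmd edppbuslvd01 6100-04-03 05112010 /nim/mksysb
```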

Steps to remove PowerPath software, cleanup ODM and reinstall PowerPath

Vary off the volume group (varyoffvg <VGNAME>)

/etc/rc.agent stop (if you have CLARiiON devices)

Remove paths from Powerpath configuration
powermt remove hba=all

Delete all hdiskpower devices
lsdev -Cc disk -Fname | grep power | xargs -n1 rmdev -dl

Remove the PowerPath driver instance
rmdev -dl powerpath0

Delete all hdisk devices

- for Symmetrix devices, use this command:

lsdev -CtSYMM* -Fname | xargs -n1 rmdev -dl

- for CLARiiON devices, use this command:

lsdev -CtCLAR* -Fname | xargs -n1 rmdev -dl

Confirm with lsdev -Cc disk that there are no EMC hdisks or hdiskpowers

***If needed:

odmdelete -q name=powerpath0 -o CuDv

odmdelete -q name=powerpath0 -o CuAt

rm /dev/powerpath0

odmget CuDv |grep hdisk

odmdelete -q name=xxxxx -o CuDv (value you get from above NOT ROOTVG DISK)

odmget CuAt |grep hdisk

odmdelete -q name=xxxxx -o CuAt (value you get from above NOT ROOTVG DISK)

odmget CuDvDr|grep hdisk

odmdelete -q value3=xxxxxxxx -o CuDvDr (value you get from above NOT ROOTVG DISK)

odmget CuVPD|grep hdisk

odmdelete -q name=xxxxxxx -o CuVPD (value you get from above NOT ROOTVG DISK)

odmget CuDvDr|grep hdisk

odmdelete -q value3=xxxxxxx -o CuDvDr (value you get from above NOT ROOTVG DISK)

cd /dev

rm hdiskxxxx (NOT ROOTVG DISK)

rm rhdiskxxxx (NOT ROOTVG DISK)

rm hdiskpower*

rm rhdiskpower*

savebase -v

***

Remove all Fibre Channel driver instances
rmdev -Rdl fscsiX ---> X being the driver instance number, i.e. 0, 1, 2, etc.

Verify through lsdev -Cc driver that there are no more Fibre Channel driver instances (fscsi)

Change the adapter instances to the Defined state
rmdev -l fcsX ---> X being the adapter instance number, i.e. 0, 1, 2, etc.

Create the hdisk entries for all EMC devices

--> remove Clarrays definition here.

--> install the ODM definitions (if you are reinstalling the ODM)

emc_cfgmgr or cfgmgr -vl fcsx ---> x being each adapter instance which was rebuilt

Skip this part if there is no PowerPath.

Configure all EMC devices into PowerPath
powermt config

Check the system to see if it now displays correctly
powermt display
powermt display dev=all
lsdev -Cc disk
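
The "Delete all hdiskpower devices" step can be previewed before anything is removed. A minimal sketch: the helper name preview_rmdev is hypothetical. It reads device names on standard input (for example from lsdev -Cc disk -Fname), keeps the ones containing "power" as the grep in the procedure does, and prints the rmdev commands instead of running them.

```shell
# preview_rmdev
# Hypothetical helper: read device names (one per line) on stdin and print
# the "rmdev -dl" command for each name containing "power".
# Nothing is deleted; the output is only a preview.
preview_rmdev() {
    grep power | while read -r dev; do
        echo "rmdev -dl $dev"
    done
}

# Example on the AIX host:
#   lsdev -Cc disk -Fname | preview_rmdev
```

Reviewing the printed list first helps confirm that no rootvg disk is in it.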

Unix filesystems explained

A filesystem is a logical collection of files on a partition or disk. A partition is a container for information and can span an entire hard drive if desired.
Everything in Unix is considered to be a file, including physical devices such as DVD-ROMs, USB devices, floppy drives, and so forth.

Unix uses a hierarchical file system structure, much like an upside-down tree, with root (/) at the base of the file system and all other directories spreading from there.
A UNIX filesystem is a collection of files and directories that has the following properties:

It has a root directory (/) that contains other files and directories.
Each file or directory is uniquely identified by its name, the directory in which it resides, and a unique identifier, typically called an inode.
By convention, the root directory has an inode number of 2 and the lost+found directory has an inode number of 3. Inode numbers 0 and 1 are not used. File inode numbers can be seen by specifying the -i option to the ls command.
It is self contained. There are no dependencies between one filesystem and any other.
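
The relationship between names and inodes can be checked with ls -i. A minimal sketch: the helper name inode_of is hypothetical. It prints only the inode number of a path, so two names can be compared; a hard link shares the inode of the file it links to.

```shell
# inode_of PATH
# Hypothetical helper: print the inode number of PATH.
# -d lists a directory itself rather than its contents; -i adds the inode.
inode_of() {
    ls -di -- "$1" | awk '{ print $1 }'
}

# Example:
#   inode_of /          # inode of the root directory of the mounted filesystem
#   inode_of /etc/hosts
```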

The directories have specific purposes and generally hold the same types of information for easily locating files. Following are the directories that exist on the major versions of Unix:

Directory	Description
/	This is the root directory which should contain only the directories needed at the top level of the file structure.
/bin	This is where the executable files are located. They are available to all users.
/dev	Contains device files.
/etc	Supervisor directory: commands, configuration files, disk configuration files, valid user lists, groups, ethernet, hosts, and where to send critical messages.
/lib	Contains shared library files and sometimes other kernel-related files.
/boot	Contains files for booting the system.
/home	Contains the home directory for users and other accounts.
/mnt	Used to mount other temporary file systems, such as cdrom and floppy for the CD-ROM drive and floppy diskette drive, respectively
/proc	Contains all processes marked as a file by process number or other information that is dynamic to the system.
/tmp	Holds temporary files used between system boots
/usr	Used for miscellaneous purposes, or can be used by many users. Includes administrative commands, shared files, library files, and others
/var	Typically contains variable-length files such as log and print files and any other type of file that may contain a variable amount of data
/sbin	Contains binary (executable) files, usually for system administration. For example, the fdisk and ifconfig utilities.
/kernel	Contains kernel files

Flavors of UNIX

The table below summarizes some of the common UNIX variants and clones. While the table lists about forty different variants, the UNIX world isn't nearly as diverse as it used to be. Some of them are defunct and are listed for historical purposes. Others are on their way out. In some cases, vendors have defected to Microsoft technology. In others, mergers and acquisitions have led to the consolidation of different UNIX implementations. A list of "dead" UNIX implementations would be substantial indeed, consisting of hundreds of variations on the letters "U," "I," and "X" (CLIX, CX/UX, MV/UX, SINIX, VENIX, etc.).

**UNIX Variants and Clones**
UNIX Variant	Company/Org.	For More Info
A/UX	Apple Computer, Inc.	defunct
AIX	IBM	http://www.rs6000.ibm.com/software/
AT&T System V	AT&T	defunct
BS2000/OSD-BC	Siemens AG	http://www.siemens.com/servers/bs2osd/
BSD/OS	Berkeley Software Design, Inc.	http://www.bsdi.com
CLIX	Intergraph Corp.	http://www.intergraph.com
Debian GNU/Hurd	Software in the Public Interest, Inc.	http://www.gnu.org/software/hurd/debian-gnu-hurd.html
Debian GNU/Linux	Software in the Public Interest, Inc.	http://www.debian.org
DG/UX	Data General Corp.	http://www.dg.com/products/html/dg_ux.html
Digital Unix	Compaq Computer Corporation	http://www.unix.digital.com/
DYNIX/ptx	Sequent Computer Systems, Inc.	http://www.sequent.com/products/software/operatingsys/dynix.html
Esix UNIX	Esix Systems	http://www.esix.com/
FreeBSD	FreeBSD group	http://www.freebsd.org
GNU Hurd	GNU organization	http://www.gnu.org
HAL SPARC64/OS	HAL Computer Systems, Inc.	http://www.hal.com
HP-UX	Hewlett-Packard Company	http://www.hp.com/unixwork/hpux/
Irix	Silicon Graphics, Inc.	http://www.sgi.com/software/irix6.5/
Linux	several	http://www.linux.org
LynxOS	Lynx Real-Time Systems, Inc.	http://www.lynx.com/products/lynxos.html
MachTen	Tenon Intersystems	http://www.tenon.com/products/machten/
MacOS X Server	Apple Computer, Inc.	http://www.apple.com/macosx/
Minix	none	http://www.cs.vu.nl/~ast/minix.html
MkLinux	Apple Computer, Inc.	http://www.mklinux.apple.com
NCR UNIX SVR4 MP-RAS	NCR Corporation	http://www3.ncr.com/product/integrated/software/p2.unix.html
NetBSD	NetBSD group	http://www.netbsd.org
NeXTSTEP	NeXT Computer Inc.	defunct, see http://www.apple.com/enterprise/
NonStop-UX	Compaq Computer Corporation	http://www.tandem.com
OpenBSD	OpenBSD group	http://www.openbsd.org
OpenLinux	Caldera Systems, Inc.	http://www.calderasystems.com
Openstep	Apple Computer, Inc.	http://www.apple.com/enterprise/
QNX Realtime OS	QNX Software Systems Ltd.	http://www.qnx.com/products/os/qnxrtos.html
Red Hat Linux	Red Hat Software, Inc.	http://www.redhat.com/
Reliant UNIX	Siemens AG	http://www.siemens.com/servers/rm/
Solaris	Sun Microsystems	http://www.sun.com/software/solaris/
SunOS	Sun Microsystems	defunct
SuSE	S.u.S.E., Inc.	http://www.suse.com
UNICOS	Silicon Graphics, Inc.	http://www.sgi.com/software/unicos/
UnixWare	SCO -- The Santa Cruz Operation Inc.	http://www.sco.com/unix/
UTS	Amdahl Corporation	http://www.amdahl.com/uts/

RAM disk in AIX

AIX provides the 'mkramdisk' command for creating a disk that resides in RAM, for very I/O-intensive applications such as databases.
Here is a simple set of commands to create a RAM disk and a filesystem on top of it:

1. Create a RAM disk, specifying the size

# mkramdisk 5G

The system assigns the next available RAM disk. Since this is the first one, it is called ramdisk0.

2. Check for the new disk

# ls -l /dev | grep -i ram

If there isn't sufficient memory available, the mkramdisk command warns about it during creation.

3. Create and mount a filesystem on top of the RAM disk

/sbin/helpers/jfs2/mkfs -V jfs2 /dev/ramdiskx

mount -V jfs2 -o log=NULL /dev/ramdiskx /ramdiskx

The new filesystem will now be available like any other FS.

To remove a RAM disk, unmount and remove the filesystem, then use the 'rmramdisk' command to remove the RAM disk.
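
When sizing a RAM disk it can help to see what a size such as 5G means in blocks. A minimal sketch, assuming that mkramdisk counts a bare number in 512-byte blocks and accepts an M or G suffix for megabytes and gigabytes (check the mkramdisk man page on your AIX level). The helper name ramdisk_blocks is hypothetical.

```shell
# ramdisk_blocks SIZE
# Hypothetical helper: convert a size such as 5G or 512M into 512-byte blocks.
# A bare number is assumed to already be a block count.
ramdisk_blocks() {
    size=$1
    num=${size%[MG]}                            # strip a trailing M or G, if any
    case $size in
        *G) echo $(( num * 2097152 )) ;;        # 1 GB = 2097152 blocks of 512 bytes
        *M) echo $(( num * 2048 )) ;;           # 1 MB = 2048 blocks of 512 bytes
        *)  echo "$num" ;;
    esac
}

# Example:
#   ramdisk_blocks 5G
```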

How to clear AIX NFS cache on a server

Do the following on a server that is having problems exporting NFS mounts
------------------------------------------------------------------------------------

1) Move the current exports file to another name
       mv /etc/exports /etc/exports.old

2) Create a new exports file
       touch /etc/exports

3) Unexport everything
       exportfs -ua

4) Stop NFS
       stopsrc -g nfs

5) Stop portmapper
       stopsrc -s portmap

6) Change directory to /etc and remove or rename the following files if they exist.
       rm -rf xtab state sm sm.bak rmtab

7) Change directory to /var/statmon and remove the status monitoring files.
       rm -rf state sm sm.bak

8) start the portmapper
       startsrc -s portmap

9)   start nfs
       startsrc -g nfs

10) re-export what is left in /etc/exports
       exportfs -va

11) refresh the inetd daemon subsystem
       refresh -s inetd

12) Move the /etc/exports file that you backed up back in place.
       mv /etc/exports.old /etc/exports

13) export all directories in /etc/exports
       exportfs -a

Procedure to mount and unmount NFS filesystems on AIX

1) Show what is being exported on the source server

     showmount -e

    Note: If the command above does not show the correct mount points
    that need to be exported, you can run the following command to attempt
    to export the filesystems.

     exportfs -a

2) To unmount a filesystem on the source server that is NFS-mounted on other systems:

     a) Unmount the NFS mount points on the target servers.

          umount (filesystems)   (run on the target servers)

     b) Unmount the filesystem on the source server once it has been
         unmounted on the target servers.

           umount (filesystems)

3) Mount the NFS mount points on the target server.

     a) mount (IP):(mount point) (mount point)

Replace failed mirrored internal disk in AIX

The following procedure should be used to replace a failed internal (boot) disk on AIX 5 or higher, with software mirroring.
(Note: in these examples, hdisk0 and hdisk1 are doubly-mirrored internal disks and members of rootvg; hdisk1 has failed)

1. Identify the failed disk by analyzing the errpt logs. Confirm the failure using lspv by checking if "PV State" is "Missing".

2. Break the mirror and remove the device from AIX:

# unmirrorvg rootvg hdisk1
# reducevg rootvg hdisk1
# rmdev -l hdisk1 -d

3. Confirm that the device is no longer present using lspv.

4. Replace the disk drive, letting the new device take the same device name (hdisk1).

5. Add the new device into rootvg:

# extendvg rootvg hdisk1

6. Re-mirror the volume group. No additional arguments are required to doubly-mirror the two internal disks.

# mirrorvg rootvg

7. Re-add the boot image to the new internal disk:

# bosboot -ad /dev/hdisk1

8. Re-add the new disk to the bootlist and confirm it is present:

# bootlist -m normal hdisk0 hdisk1
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk1 blv=hd5
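
The final check in step 8 can be scripted. A minimal sketch: the helper name bootlist_missing is hypothetical. It reads output in the format shown above (disk name in the first column) on standard input and prints any expected disk that is not listed, so empty output means both mirrors are in the boot list.

```shell
# bootlist_missing DISK...
# Hypothetical helper: read "bootlist -m normal -o" style output on stdin
# and print each expected DISK that does not appear in the first column.
bootlist_missing() {
    listed=$(awk '{ print $1 }')
    for disk in "$@"; do
        printf '%s\n' "$listed" | grep -qx "$disk" || echo "$disk"
    done
}

# Example on the AIX host:
#   bootlist -m normal -o | bootlist_missing hdisk0 hdisk1
```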

Linux boot process

In this topic we will discuss the Linux boot sequence in depth: how does a Linux system boot?
This will help Unix administrators troubleshoot boot-up problems.
Before discussing it, I will note down the major components responsible for the booting process that we need to know.

        1.BIOS(Basic Input/Output System)
        2.MBR(Master Boot Record)
        3.LILO or GRUB
             LILO:-LInux LOader
             GRUB:-GRand Unified Bootloader
        4.Kernel
        5.init
        6.Run Levels

1.BIOS:
      i. When we power on, the BIOS performs a Power-On Self-Test (POST) of the different hardware components in the system to make sure everything is working properly.
     ii. It also checks whether the computer is being started from a powered-off state (cold boot) or from a restart (warm boot).
     iii. It retrieves information from CMOS (Complementary Metal-Oxide Semiconductor), a battery-operated memory chip on the motherboard that stores the time, date, and critical system information.
     iv. Once the BIOS finds that everything is fine, it searches all available drives (hard disks, CD-ROM drive, etc.) for a valid master boot sector.
     v. Once the BIOS finds a valid MBR, it loads and executes that 512-byte boot sector, which is the first sector ("Sector 0") of a partitioned data storage device such as a hard disk.
2.MBR
     i. Normally a multi-level boot loader is used. Here, MBR refers to the DOS MBR.
     ii. After the BIOS executes a valid DOS MBR, the DOS MBR searches for a valid primary partition marked as bootable on the hard disk.
     iii. If the MBR finds a valid bootable primary partition, it executes the first 512 bytes of that partition, which is the second-level MBR.
     iv. In Linux there are two types of the second-level MBR mentioned above, known as LILO and GRUB.
3.LILO
     i. LILO is a Linux boot loader that is too big to fit into a single 512-byte sector.
     ii. It is therefore divided into two parts: an installer and a runtime module.
     iii. The installer module places the runtime module on the MBR. The runtime module has the information about all installed operating systems.
     iv. When the runtime module is executed, it selects the operating system to load and transfers control to the kernel.
     v. LILO does not understand filesystems; it treats the boot images to be loaded as raw disk offsets.
GRUB
     i. The GRUB MBR consists of 446 bytes of primary boot loader code and 64 bytes of partition table.
     ii. GRUB locates all the installed operating systems and presents a menu to select the operating system to be loaded.
     iii. Once the user selects the operating system, GRUB passes control to the kernel of that operating system.
See below for the differences between LILO and GRUB.
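
The MBR layout mentioned in the GRUB section adds up as follows: 446 bytes of boot loader code plus 64 bytes of partition table leave 2 bytes in the 512-byte sector, which hold the boot signature 0x55 0xAA. A minimal sketch of checking that signature on a disk image or device file; the helper name has_boot_signature is hypothetical.

```shell
# has_boot_signature FILE
# Hypothetical helper: succeed if bytes 510 and 511 of FILE (the last two
# bytes of the first 512-byte sector) are 0x55 0xAA.
has_boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example (reading a raw disk normally requires root):
#   has_boot_signature /dev/sda && echo "valid boot sector signature"
```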
4.Kernel
     i. Once GRUB or LILO transfers control to the kernel, the kernel performs the following tasks:

Initialises devices and loads the initrd module
Mounts the root filesystem

5.Init
     i. Once it is loaded, the kernel finds init in /sbin (/sbin/init) and executes it.
     ii. Hence the first process started in Linux is the init process.
     iii. The init process reads the /etc/inittab file, sets the path, starts swapping, checks the file systems, and so on.
     iv. It runs all the boot scripts (/etc/rc.d/*, /etc/rc.boot/*).
     v. It starts the system at the run level specified in the file /etc/inittab.

6.Runlevel
     i. There are 7 run levels in which the Linux OS runs, and different run levels serve different purposes. The descriptions are given below.

0 – halt
1 – Single user mode
2 – Multiuser, without NFS (The same as 3, if you don’t have networking)
3 – Full multiuser mode
4 – unused
5 – X11
6 – Reboot

ii. We can set the run level in which we want to run our operating system by defining it in the /etc/inittab file.
The operating system then boots up according to our setting in /etc/inittab and finishes the boot-up process.
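
The default run level is set by the initdefault entry in /etc/inittab, for example id:3:initdefault: on a SysV-style Linux system. A minimal sketch of reading it back; the helper name default_runlevel is hypothetical, and the file is passed as an argument so that a copy can be inspected as well.

```shell
# default_runlevel FILE
# Hypothetical helper: print the run level of the first non-comment
# "initdefault" entry in an inittab-style file (id:runlevel:action:process).
default_runlevel() {
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }' "$1"
}

# Example:
#   default_runlevel /etc/inittab
```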
Below are a few important differences between LILO and GRUB

LILO	GRUB
LILO has no interactive command interface	GRUB has interactive command interface
LILO does not support booting from a network	GRUB does support booting from a network
If you change your LILO config file, you have to rewrite the LILO stage one boot loader to the MBR	GRUB automatically detects any change in the config file and loads the OS accordingly
LILO supports only the Linux operating system	GRUB supports a large number of operating systems

AIX Boot Process

The AIX boot process has three phases:

1. ROS kernel init phase
2. Base device configuration
3. System boot phase

1. ROS kernel init phase (PHASE1)

A. POST (power-on self-test)

The POST performs basic hardware checks.

B. It then goes to NVRAM and checks the boot list for the last boot device (hdisk0 or hdisk1).

C. It then checks the BLV (hd5) on the boot device.

D. It then checks the boot image.

E. The boot image is then moved to memory.

F. The kernel then executes.

2. Base Device configuration (PHASE2)

A. Here cfgmgr runs to configure the devices.

3. System Boot Phase (PHASE3)

A. The kernel executes.
B. The paging space (hd6) is started.
C. The following file systems are then mounted: /, /var, /usr, /home, /tmp
D. The kernel starts the init process, which reads the /etc/inittab file and executes the following:

/etc/rc.boot,
srcmstr
/etc/rc.tcpip
/etc/rc.net

The network-related files above, /etc/rc.tcpip and /etc/rc.net, are used to configure the IP address and routing.

E. The system then starts at the default run level, 2.

NOTE:

Run level 2: contains all of the terminal processes and daemons that run in the multi-user environment. This is the default run level.

The /etc/inittab file contains four colon-separated fields: 1. Identifier, 2. Runlevel, 3. Action, 4. Command
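
Each /etc/inittab entry is written as identifier:runlevel:action:command. A minimal sketch that splits one entry into its four fields; the helper name show_inittab_fields is hypothetical and the srcmstr line in the usage comment is only an illustrative entry. Everything after the third colon is kept as the command, since a command may itself contain colons.

```shell
# show_inittab_fields ENTRY
# Hypothetical helper: print the four colon-separated fields of one
# inittab entry. The command field keeps any colons it contains.
show_inittab_fields() {
    entry=$1
    printf 'identifier=%s\n' "$(printf '%s\n' "$entry" | cut -d: -f1)"
    printf 'runlevel=%s\n'   "$(printf '%s\n' "$entry" | cut -d: -f2)"
    printf 'action=%s\n'     "$(printf '%s\n' "$entry" | cut -d: -f3)"
    printf 'command=%s\n'    "$(printf '%s\n' "$entry" | cut -d: -f4-)"
}

# Example (illustrative entry):
#   show_inittab_fields 'srcmstr:23456789:respawn:/usr/sbin/srcmstr'
```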