


Veritas Cluster Cheat Sheet

VCS is built on three components: LLT, GAB, and VCS itself. LLT handles kernel-to-kernel communication over the LAN heartbeat links, GAB handles cluster membership and reliable messaging between cluster members, and VCS handles the management of services.

Once cluster members can communicate via LLT and GAB, VCS is started.
In the VCS configuration, each cluster contains systems, Service Groups, and Resources. A Service Group contains a list of systems belonging to that group, a list of systems on which the group should
be started, and Resources. A Resource is something controlled or monitored by VCS, such as network interfaces, logical IPs, mount points, physical/logical disks, processes, and files. Each resource
corresponds to a VCS agent, which actually handles VCS control over the resource.

VCS configuration can be set either statically through a configuration file, dynamically through the CLI, or both. LLT and GAB configurations are primarily set through configuration files.

Configuration

VCS configuration is fairly simple. The three configurations to worry about are LLT, GAB, and VCS resources.

LLT

LLT configuration requires two files: /etc/llttab and /etc/llthosts.
llttab contains information on the node ID, cluster membership, and heartbeat links. It should look something like this (the node name, cluster ID, and NIC devices below are illustrative):

# llttab -- low-latency transport configuration file
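# the remaining lines are an illustrative sketch; the node name, cluster ID
# and NIC devices (Solaris qfe examples) are assumptions -- adapt to your site
set-node node1
set-cluster 2
# one "link" line per private heartbeat NIC
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -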



GAB

GAB requires only one configuration file, /etc/gabtab. This file lists the number of nodes in the cluster and, if there are any communication disks in the system, their configuration. Ex:

/sbin/gabconfig -c -n2

tells GAB to start and to seed the cluster once 2 nodes are communicating.

LLT and GAB

VCS uses two components, LLT and GAB, to share data over the private networks among systems.
These components provide the performance and reliability required by VCS.

LLT LLT (Low Latency Transport) provides fast, kernel-to-kernel communications and monitors network connections. The system administrator configures LLT by creating a configuration file (llttab) that describes the systems in the cluster and the private network links among them. LLT runs at layer 2 of the network stack.
GAB GAB (Group Membership and Atomic Broadcast) provides the global message order required to maintain a synchronised state among the systems, and monitors disk communications such as those required by the VCS heartbeat utility. The system administrator configures the GAB driver by creating a configuration file (gabtab).

LLT and GAB files

/etc/llthosts The file is a database, containing one entry per system, that links the LLT system ID with the host name. The file is identical on each server in the cluster (see the example after this list).
/etc/llttab The file contains information that is derived during installation and is used by the utility lltconfig.
/etc/gabtab The file contains the information needed to configure the GAB driver. This file is used by the gabconfig utility.
/etc/VRTSvcs/conf/config/main.cf The VCS configuration file. The file contains the information that defines the cluster and its systems.
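
For a two-node cluster, /etc/llthosts might look like this (node names are illustrative and must match the set-node values in llttab):

0 node1
1 node2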

Gabtab Entries

/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 16 -S 1123
/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 144 -S 1124
/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 16 -p a -S 1123
/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 144 -p h -S 1124
/sbin/gabconfig -c -n2

gabdiskconf
-i   Initialises the disk region
-s   Start Block
-S   Signature
gabdiskhb (heartbeat disks)
-a   Add a gab disk heartbeat resource
-s   Start Block
-p   Port
-S   Signature
gabconfig
-c   Configure the driver for use
-n   Number of systems in the cluster.

LLT and GAB Commands


verify that links are active for LLT lltstat -n
verbose output of the lltstat command lltstat -nvv | more
open ports for LLT lltstat -p
display the values of LLT configuration directives lltstat -c
lists information about each configured LLT link lltstat -l
List all MAC addresses in the cluster lltconfig -a list
stop LLT lltconfig -U
start LLT lltconfig -c
verify that GAB is operating gabconfig -a
Note: port a indicates that GAB is communicating, port h indicates that VCS is started
stop GAB gabconfig -U
start GAB gabconfig -c -n <number of nodes>
override the seed values in the gabtab file gabconfig -c -x
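
For reference, on a healthy two-node cluster "lltstat -n" reports both nodes OPEN on all links, roughly like this (node names are illustrative and the exact layout varies by version):

LLT node information:
    Node   State   Links
   * 0 node1   OPEN    2
     1 node2   OPEN    2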

GAB Port Membership


List Membership gabconfig -a
Unregister port f /opt/VRTS/bin/fsclustadm cfsdeinit
Port Function
a   GAB driver
b   I/O fencing (designed to guarantee data integrity)
d   ODM (Oracle Disk Manager)
f   CFS (Cluster File System)
h   VCS (VERITAS Cluster Server: high availability daemon)
o   VCSMM driver (kernel module needed for Oracle and VCS interface)
q   QuickLog daemon
v   CVM (Cluster Volume Manager)
w   vxconfigd (module for cvm)
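
On a healthy two-node cluster running VCS, "gabconfig -a" shows at least ports a and h with both members present, roughly like this (the gen values are illustrative):

GAB Port Memberships
===============================================================
Port a gen a36e0003 membership 01
Port h gen fd570002 membership 01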

Cluster daemons


High Availability Daemon had
Companion Daemon hashadow
Resource Agent daemon <resource>Agent
Web Console cluster management daemon CmdServer

Cluster Log Files

Log Directory /var/VRTSvcs/log
primary log file (engine log file) /var/VRTSvcs/log/engine_A.log

Starting and Stopping the cluster


"-stale" instructs the engine to treat the local config as stale
"-force" instructs the engine to treat a stale config as a valid one
start the cluster on the local server hastart [-stale|-force]
bring the cluster into running mode from a stale state using the configuration file from a particular server hasys -force <server_name>
stop the cluster on the local server but leave the application/s running, do not failover the application/s hastop -local
stop cluster on local server but evacuate (failover) the application/s to another node within the cluster hastop -local -evacuate
stop the cluster on all nodes but leave the application/s running hastop -all -force
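
Putting these together, a typical single-node maintenance workflow might look like this (a sketch; whether groups return afterwards depends on your failback policy):

hastop -local -evacuate    # fail the service groups over to another node
# ... perform maintenance on this node ...
hastart                    # rejoin the cluster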

Cluster Status


display cluster summary hastatus -summary
continually monitor cluster hastatus
verify the cluster is operating hasys -display
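
"hastatus -summary" output looks roughly like this (system and group names are illustrative):

-- SYSTEM STATE
-- System               State                Frozen
A  node1                RUNNING              0
A  node2                RUNNING              0

-- GROUP STATE
-- Group           System    Probed     AutoDisabled    State
B  groupw          node1     Y          N               ONLINE
B  groupw          node2     Y          N               OFFLINE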

Cluster Details



information about a cluster haclus -display
value for a specific cluster attribute haclus -value <attribute>
modify a cluster attribute haclus -modify <attribute name> <new>
Enable LinkMonitoring haclus -enable LinkMonitoring
Disable LinkMonitoring haclus -disable LinkMonitoring

Users


add a user hauser -add <username>
modify a user hauser -update <username>
delete a user hauser -delete <username>
display all users hauser -display

System Operations


add a system to the cluster hasys -add <sys>
delete a system from the cluster hasys -delete <sys>
modify a system's attributes hasys -modify <sys> <modify options>
display a system's state hasys -state
force a system to start hasys -force
display a system's attributes hasys -display [-sys]
list all the systems in the cluster hasys -list
change the load attribute of a system hasys -load <system> <value>
display the value of a system's node ID (/etc/llthosts) hasys -nodeid
freeze a system (no offlining of the system, no onlining of groups) hasys -freeze [-persistent][-evacuate]
Note: main.cf must be in write mode
unfreeze a system (re-enable onlining of groups and resources) hasys -unfreeze [-persistent]
Note: main.cf must be in write mode
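
For example, to take an illustrative node sun1 out of service across reboots and hand its groups to another node:

haconf -makerw                             # persistent freeze requires write mode
hasys -freeze -persistent -evacuate sun1
haconf -dump -makero
# ... maintenance ...
haconf -makerw
hasys -unfreeze -persistent sun1
haconf -dump -makero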

Dynamic Configuration 

The VCS configuration must be in read/write mode in order to make changes. While the configuration is in
read/write mode it is considered stale, and a .stale file is created in $VCS_CONF/conf/config. When the
configuration is put back into read-only mode the .stale file is removed.
Change configuration to read/write mode haconf -makerw
Change configuration to read-only mode haconf -dump -makero
Check what mode cluster is running in haclus -display |grep -i 'readonly'
0 = write mode
1 = read only mode
Check the configuration file hacf -verify /etc/VRTSvcs/conf/config
Note: you can point to any directory as long as it has main.cf and types.cf
convert a main.cf file into cluster commands hacf -cftocmd /etc/VRTSvcs/conf/config -dest /tmp
convert a command file into a main.cf file hacf -cmdtocf /tmp -dest /etc/VRTSvcs/conf/config
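
A cautious offline edit of main.cf could therefore look like this (a sketch, assuming the applications can keep running while VCS is down):

hastop -all -force                       # stop VCS on all nodes, apps keep running
vi /etc/VRTSvcs/conf/config/main.cf      # edit the configuration
hacf -verify /etc/VRTSvcs/conf/config    # syntax-check before restarting
hastart                                  # then restart VCS on each node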

Service Groups


add a service group haconf -makerw
  hagrp -add groupw
  hagrp -modify groupw SystemList sun1 1 sun2 2
  hagrp -autoenable groupw -sys sun1
haconf -dump -makero
delete a service group haconf -makerw
  hagrp -delete groupw
haconf -dump -makero
change a service group haconf -makerw
  hagrp -modify groupw SystemList sun1 1 sun2 2 sun3 3
haconf -dump -makero
Note: use the "hagrp -display <group>" to list attributes
list the service groups hagrp -list
list a group's dependencies (see the hagrp -link example below) hagrp -dep <group>
list the parameters of a group hagrp -display <group>
display a service group's resource hagrp -resources <group>
display the current state of the service group hagrp -state <group>
clear a faulted non-persistent resource in a specific group hagrp -clear <group> [-sys <system>]
Change the system list in a cluster # remove the host
hagrp -modify grp_zlnrssd SystemList -delete <hostname>
# add the new host (don't forget to state its position)
hagrp -modify grp_zlnrssd SystemList -add <hostname> 1
# update the autostart list
hagrp -modify grp_zlnrssd AutoStartList <host> <host>
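
Group dependencies (listed with "hagrp -dep" above) are created with hagrp -link; for example, an online local firm dependency between two illustrative groups, where the parent groupy can only come online where the child groupw is online:

haconf -makerw
hagrp -link groupy groupw online local firm
haconf -dump -makero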

Service Group Operations


Start a service group and bring its resources online hagrp -online <group> -sys <sys>
Stop a service group and take its resources offline hagrp -offline <group> -sys <sys>
Switch a service group from one system to another hagrp -switch <group> -to <sys>
Enable all the resources in a group hagrp -enableresources <group>
Disable all the resources in a group hagrp -disableresources <group>
Freeze a service group (disable onlining and offlining) hagrp -freeze <group> [-persistent]
note: use the following to check "hagrp -display <group> | grep TFrozen"
Unfreeze a service group (enable onlining and offlining) hagrp -unfreeze <group> [-persistent]
note: use the following to check "hagrp -display <group> | grep TFrozen"
Enable a service group (only enabled groups can be brought online) haconf -makerw
  hagrp -enable <group> [-sys]
haconf -dump -makero
Note: to check, run "hagrp -display | grep Enabled"
Disable a service group (prevent it from being brought online) haconf -makerw
  hagrp -disable <group> [-sys]
haconf -dump -makero
Note: to check, run "hagrp -display | grep Enabled"
Flush a service group and enable corrective action. hagrp -flush <group> -sys <system>
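
For example, a planned failover drill of an illustrative group from one node to another:

hagrp -switch groupw -to sun2    # move the group and its resources to sun2
hagrp -state groupw              # confirm the new ONLINE/OFFLINE states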

Resources


add a resource haconf -makerw
  hares -add appDG DiskGroup groupw
  hares -modify appDG Enabled 1
  hares -modify appDG DiskGroup appdg
  hares -modify appDG StartVolumes 0
haconf -dump -makero
delete a resource haconf -makerw
  hares -delete <resource>
haconf -dump -makero
change a resource haconf -makerw
  hares -modify appDG Enabled 1
haconf -dump -makero
Note: list parameters "hares -display <resource>"
change a resource attribute to be global (one value cluster-wide) hares -global <resource> <attribute> <value>
change a resource attribute to be local (per-system values) hares -local <resource> <attribute> <value>
list the parameters of a resource hares -display <resource>
list the resources hares -list  
list the resource dependencies hares -dep
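
Resource dependencies (listed with "hares -dep" above) are created with hares -link; for example, making an illustrative mount resource depend on its disk group so it onlines only after the disk group:

haconf -makerw
hares -link appMNT appDG    # parent appMNT, child appDG
haconf -dump -makero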

Resource Operations


Online a resource hares -online <resource> [-sys]
Offline a resource hares -offline <resource> [-sys]
display the state of a resource( offline, online, etc) hares -state
display the parameters of a resource hares -display <resource>
Offline a resource and propagate the command to its children hares -offprop <resource> -sys <sys>
Cause a resource agent to immediately monitor the resource hares -probe <resource> -sys <sys>
Clearing a resource (automatically initiates the onlining) hares -clear <resource> [-sys]
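
A typical recovery sequence for a faulted resource might be (resource and system names are illustrative):

hares -clear webip -sys sun1    # clear the FAULTED state
hares -probe webip -sys sun1    # force an immediate monitor cycle
hares -state webip              # confirm the resource state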

Resource Types

Add a resource type hatype -add <type>
Remove a resource type hatype -delete <type>
List all resource types hatype -list
Display a resource type hatype -display <type>
List the resources of a particular type hatype -resources <type>
Display the value of a resource type attribute hatype -value <type> <attr>

Resource Agents


add an agent pkgadd -d . <agent package>
remove an agent pkgrm <agent package>
change an agent n/a
list all ha agents haagent -list  
Display an agent's run-time information, i.e. has it started, is it running? haagent -display <agent_name>
Display agents faults haagent -display |grep Faults

Resource Agent Operations


Start an agent haagent -start <agent_name> [-sys]
Stop an agent haagent -stop <agent_name> [-sys]