DISCLAIMER : Please note that blog owner takes no responsibility of any kind for any type of data loss or damage by trying any of the command/method mentioned in this blog. You may use the commands/method/scripts on your own responsibility.If you find something useful, a comment would be appreciated to let other viewers also know that the solution/method work(ed) for you.
Showing posts with label VCS. Show all posts
Showing posts with label VCS. Show all posts
Veritas Cluster Cheat sheet
VCS is built on three components: LLT, GAB, and VCS itself. LLT handles kernel-to-kernel communication over the LAN heartbeat links, GAB handles shared disk communication and messaging between cluster members, and VCS handles the management of services.
Once cluster members can communicate via LLT and GAB, VCS is started.
In the VCS configuration, each Cluster contains systems, Service Groups, and Resources. Service Groups contain a list of systems belonging to that group, a list of systems on which the Group should
be started, and Resources. A Resource is something controlled or monitored by VCS, like network interfaces, logical IP's, mount point, physical/logical disks, processes, files, etc. Each resource
corresponds to a VCS agent which actually handles VCS control over the resource.
VCS configuration can be set either statically through a configuration file, dynamically through the CLI, or both. LLT and GAB configurations are primarily set through configuration files.
Configuration
VCS configuration is fairly simple. The three configurations to worry about are LLT, GAB, and VCS resources.
LLT
LLT configuration requires two files: /etc/llttab and /etc/llthosts.
llttab contains information on node-id, cluster membership, and heartbeat links. It should look like this:
# llttab -- low-latency transport configuration file
GAB
GAB requires only one configuration file, /etc/gabtab. This file lists the number of nodes in the cluster and also, if there are any communication disks in the system, configuration for them. Ex:
/sbin/gabconfig -c -n2
tells GAB to start GAB with 2 hosts in the cluster.
LLT and GAB
VCS uses two components, LLT and GAB to share data over the private networks among systems.
These components provide the performance and reliability required by VCS.
LLT and GAB files
Gabtab Entries
LLT and GAB Commands
GAB Port Memberbership
Cluster daemons
Cluster Log Files
Starting and Stopping the cluster
Cluster Status
Cluster Details
Users
System Operations
Dynamic Configuration
The VCS configuration must be in read/write mode in order to make changes. When in read/write mode the
configuration becomes stale, a .stale file is created in $VCS_CONF/conf/config. When the configuration is put
back into read only mode the .stale file is removed.
Service Groups
Service Group Operations
Resources
Resource Operations
Resource Types
Resource Agents
Resource Agent Operations
Once cluster members can communicate via LLT and GAB, VCS is started.
In the VCS configuration, each Cluster contains systems, Service Groups, and Resources. Service Groups contain a list of systems belonging to that group, a list of systems on which the Group should
be started, and Resources. A Resource is something controlled or monitored by VCS, like network interfaces, logical IP's, mount point, physical/logical disks, processes, files, etc. Each resource
corresponds to a VCS agent which actually handles VCS control over the resource.
VCS configuration can be set either statically through a configuration file, dynamically through the CLI, or both. LLT and GAB configurations are primarily set through configuration files.
Configuration
VCS configuration is fairly simple. The three configurations to worry about are LLT, GAB, and VCS resources.
LLT
LLT configuration requires two files: /etc/llttab and /etc/llthosts.
llttab contains information on node-id, cluster membership, and heartbeat links. It should look like this:
# llttab -- low-latency transport configuration file
GAB
GAB requires only one configuration file, /etc/gabtab. This file lists the number of nodes in the cluster and also, if there are any communication disks in the system, configuration for them. Ex:
/sbin/gabconfig -c -n2
tells GAB to start GAB with 2 hosts in the cluster.
LLT and GAB
VCS uses two components, LLT and GAB to share data over the private networks among systems.
These components provide the performance and reliability required by VCS.
LLT | LLT (Low Latency Transport) provides fast, kernel-to-kernel comms and monitors network connections. The system admin configures the LLT by creating a configuration file (llttab) that describes the systems in the cluster and private network links among them. The LLT runs in layer 2 of the network stack |
GAB | GAB (Group membership and Atomic Broadcast) provides the global message order required to maintain a synchronised state among the systems, and monitors disk comms such as that required by the VCS heartbeat utility. The system admin configures GAB driver by creating a configuration file ( gabtab). |
LLT and GAB files
/etc/llthosts | The file is a database, containing one entry per system, that links the LLT system ID with the hosts name. The file is identical on each server in the cluster. |
/etc/llttab | The file contains information that is derived during installation and is used by the utility lltconfig. |
/etc/gabtab | The file contains the information needed to configure the GAB driver. This file is used by the gabconfig utility. |
/etc/VRTSvcs/conf/config/main.cf | The VCS configuration file. The file contains the information that defines the cluster and its systems. |
Gabtab Entries
/sbin/gabdiskconf - i /dev/dsk/c1t2d0s2 -s 16 -S 1123
/sbin/gabdiskconf - i /dev/dsk/c1t2d0s2 -s 144 -S 1124 /sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 16 -p a -s 1123 /sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 144 -p h -s 1124 /sbin/gabconfig -c -n2 |
gabdiskconf
|
-i Initialises the disk region -s Start Block -S Signature |
gabdiskhb (heartbeat disks)
|
-a Add a gab disk heartbeat resource -s Start Block -p Port -S Signature |
gabconfig
|
-c Configure the driver for use -n Number of systems in the cluster. |
LLT and GAB Commands
Verifying that links are active for LLT | lltstat -n |
verbose output of the lltstat command | lltstat -nvv | more |
open ports for LLT | lltstat -p |
display the values of LLT configuration directives | lltstat -c |
lists information about each configured LLT link | lltstat -l |
List all MAC addresses in the cluster | lltconfig -a list |
stop the LLT running | lltconfig -U |
start the LLT | lltconfig -c |
verify that GAB is operating | gabconfig -a Note: port a indicates that GAB is communicating, port h indicates that VCS is started |
stop GAB running | gabconfig -U |
start the GAB | gabconfig -c -n <number of nodes> |
override the seed values in the gabtab file | gabconfig -c -x |
GAB Port Memberbership
List Membership | gabconfig -a |
Unregister port f | /opt/VRTS/bin/fsclustadm cfsdeinit |
Port Function | a gab driver b I/O fencing (designed to guarantee data integrity) d ODM (Oracle Disk Manager) f CFS (Cluster File System) h VCS (VERITAS Cluster Server: high availability daemon) o VCSMM driver (kernel module needed for Oracle and VCS interface) q QuickLog daemon v CVM (Cluster Volume Manager) w vxconfigd (module for cvm) |
Cluster daemons
High Availability Daemon | had |
Companion Daemon | hashadow |
Resource Agent daemon | <resource>Agent |
Web Console cluster managerment daemon | CmdServer |
Cluster Log Files
Log Directory | /var/VRTSvcs/log |
primary log file (engine log file) | /var/VRTSvcs/log/engine_A.log |
Starting and Stopping the cluster
"-stale" instructs the engine to treat the local config as stale "-force" instructs the engine to treat a stale config as a valid one |
hastart [-stale|-force] |
Bring the cluster into running mode from a stale state using the configuration file from a particular server | hasys -force <server_name> |
stop the cluster on the local server but leave the application/s running, do not failover the application/s | hastop -local |
stop cluster on local server but evacuate (failover) the application/s to another node within the cluster | hastop -local -evacuate |
stop the cluster on all nodes but leave the application/s running | hastop -all -force |
Cluster Status
display cluster summary | hastatus -summary |
continually monitor cluster | hastatus |
verify the cluster is operating | hasys -display |
Cluster Details
information about a cluster | haclus -display |
value for a specific cluster attribute | haclus -value <attribute> |
modify a cluster attribute | haclus -modify <attribute name> <new> |
Enable LinkMonitoring | haclus -enable LinkMonitoring |
Disable LinkMonitoring | haclus -disable LinkMonitoring |
Users
add a user | hauser -add <username> |
modify a user | hauser -update <username> |
delete a user | hauser -delete <username> |
display all users | hauser -display |
System Operations
add a system to the cluster | hasys -add <sys> |
delete a system from the cluster | hasys -delete <sys> |
Modify a system attributes | hasys -modify <sys> <modify options> |
list a system state | hasys -state |
Force a system to start | hasys -force |
Display the systems attributes | hasys -display [-sys] |
List all the systems in the cluster | hasys -list |
Change the load attribute of a system | hasys -load <system> <value> |
Display the value of a systems nodeid (/etc/llthosts) | hasys -nodeid |
Freeze a system (No offlining system, No groups onlining) | hasys -freeze [-persistent][-evacuate] Note: main.cf must be in write mode |
Unfreeze a system ( reenable groups and resource back online) | hasys -unfreeze [-persistent] Note: main.cf must be in write mode |
Dynamic Configuration
The VCS configuration must be in read/write mode in order to make changes. When in read/write mode the
configuration becomes stale, a .stale file is created in $VCS_CONF/conf/config. When the configuration is put
back into read only mode the .stale file is removed.
Change configuration to read/write mode | haconf -makerw |
Change configuration to read-only mode | haconf -dump -makero |
Check what mode cluster is running in | haclus -display |grep -i 'readonly' 0 = write mode 1 = read only mode |
Check the configuration file | hacf -verify /etc/VRTSvcs/conf/config Note: you can point to any directory as long as it has main.cf and types.cf |
convert a main.cf file into cluster commands | hacf -cftocmd /etc/VRTSvcs/conf/config -dest /tmp |
convert a command file into a main.cf file | hacf -cmdtocf /tmp -dest /etc/VRTSvcs/conf/config |
Service Groups
add a service group | haconf -makerw hagrp -add groupw hagrp -modify groupw SystemList sun1 1 sun2 2 hagrp -autoenable groupw -sys sun1 haconf -dump -makero |
delete a service group | haconf -makerw hagrp -delete groupw haconf -dump -makero |
change a service group | haconf -makerw hagrp -modify groupw SystemList sun1 1 sun2 2 sun3 3 haconf -dump -makero Note: use the "hagrp -display <group>" to list attributes |
list the service groups | hagrp -list |
list the groups dependencies | hagrp -dep <group> |
list the parameters of a group | hagrp -display <group> |
display a service group's resource | hagrp -resources <group> |
display the current state of the service group | hagrp -state <group> |
clear a faulted non-persistent resource in a specific grp | hagrp -clear <group> [-sys] <host> <sys> |
Change the system list in a cluster | # remove the host hagrp -modify grp_zlnrssd SystemList -delete <hostname> # add the new host (don't forget to state its position) hagrp -modify grp_zlnrssd SystemList -add <hostname> 1 # update the autostart list hagrp -modify grp_zlnrssd AutoStartList <host> <host> |
Service Group Operations
Start a service group and bring its resources online | hagrp -online <group> -sys <sys> |
Stop a service group and takes its resources offline | hagrp -offline <group> -sys <sys> |
Switch a service group from system to another | hagrp -switch <group> to <sys> |
Enable all the resources in a group | hagrp -enableresources <group> |
Disable all the resources in a group | hagrp -disableresources <group> |
Freeze a service group (disable onlining and offlining) | hagrp -freeze <group> [-persistent] note: use the following to check "hagrp -display <group> | grep TFrozen" |
Unfreeze a service group (enable onlining and offlining) | hagrp -unfreeze <group> [-persistent] note: use the following to check "hagrp -display <group> | grep TFrozen" |
Enable a service group. Enabled groups can only be brought online | haconf -makerw hagrp -enable <group> [-sys] haconf -dump -makero Note to check run the following command "hagrp -display | grep Enabled" |
Disable a service group. Stop from bringing online | haconf -makerw hagrp -disable <group> [-sys] haconf -dump -makero Note to check run the following command "hagrp -display | grep Enabled" |
Flush a service group and enable corrective action. | hagrp -flush <group> -sys <system> |
Resources
add a resource | haconf -makerw hares -add appDG DiskGroup groupw hares -modify appDG Enabled 1 hares -modify appDG DiskGroup appdg hares -modify appDG StartVolumes 0 haconf -dump -makero |
delete a resource | haconf -makerw hares -delete <resource> haconf -dump -makero |
change a resource | haconf -makerw hares -modify appDG Enabled 1 haconf -dump -makero Note: list parameters "hares -display <resource>" |
change a resource attribute to be globally wide | hares -global <resource> <attribute> <value> |
change a resource attribute to be locally wide | hares -local <resource> <attribute> <value> |
list the parameters of a resource | hares -display <resource> |
list the resources | hares -list |
list the resource dependencies | hares -dep |
Resource Operations
Online a resource | hares -online <resource> [-sys] |
Offline a resource | hares -offline <resource> [-sys] |
display the state of a resource( offline, online, etc) | hares -state |
display the parameters of a resource | hares -display <resource> |
Offline a resource and propagate the command to its children | hares -offprop <resource> -sys <sys> |
Cause a resource agent to immediately monitor the resource | hares -probe <resource> -sys <sys> |
Clearing a resource (automatically initiates the onlining) | hares -clear <resource> [-sys] |
Resource Types
Add a resource type | hatype -add <type> |
Remove a resource type | hatype -delete <type> |
List all resource types | hatype -list |
Display a resource type | hatype -display <type> |
List a partitcular resource type | hatype -resources <type> |
Change a particular resource types attributes | hatype -value <type> <attr> |
Resource Agents
add a agent | pkgadd -d . <agent package> |
remove a agent | pkgrm <agent package> |
change a agent | n/a |
list all ha agents | haagent -list |
Display agents run-time information i.e has it started, is it running ? | haagent -display <agent_name> |
Display agents faults | haagent -display |grep Faults |
Resource Agent Operations
Start an agent | haagent -start <agent_name>[-sys] |
Stop an agent | haagent -stop <agent_name>[-sys] |