Tuesday, August 31, 2010

controlling cpu usage part 6: Creating and Using Processor Sets

Processor sets extend the idea of CPU bindings to a more general relationship. With processor sets, a number of CPUs are collected together into a set. These CPUs are effectively fenced off from the rest of the system: normal threads cannot use them. This differs from processor bindings, where the CPUs remain available to non-bound threads.
Processor sets should only be used on legacy systems that already use them. All new installations should use pools, as they offer greater flexibility.

The following example creates an empty processor set, assigns CPU id 0 to the newly created set, then binds the current shell to the newly created set.
We then query the processor set for details of the bound processes before removing the bindings.
Finally the processor set itself is deleted.

 # psrset -c
 created processor set 1

 # psrset -a 1 0
 processor 0: was not assigned , now 1

 # psrset -b 1 $$
 process id 18219: was not bound, now 1

 # psrset
 user processor set 1: processor 0

 # psrset -Q 1
 process id 18329: 1
 process id 18219: 1

 # psrset -U
 # psrset -Q 1
 # psrset
 user processor set 1: processor 0

 # psrset -d 1
 removed processor set 1

 # psrset
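
Before deleting a set it can be useful to know which processes are still bound to it. The following is a minimal sketch, not part of the procedure above: the helper name pids_in_set is made up for illustration, and it only parses lines of the form printed by psrset -Q in the example. On Solaris 10, nawk or /usr/xpg4/bin/awk may be needed in place of awk for the -v option.

```shell
#!/bin/sh
# pids_in_set SET
# Reads "process id <pid>: <set>" lines on stdin (the format shown in the
# psrset -Q output above) and prints the PIDs bound to processor set SET.
# The function name is hypothetical, chosen for this sketch.
pids_in_set() {
    awk -v set="$1" '
        $1 == "process" && $2 == "id" && $4 == set {
            sub(/:$/, "", $3)   # strip the trailing colon from the PID field
            print $3
        }'
}

# Sample input copied from the transcript above.
printf 'process id 18329: 1\nprocess id 18219: 1\n' | pids_in_set 1
```

On a live system the input would come from the command itself, for example psrset -Q 1 | pids_in_set 1.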


Friday, August 27, 2010

controlling cpu usage part 5: Binding a Process to a Processor

Processor binding is the forced locking of a process onto a particular CPU. The nominated process, or threads within a process, are only executed by the specified CPU. All process binding is performed through the pbind command. To bind all the threads in a process, call pbind with the -b option followed by the CPU to bind to and the process ID.


 # psrinfo
 0       on-line   since 06/11/2010 12:18:49
 1       on-line   since 06/11/2010 12:18:51

 # echo $$
 16587

 # pbind -b 1 $$
 process id 16587: was not bound, now 1

 # sh
 # echo $$
 18219

 # pbind -q
 process id 18220: 1
 process id 16587: 1
 process id 18219: 1

All the threads of the specified process are bound. Also, processor bindings are inherited by any new threads or processes, so any child processes are likewise bound to the same CPU.

To remove the bindings for a process, use the -u option to pbind; the -U option removes all bindings.

 # pbind -u 18219
 process id 18219: was 1, now not bound

 # pbind -U
 # pbind -q

Binding a process or a thread to a CPU does not prevent that CPU from being used by other threads.
Binding can, however, be used to limit the maximum amount of CPU that a process, or group of processes, can use to a single CPU.
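
Before binding, it can help to confirm that the target CPU is actually on-line. The sketch below is an illustration only: cpu_is_online is a hypothetical helper, and it assumes the psrinfo output format shown above (CPU id in the first column, state in the second). It treats only the "on-line" state as usable. On Solaris 10, nawk or /usr/xpg4/bin/awk may be needed for the -v option.

```shell
#!/bin/sh
# cpu_is_online CPU_ID
# Reads psrinfo-style lines on stdin and succeeds (exit status 0) only if
# CPU_ID is listed with the state "on-line".
# The function name is hypothetical, chosen for this sketch.
cpu_is_online() {
    awk -v cpu="$1" '
        $1 == cpu && $2 == "on-line" { found = 1 }
        END { exit !found }'
}

# Sample input copied from the psrinfo output above.
sample='0       on-line   since 06/11/2010 12:18:49
1       on-line   since 06/11/2010 12:18:51'

if printf '%s\n' "$sample" | cpu_is_online 1; then
    echo "CPU 1 is on-line"
fi
```

On a live system this would be used as psrinfo | cpu_is_online 1 && pbind -b 1 $$.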


Thursday, August 26, 2010

controlling cpu usage part 4: The Fair Share Scheduler

The Fair Share Scheduler (FSS) is an alternative scheduling class. It is not used by default and must be explicitly enabled. The FSS guarantees that a minimum proportion of the machine's CPU resources is made available to each holder of shares, in proportion to the number of shares held.
The absolute quantity of shares is not important. Any number that is in proportion with the desired CPU entitlement can be used.
To configure projects, the /etc/project file needs to be modified to identify the number of shares to be granted to each project, and the /etc/user_attr file needs to be modified to assign each user to a project.

To define two users, u1 and u2, with u1 having twice the CPU resources of u2, the entries in /etc/user_attr and /etc/project would be similar to the following:



 # egrep 'u[12]' /etc/passwd
 u1:x:1000:1::/export/home/u1:/bin/sh
 u2:x:1001:1::/export/home/u2:/bin/sh

 # egrep 'u[12]' /etc/user_attr
 u1::::type=normal;project=u1
 u2::::type=normal;project=u2

 # egrep 'u[12]' /etc/project
 u1:1000:User 1:u1::project.cpu-shares=(privileged,20,none)
 u2:1001:User 2:u2::project.cpu-shares=(privileged,10,none)
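
The share values translate into entitlements by simple proportion: each project's share count divided by the total shares of all busy projects. The sketch below works this out for the two projects above. It assumes u1 and u2 are the only projects competing for CPU; any other active project holding shares would lower these figures. The function name fss_entitlement is hypothetical.

```shell
#!/bin/sh
# fss_entitlement
# Reads "name shares" pairs on stdin, one per line, and prints each
# project's minimum CPU entitlement as shares / total shares.
# The function name is hypothetical, chosen for this sketch.
fss_entitlement() {
    awk '
        { name[NR] = $1; shares[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s %.1f%%\n", name[i], 100 * shares[i] / total
        }'
}

# The share values from the /etc/project entries above.
printf 'u1 20\nu2 10\n' | fss_entitlement
# prints:
#   u1 66.7%
#   u2 33.3%
```

Using 2 and 1 instead of 20 and 10 gives the same result, which is why the absolute number of shares does not matter.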

To determine the project of the current process the ps command may be used, and the prctl command will show the number of shares.



 # ps -o project= -p $$
 user.root

 # su - u1
 $ ps -o project= -p $$
          u1

 $ prctl -t privileged -n project.cpu-shares -i pid $$
 process: 1444: -sh
 NAME      PRIVILEGED         VALUE       FLAG    ACTION          RECIPIENT
 project.cpu-shares
           privileged            20                               None            -         

To change the scheduling class of a running process you can use the priocntl command.


 #  priocntl -s -c FSS -i pid <pid>      # Change one process (<pid> is its process ID)
 #  priocntl -s -c FSS -i class TS       # Change everything currently in TS
 #  priocntl -s -c FSS -i zoneid 1       # Change all processes in zone ID 1
 #  priocntl -s -c FSS -i pid 1          # Change init (special case)

To examine the shares granted to a process (or zone) use the prctl command.

 # prctl -t privileged -n zone.cpu-shares -i zoneid 1     # Shares for zone ID 1

To modify the number of shares granted to a zone, use the -r option to prctl. This change only lasts until the next reboot.

 # prctl -r -v 10 -t privileged -n zone.cpu-shares -i zoneid 1     # Change number of shares to 10

To change the default scheduling class, so that on the next and subsequent reboots all processes use FSS by default, use the dispadmin command.

 # dispadmin -d FSS


Wednesday, August 25, 2010

controlling cpu usage part 3: Manipulating the dispatch parameter tables

Each scheduling class maintains a set of tables in the kernel. These are used to control aspects of the scheduling class. These tables may be manipulated by the dispadmin command:


 # dispadmin -l
 CONFIGURED CLASSES
 ==================

 SYS     (System Class)
 TS      (Time Sharing)
 FX      (Fixed Priority)
 RT      (Real Time)
 IA      (Interactive)

Changing the Scheduler

Solaris comes with six defined scheduling classes. Four of these are provided for use by user threads: time sharing (TS), interactive (IA), fixed priority (FX) and fair share scheduling (FSS). The other two are system (SYS), for kernel threads, and real-time (RT).
If there are multiple processor sets in use, then each processor set can theoretically use a different scheduling class. This is only practical when using the pool subsystem, which allows the scheduling class to be specified per pool.

Time Sharing/Interactive Scheduling Classes

The time sharing and interactive classes use the same algorithm; the difference between them is that the interactive class attempts to provide a slight boost to the foreground process.
The two classes provide a table which has entries for:
  • quantum - length of the time slice, in units of 1/RES seconds (milliseconds when RES=1000)
  • tqexp - priority to change thread to when quantum expired
  • slpret - priority to change thread to when returning from a sleep
  • maxwait - maximum number of seconds to wait for CPU before changing priority
  • lwait - priority to change thread to when maxwait expired


 # dispadmin -g -c TS
 # Time Sharing Dispatcher Configuration
 RES=1000

 # ts_quantum  ts_tqexp  ts_slpret  ts_maxwait ts_lwait  PRIORITY LEVEL
       200         0        50           0        50        #     0
       200         0        50           0        50        #     1
       200         0        50           0        50        #     2
       200         0        50           0        50        #     3
       200         0        50           0        50        #     4
       200         0        50           0        50        #     5
       200         0        50           0        50        #     6
       200         0        50           0        50        #     7
       200         0        50           0        50        #     8
       200         0        50           0        50        #     9
 ...
       160         0        51           0        51        #    10
       160         1        51           0        51        #    11
       160         2        51           0        51        #    12
       160         3        51           0        51        #    13
       160         4        51           0        51        #    14
 ...
        40        40        58           0        59        #    50
        40        41        58           0        59        #    51
        40        46        58           0        59        #    56
        40        47        58           0        59        #    57
        40        48        58           0        59        #    58
        20        49        59       32000        59        #    59
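
The quantum column is easier to read once it is converted using the RES line: each value is a count of 1/RES seconds, so with RES=1000 the figures are milliseconds. The following sketch does the conversion for any resolution. The helper name ts_quanta_ms is made up for illustration, and it assumes the table layout shown above (RES= line, then rows ending in "# level").

```shell
#!/bin/sh
# ts_quanta_ms
# Reads a TS/IA dispatch table (as printed by dispadmin -g) on stdin and
# prints the time quantum of each priority level in milliseconds.
# The function name is hypothetical, chosen for this sketch.
ts_quanta_ms() {
    awk '
        /^[ \t]*RES=/ { split($1, kv, "="); res = kv[2] + 0; next }
        # Data rows start with a number; the level is the last field.
        $1 ~ /^[0-9]+$/ && res > 0 {
            printf "priority %d: %d ms\n", $NF, $1 * 1000 / res
        }'
}

# Two rows taken from the table above.
printf 'RES=1000\n\n 200 0 50 0 50 # 0\n 20 49 59 32000 59 # 59\n' | ts_quanta_ms
# prints:
#   priority 0: 200 ms
#   priority 59: 20 ms
```

On a live system the input would be dispadmin -g -c TS | ts_quanta_ms. The output shows the pattern in the table: low priorities get long time slices, high priorities get short ones.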

To change the dispatch parameter table for the TS and IA classes, create a new table in a file and load this file into the running kernel:


 # dispadmin -c TS -g > new_table
 # ( edit new_table )
 # dispadmin -c TS -s new_table

The new table takes effect immediately; no reboot is required. However, the change only lasts until the next reboot. To make it effective on subsequent boots, dispadmin -c TS -s new_table has to be run from an initialization script on each boot. It is recommended that this script runs after the single-user milestone is reached, so that the system can still be booted to single-user mode if the table turns out to be incorrect.
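
A boot-time script for this could look like the sketch below. It is an assumption-laden example rather than a tested production script: the table path /etc/ts_dispatch_table is hypothetical, and installing the script as a legacy rc script (for example under /etc/rc2.d, which runs after single-user) is one possible choice, not the only one.

```shell
#!/bin/sh
# Load a saved TS dispatch table at boot.
# TABLE is a hypothetical location; DISPADMIN can be overridden for testing.
TABLE=${TABLE:-/etc/ts_dispatch_table}
DISPADMIN=${DISPADMIN:-/usr/sbin/dispadmin}

load_ts_table() {
    # Do nothing if no table has been installed, so boot is unaffected.
    [ -r "$TABLE" ] || return 0
    $DISPADMIN -c TS -s "$TABLE"
}

# Legacy rc scripts are called with "start" at boot.
case "${1:-}" in
start)
    load_ts_table
    ;;
esac
```

Removing or renaming the table file from single-user mode is then enough to fall back to the default table on the next boot.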


Monday, August 23, 2010

controlling cpu usage part 2: CPU Usage Limit in the Shell

The shell ulimit command can be used to check or set the CPU limit for any subsequently created children and their descendants. The -t option to ulimit sets the amount of CPU time, in seconds, that a process may use before the kernel sends it a SIGXCPU signal. The default is unlimited (maximum CPU time).


 # su - user
 user $ ulimit -t
 unlimited
 user $ sh
 user $ ulimit -t
 unlimited
 user $ ulimit -t 10
 user $ date; while : ; do : ; done; date
 Friday, 5 September 2008  3:34:56 PM EST
 Cpu Limit Exceeded  (core dumped)
 user $
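
To limit a single command without changing the limit of the login shell, the ulimit call can be made inside a subshell. The following is a minimal sketch of that idea; run_with_cpu_limit is a hypothetical helper name, and the signal number reported at the end differs between operating systems.

```shell
#!/bin/sh
# run_with_cpu_limit SECONDS COMMAND [ARGS...]
# Runs COMMAND with a CPU-time limit. The limit is set in a subshell, so
# the calling shell keeps its own (possibly unlimited) setting.
# The function name is hypothetical, chosen for this sketch.
run_with_cpu_limit() {
    limit=$1
    shift
    (
        ulimit -t "$limit" || exit 125
        exec "$@"
    )
}

# A busy loop that would never finish on its own; allow it 1 CPU second.
run_with_cpu_limit 1 sh -c 'while : ; do : ; done'
status=$?

# A status above 128 means the command was terminated by a signal.
if [ "$status" -gt 128 ]; then
    echo "command was killed by a signal (exit status $status)"
fi
```

The limit counts CPU time consumed, not elapsed time, so a process that mostly sleeps can run far longer than the limit suggests.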




Friday, August 20, 2010

controlling cpu usage part 1: Introduction

CPU usage can be controlled in a number of different ways. The possible choices as of Solaris 10 5/08 are:
  • We can set a CPU usage limit in the shell
  • We can manipulate the dispatch parameter kernel tables
  • We can use different schedulers, such as the FSS (Fair Share Scheduler)
  • We can bind a process to a CPU
  • We can use processor sets
  • We can create pools, which combine scheduler changes and processor sets
  • We can set a capped-cpu resource control for Solaris containers (zones)
In the coming weeks I will discuss these options in a little detail, in the hope that you can improve performance or better tune aspects of your environment.


Monday, August 16, 2010

solaris: Description of all services

A quick tip for all Solaris 10/OpenSolaris users… some companies have a strict requirement to know exactly what each and every startup script does on their system. With releases of Solaris 9 and earlier, one would check the rc scripts. This is time consuming and may not give an accurate description or one liner. Solaris 10/OpenSolaris makes things much easier…

Solaris 10/OpenSolaris now uses services for all the Sun supplied start up scripts, but still supports the old (legacy) rc scripts too. A summary of all services can be obtained by running: svcs -o FMRI,DESC. This will produce output similar to the following:



 # svcs -o FMRI,DESC
  FMRI                                               DESC
  lrc:/etc/rc2_d/S00set-tmp-permissions              -
  lrc:/etc/rc2_d/S07set-tmp-permissions              -
  lrc:/etc/rc2_d/S10lu                               -
  lrc:/etc/rc2_d/S20sysetup                          -
  lrc:/etc/rc2_d/S40llc2                             -
  lrc:/etc/rc2_d/S42ncakmod                          -
  lrc:/etc/rc2_d/S70nddconfig                        -
  lrc:/etc/rc2_d/S72autoinstall                      -
  lrc:/etc/rc2_d/S73cachefs_daemon                   -
  lrc:/etc/rc2_d/S81dodatadm_udaplt                  -
  lrc:/etc/rc2_d/S89bdconfig                         -
  lrc:/etc/rc2_d/S91afbinit                          -
  lrc:/etc/rc2_d/S91gfbinit                          -
  lrc:/etc/rc2_d/S91ifbinit                          -
  lrc:/etc/rc2_d/S91jfbinit                          -
  lrc:/etc/rc2_d/S91kfbinit                          -
  lrc:/etc/rc2_d/S91zuluinit                         -
  lrc:/etc/rc2_d/S94ncalogd                          -
  lrc:/etc/rc2_d/S95lwact                            -
  lrc:/etc/rc2_d/S95nbclient                         -
  lrc:/etc/rc2_d/S98deallocate                       -
  lrc:/etc/rc2_d/S99sneep                            -
  lrc:/etc/rc3_d/S16boot_server                      -
  lrc:/etc/rc3_d/S50apache                           -
  lrc:/etc/rc3_d/S52imq                              -
  lrc:/etc/rc3_d/S84appserv                          -
  svc:/system/fpsd:default                           FP Scrubber - Online Floating Point Unit Test
  svc:/system/svc/restarter:default                  master restarter
  svc:/network/pfil:default                          packet filter
  svc:/network/tnctl:default                         trusted networking templates
  svc:/network/loopback:default                      loopback network interface
  svc:/system/installupdates:default                 system update installer
  svc:/system/filesystem/root:default                root file system mount
  svc:/system/scheduler:default                      default scheduling class configuration
  svc:/system/boot-archive:default                   check boot archive content
  svc:/network/physical:default                      physical network interfaces
  svc:/system/identity:node                          system identity (nodename)
  svc:/system/filesystem/usr:default                 read/write root file systems mounts
  svc:/system/keymap:default                         keyboard defaults
  svc:/network/ipfilter:default                      IP Filter
  svc:/system/device/local:default                   Standard Solaris device configuration.
  svc:/system/filesystem/minimal:default             minimal file system mounts
  svc:/system/rmtmpfiles:default                     remove temporary files
  svc:/system/resource-mgmt:default                  Global zone resource management settings
  svc:/system/coreadm:default                        system-wide core file configuration
  svc:/system/name-service-cache:default             name service cache
  svc:/system/identity:domain                        system identity (domainname)
  svc:/system/cryptosvc:default                      cryptographic services
  svc:/system/sysevent:default                       system event notification
  svc:/system/device/fc-fabric:default               Solaris FC fabric device configuration.
  svc:/network/ipsec/ipsecalgs:default               IPsec algorithm initialization
  svc:/milestone/devices:default                     device configuration milestone
  svc:/system/picl:default                           platform information and control
  svc:/network/ipsec/policy:default                  IPsec policy initialization
  svc:/milestone/network:default                     Network milestone
  svc:/system/pkgserv:default                        Flush package command database to disk
  svc:/application/print/ppd-cache-update:default    ppd cache update
  svc:/network/initial:default                       initial network services
  svc:/system/manifest-import:default                service manifest import
  svc:/network/service:default                       layered network services
  svc:/system/patchchk:default                       Launcher for Automatic Patching services
  svc:/network/dns/client:default                    DNS resolver
  svc:/milestone/name-services:default               name services milestone
  svc:/network/iscsi/initiator:default               -
  svc:/milestone/single-user:default                 single-user milestone
  svc:/platform/sun4v/efdaemon:default               embedded FCode interpreter
  svc:/system/filesystem/local:default               local file system mounts
  svc:/network/shares/group:default                  Share Group
  svc:/system/cron:default                           clock daemon (cron)
  svc:/network/shares/group:zfs                      Share Group
  svc:/system/sysidtool:net                          sysidtool
  svc:/system/boot-archive-update:default            update boot archive if necessary
  svc:/network/routing-setup:default                 Initial routing-related configuration.
  svc:/network/ntp:default                           Network Time Protocol (NTP)
  svc:/network/rpc/bind:default                      RPC bindings
  svc:/application/psncollector:default              Product Serial Number Collector
  svc:/system/sysidtool:system                       sysidtool
  svc:/milestone/sysconfig:default                   Basic system configuration milestone
  svc:/system/sac:default                            SAF service access controller
  svc:/system/postrun:default                        Postponed package postinstall command execution
  svc:/network/inetd:default                         inetd
  svc:/system/utmp:default                           utmpx monitoring
  svc:/system/console-login:default                  Console login
  svc:/system/dumpadm:default                        system crash dump configuration
  svc:/network/ssh:default                           SSH server
  svc:/system/system-log:default                     system log
  svc:/application/management/seaport:default        net-snmp SNMP daemon
  svc:/network/smtp:sendmail                         sendmail SMTP mail transfer agent
  svc:/network/sendmail-client:default               sendmail SMTP client queue runner
  svc:/system/fmd:default                            Solaris Fault Manager
  svc:/network/rpc/rstat:default                     kernel statistics server
  svc:/network/rpc/smserver:default                  removable media management
  svc:/network/cde-spc:default                       CDE subprocess control
  svc:/network/bpcd/tcp:default                      bpcd
  svc:/network/vnetd/tcp:default                     vnetd
  svc:/network/vopied/tcp:default                    vopied
  svc:/network/bpjava-msvc/tcp:default               bpjava-msvc
  svc:/system/filesystem/volfs:default               Volume Management filesystem
  svc:/application/management/sma:default            net-snmp SNMP daemon
  svc:/milestone/multi-user:default                  multi-user milestone
  svc:/milestone/multi-user-server:default           multi-user plus exports milestone
  svc:/application/stosreg:default                   Service Tag OS Registry Inserter
  svc:/system/zones:default                          Zones autoboot and graceful shutdown
  svc:/system/basicreg:default                     
  svc:/application/sthwreg:default                   Hardware Service Tag Collector
  svc:/application/print/ipp-listener:default        Internet Print Protocol Listening Service
  svc:/application/print/rfc1179:default             BSD print protocol adapter
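
For an audit, the entries that still need manual investigation are the ones without a description: the legacy rc scripts and any service whose DESC column is "-" or empty. The sketch below filters those out. The helper name undescribed_services is made up for illustration, and it assumes the two-column layout shown above with a header on the first line.

```shell
#!/bin/sh
# undescribed_services
# Reads "svcs -o FMRI,DESC" output on stdin and prints the FMRI of every
# entry whose description is "-" or missing. The header line is skipped.
# The function name is hypothetical, chosen for this sketch.
undescribed_services() {
    awk 'NR > 1 && (NF == 1 || (NF == 2 && $2 == "-")) { print $1 }'
}

# A few lines taken from the listing above.
printf '%s\n' \
    'FMRI                                   DESC' \
    'lrc:/etc/rc2_d/S10lu                   -' \
    'svc:/system/cron:default               clock daemon (cron)' \
    'svc:/system/basicreg:default' |
    undescribed_services
# prints:
#   lrc:/etc/rc2_d/S10lu
#   svc:/system/basicreg:default
```

On a live system the input would be svcs -o FMRI,DESC | undescribed_services.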


Wednesday, August 11, 2010

veritas: Cannot unmount a Locked VxFS filesystem

Storage Foundation 5.0 MP3 introduced a new feature called the VxFS filesystem lock, which prevents accidental unmounts while the file system resource is online. A new umount option, mntunlock, is used to clear the lock and then unmount the filesystem. The offline script for the Mount resource uses this new option.

How to check if the filesystem is locked by VCS:

 # mount -v | grep mntlock


Sometimes it may be necessary to unmount a mount-locked filesystem by hand. This applies to cases where VCS service groups have DiskGroup resources configured with the UnMountVolumes attribute set and the volumes are mounted outside of VCS control (this is not very common).

The filesystem lock exists to prevent accidental unmounts; if the attribute is set to 0, VCS will not lock the filesystem.

The following fix from Symantec for unmounting a locked VxFS filesystem may not work: a bug was found in 5.0 MP3, which may be fixed in future releases.
If the following command does not work, a reboot is required.

Solaris:

 # /opt/VRTS/bin/umount -o mntunlock=VCS /mount-point

If you continue to experience issues such as:

 UX:vxfs umount: ERROR: V-3-21705: mount-point cannot unmount : Device busy

In this case the umount command above has already cleared the mount lock silently, but the lock is still shown in the "mount -v" output, and the file system cannot be unmounted.
Run fuser against the mount point to check for any processes that are still using it:

 # fuser -c /mount-point

If you continue to encounter issues and receive the following error:

 UX:vxfs umount: ERROR: V-3-26365: Incorrect mntlock id  (Invalid argument)

You can attempt to clear the lock using the fsadm command:

 # fsadm -o mntunlock=VCS /mount-point

Trying to clear the Mount Lock using fsadm could also fail.

 UX:vxfs fsadm: ERROR: V-3-26348: file system not mount locked

The workaround is to lock the file system again using fsadm with the same lock name.

 # fsadm -o mntlock=VCS /mount-point

Now the file system can be unmounted successfully with umount.

 # umount -o mntunlock=VCS /mount-point

Please note that if the VxFS file system is disabled, fsadm will not be able to remove the lock. The only way to unmount the disabled file system is to reboot the system.

 Apr 17 13:00:07 alaw2 vxfs: [ID 702911 kern.warning] WARNING: msgcnt 3 mesg 031: V-2-31: vx_disable  - /mount-point file system disabled

 # fsadm -o mntlock=VCS /mount-point
 UX:vxfs fsadm: ERROR: V-3-20275: cannot open /mount-point
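
The unlock-and-retry workaround described above can be condensed into a small helper. This is a sketch under stated assumptions, not a Symantec-supplied procedure: the function name unlock_and_umount and the variables UMOUNT, FSADM and LOCK_KEY are made up for illustration, the fuser check is left out, and it cannot help with a disabled file system.

```shell
#!/bin/sh
# Commands and lock name as used in the examples above; all three
# variables are hypothetical knobs and can be overridden.
UMOUNT=${UMOUNT:-/opt/VRTS/bin/umount}
FSADM=${FSADM:-fsadm}
LOCK_KEY=${LOCK_KEY:-VCS}

# unlock_and_umount MOUNT_POINT
unlock_and_umount() {
    mnt=$1
    # First attempt: clear the lock and unmount in one step.
    if $UMOUNT -o "mntunlock=$LOCK_KEY" "$mnt"; then
        return 0
    fi
    # Workaround from the text: re-apply the lock under the same name,
    # then retry the unlocking unmount.
    $FSADM -o "mntlock=$LOCK_KEY" "$mnt" || return 1
    $UMOUNT -o "mntunlock=$LOCK_KEY" "$mnt"
}
```

It would be called as unlock_and_umount /mount-point. A non-zero return means both attempts failed and the remaining options from the text apply.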



Tuesday, August 10, 2010

veritas: Multi-Link-based IPMP setup with VCS

With Solaris 10 came a nice feature – Multi-Link-based IP Multipathing (IPMP). It determines NIC availability solely on the NIC driver reporting the physical link status – UP or DOWN. Previous versions used “probe-based” IPMP, where connectivity is tested by pinging something on the network from each interface. While probe-based is actually a more thorough test (tests network layer 3 as well as 2), it is much more cumbersome to configure, and you need an extra IP address for each interface for “test” addresses. IMO Multi-Link-based IPMP is sufficient for most applications.
To achieve multi-link-based IPMP, here’s how I’ve configured my MultiNICB resource in this large 10 node clustered environment:
Multi-Link-based IPMP MultiNICB properties

These are the values you must change from the defaults:
  • UseMpathd: 1 - tells VCS to use mpathd for network link status.
  • MpathCommand: /usr/sbin/in.mpathd - be sure to create a symbolic link to /usr/lib/inet/in.mpathd -a if the above does not exist.
  • ConfigCheck: 0 - if you leave this at 1, VCS will overwrite your /etc/hostname.xxx files with a probe-based IPMP configuration; at 0 it leaves them unchanged.
  • Device: (your IPMP interfaces here) - the list of interfaces and their interface aliases. Tick "per System" and add the device and interface alias entry for each IPMP-grouped interface from each host in the cluster.
  • GroupName: do not use your IPMP group name here; it’s not needed. VCS is not monitoring the group, mpathd is.

veritas: Veritas Volume Manager (VxVM) Commands

Here are some links to Basic and Advanced (VxVM) commands for your online storage management enterprise or Storage Area Network (SAN) environments.

Basic VxVM Commands:
Advanced VxVM Commands:

Friday, August 6, 2010

commands: eXtended System Control Facility (XSCF)



Useful XSCF console commands for the Sun SPARC Enterprise M4000/M5000/M8000/M9000 servers:

 XSCF> console -d 0
 XSCF> console -f -d 0
 XSCF> showstatus
 XSCF> showversion -c xcp -v [shows xcp firmware, version, openboot prom version]
 XSCF> showenvironment
 XSCF> showenvironment temp
 XSCF> showenvironment volt
 XSCF> showhardconf
 XSCF> showdcl -va [check domain id...]
 XSCF> showdomainstatus -a
 XSCF> showboards -a
 XSCF> poweron -a [powers up all domains]
 XSCF> poweroff -a [powers off all domains]
 XSCF> poweron -d 0 [powers on domain 0]
 XSCF> poweroff -d 0 [powers off domain 0]
 XSCF> poweroff -f -d 0 [forces a power off domain 0]
 XSCF> reset -d 0 por [resets domain 0]
 XSCF> reset -d 0 xir [resets domain 0 with XIR reset]
 XSCF> sendbreak -d 0 [sends break command to domain 0]
 XSCF> setautologout -s 60 [sets autologout to 60 minutes]
 XSCF> showautologout
 XSCF> shownetwork -a
 XSCF> setnetwork xscf#0-lan#0 -m 255.255.255.0 10.10.10.5
 XSCF> sethostname xscf#0 fire-xscf
 XSCF> sethostname -h host.org
 XSCF> setroute -h host.org
 XSCF> setnameserver 10.10.10.2 10.10.10.3
 XSCF> setroute -c add -n 10.10.10.1 -m 255.255.255.0 xscf#0-lan#
 XSCF> snapshot -L F -t [username]@[hostname]:[directory_to_save_to]


 #. to break from console