Wednesday, July 28, 2010

commands: /usr/ucb/ps

/usr/bin/ps and /usr/ucb/ps are both wrappers that call the platform-specific ps. /usr/ucb/ps is the BSD ("University of California, Berkeley") ps, and /usr/bin/ps is the SVR4 ps. If you compare the ps binaries that are *really* run, you'll see they are different:

 # ls -l /usr/bin/sparcv9/ps /usr/ucb/sparcv9/ps
 -r-xr-xr-x   1 root     bin        38464 Jan 18  2003 /usr/bin/sparcv9/ps
 -r-sr-xr-x   1 root     sys        28592 Jan 18  2003 /usr/ucb/sparcv9/ps


It's there for historical reasons. SunOS 4.x was based on BSD Unix. Solaris 2.x (= SunOS 5.x) was based on System V, with a number of commands having different syntax and behavior. To ease the transition, the /usr/ucb directory was created to hold the incompatible BSD versions.
People who really wanted BSD could put /usr/ucb before /usr in their PATH. The /usr/ucb/ps command can be very useful when troubleshooting web applications or similar Java-based Unix components. It helps anyone who wants to know how an application was actually launched, which may include a long list of configuration parameters: seeing the full command line, including all arguments, is invaluable for configuration troubleshooting.

If you run:

 # /usr/ucb/ps -e

or

 # /usr/ucb/ps -axe 

(the -e flag), you can actually see which environment variables are set for every process that's running. And since you know the userid of the process from your ps output, you have a pretty good chance of knowing what environment variables are set in that user's shell. Some could come from the script they're running, but most are simply inherited from the user's environment.

Further system and performance information can also be presented using the following command:

 # /usr/ucb/ps -auxww

The w switch selects a wide output format, that is, 132 columns rather than 80. If the option letter is repeated (-ww), the output is arbitrarily wide. This setting determines how much of each long command line is printed.
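The practical difference is easy to see with a long Java command line. The sketch below uses a simulated sample command line (an assumption for illustration; on a real system this text would come from /usr/ucb/ps -auxww) to show how an 80-column limit hides the arguments that -auxww reveals:

```shell
# Simulated long command line (invented for illustration; on Solaris this
# would come from /usr/ucb/ps -auxww output).
line='java -Xms512m -Xmx2048m -Dcom.example.config=/opt/app/conf/server.xml -Dcom.example.log=/var/log/app.log com.example.Main'

# Default behavior: output clipped to an 80-column terminal,
# losing the tail of the argument list.
echo "$line" | cut -c1-80

# /usr/ucb/ps -auxww behavior: arbitrarily wide, nothing clipped.
echo "$line"
```

Note how the main class name and the second -D option only appear in the untruncated form.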


Monday, July 26, 2010

fmadm: Solaris Fault Manager Defined

Fault management allows system software to send telemetry data to the fmd(1M) daemon, which diagnoses the problem and takes action (e.g., offlining a faulty component and logging an error with FMRI/UUID information to syslog) based on the type of event received. The diagnosis phase is controlled by a set of diagnosis engines, which can be viewed with the fmadm(1M) utility's "config" option:

 # fmadm config
 MODULE                   VERSION STATUS  DESCRIPTION
 cpumem-diagnosis         1.6     active  CPU/Memory Diagnosis
 cpumem-retire            1.1     active  CPU/Memory Retire Agent
 disk-transport           1.0     active  Disk Transport Agent
 eft                      1.16    active  eft diagnosis engine
 etm                      1.2     active  FMA Event Transport Module
 fabric-xlate             1.0     active  Fabric Ereport Translater
 fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
 io-retire                1.0     active  I/O Retire Agent
 snmp-trapgen             1.0     active  SNMP Trap Generation Agent
 sp-monitor               1.0     active  Service Processor Monitor
 sysevent-transport       1.0     active  SysEvent Transport Agent
 syslog-msgs              1.0     active  Syslog Messaging Agent
 zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
 zfs-retire               1.0     active  ZFS Retire Agent

If the fault manager daemon (fmd) detects a fault, it will log a detailed message to syslog, and update the fault manager error and fault logs. The contents of these logfiles can be viewed with the fmdump(1m) utility:

 # fmdump -v
 TIME                 UUID                                 SUNW-MSG-ID
 fmdump: /var/fm/fmd/fltlog is empty


 # fmdump -e -v
 TIME                 CLASS                                 ENA
 fmdump: /var/fm/fmd/errlog is empty

If a device is diagnosed as faulty, this will be indicated in the fmadm(1m) “faulty” output:


 # fmadm faulty
 STATE RESOURCE / UUID
 -------- ---------------------------------------------------------

The fault management daemon (fmd) keeps track of service events and numerous pieces of key statistical data. This information can be accessed and printed with the fmstat(1m) utility: 


 # fmstat
 module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
 cpumem-diagnosis         0       0  0.0    0.1   0   0     0     0   3.0K      0
 cpumem-retire            0       0  0.0    0.0   0   0     0     0    12b      0
 disk-transport           0       0  0.0    1.6   0   0     0     0    40b      0
 eft                      0       0  0.0    0.2   0   0     0     0   925K      0
 etm                      0       0  0.0    0.0   0   0     0     0   8.2K   144b
 fabric-xlate             0       0  0.0    0.1   0   0     0     0      0      0
 fmd-self-diagnosis     309       0  0.0    0.0   0   0     0     0      0      0
 io-retire                0       0  0.0    0.0   0   0     0     0      0      0
 snmp-trapgen             0       0  0.0    0.0   0   0     0     0      0      0
 sp-monitor               0       0  0.0   46.9   0   0     0     0    24b      0
 sysevent-transport       0       0  0.0   27.2   0   0     0     0      0      0
 syslog-msgs              0       0  0.0    0.0   0   0     0     0    32b      0
 zfs-diagnosis            8       0  0.0    0.9   0   0     0     0      0      0
 zfs-retire               0       0  0.0    0.0   0   0     0     0      0      0
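A quick way to spot which modules are actually receiving events is to filter the fmstat output with awk. The snippet below replays a few rows of the sample output above (stubbed with printf, purely so the pipeline can be traced anywhere; on a live system you would pipe fmstat itself):

```shell
# Replay of a few fmstat sample rows (stubbed for illustration).
fmstat_sample() {
  printf '%s\n' \
    'module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz' \
    'fmd-self-diagnosis     309       0  0.0    0.0   0   0     0     0      0      0' \
    'zfs-diagnosis            8       0  0.0    0.9   0   0     0     0      0      0' \
    'zfs-retire               0       0  0.0    0.0   0   0     0     0      0      0'
}

# Skip the header, print modules that have received at least one event.
# On a live system: fmstat | awk 'NR>1 && $2>0 {print $1, $2}'
fmstat_sample | awk 'NR>1 && $2>0 {print $1, $2}'
```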

To clear FMA faults and error logs on Solaris:

Show faults in FMA: 
 
 # fmadm faulty

For each fault listed in the 'fmadm faulty' output, run:

 # fmadm repair event-ID

Clear error reports and resource cache:

 # cd /var/fm/fmd
 # rm e* f* c*/eft/* r*/*

Reset the fmd serd modules:

 # fmadm config
 # fmadm reset cpumem-diagnosis
 # fmadm reset cpumem-retire
 # fmadm reset eft
 # fmadm reset io-retire
 # fmadm config
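The per-module resets above can be generalized by driving `fmadm reset` from the `fmadm config` listing. In the sketch below, `fmadm` is stubbed with sample output so the loop can be traced anywhere, and the reset command is only echoed; on a real system you would drop the stub and execute the reset:

```shell
# Stub emulating `fmadm config` output (sample data, illustration only).
fmadm() {
  printf '%s\n' \
    'MODULE                   VERSION STATUS  DESCRIPTION' \
    'cpumem-diagnosis         1.6     active  CPU/Memory Diagnosis' \
    'eft                      1.16    active  eft diagnosis engine' \
    'io-retire                1.0     active  I/O Retire Agent'
}

# Reset every active module; on a real system run `fmadm reset "$m"`
# instead of echoing it.
fmadm config | awk 'NR>1 && $3=="active" {print $1}' | while read -r m; do
  echo "fmadm reset $m"
done
```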

Reset or refresh any disabled modules:

 # fmadm config
(Check and confirm missing module)

Check that the fmd service is online:

 # svcs -a fmd

Check whether you have a disabled module checkpoint under:

 # ls /var/fm/fmd/ckpt

Clear the faulted / disabled module via:

 # fmadm repair fmd:///module/module-name

Restore and activate the disabled module:

 # svcadm disable -st fmd
 # cd /var/fm/fmd/ckpt
 # mv module-name save.module-name
 # svcadm enable fmd

Confirm the disabled module is now active:

 # fmadm config

If you are interested in learning more about this amazingly cool technology, you can check out the following resources: 

Sun's Fault Management Presentation


fmadm: logadm: Failed To Rotate Errlog


 logadm: Warning: command failed: /bin/sh -c /usr/sbin/fmadm -q rotate errlog && mv /var/fm/fmd/errlog.0- /var/fm/fmd/errlog.0
 fmadm: failed to rotate errlog: log file is too busy to rotate (try again later)

As you can see, fmadm is actually being used to rotate the log "live" (while still in use), but as the error suggests, the /var/fm/fmd/errlog file is too busy to rotate.
In fact, if this has been going on for more than a few days, you may notice that the file is growing considerably fast (a few bytes per second).
I propose changing the /etc/logadm.conf entry to something along the lines of:

 /var/fm/fmd/errlog -M '/usr/sbin/svcadm disable fmd && sleep 15 && mv $file $nfile && /usr/sbin/svcadm enable fmd' -N -P 'Thu Mar 27 10:10:00 2008' -s 10m

(The reason "fmadm -q rotate" is not used to rotate the log is that it depends on the fmd service, which we stop.)
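The disable/move/enable sequence from the proposed logadm entry can be sketched as a small script. Here svcadm is stubbed (and the sleep shortened) so the flow can be exercised outside Solaris; on a real system remove the stub:

```shell
# Stub for svcadm so the sequence can run anywhere (illustration only).
svcadm() { echo "svcadm $*"; }

# Rotate a busy log the way the proposed logadm entry does:
# stop the writer, wait briefly, rename, restart the writer.
rotate_busy_log() {
  file=$1 nfile=$2
  svcadm disable fmd && sleep 1 && mv "$file" "$nfile" && svcadm enable fmd
}

# Example run against a scratch file standing in for errlog.
tmp=$(mktemp -d)
echo data > "$tmp/errlog"
rotate_busy_log "$tmp/errlog" "$tmp/errlog.0"
```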

 

Thursday, July 22, 2010

scripts: Find Large Files

A helpful script for whenever you need to locate large files starting from any location on the server you're running it from.
The script will prompt you with a few questions beforehand, then run and display the output on screen. The output is separated into two sections: the top section displays "Newer big files" and the bottom displays "Older big files". The listing is in 'ls -ld' format and is shown on screen as well as directed to a temporary file. Click Here:

Wednesday, July 21, 2010

syslog: Real-Time Email-Notification For Critical syslog Events

One of the big advantages of syslog is the separation between the log request and the logging action.
The following steps will show how to:

   1. Write all critical events to a local logfile
   2. Log this to the console
   3. Send an email notification of the event. 

In this example, we'll take all critical messages written from all facilities and (in addition to logging) send them to the mail recipient, named@example.com. 





First, create a log file for critical messages, for example:

 # touch /var/adm/critmessages
 # chmod 600 /var/adm/critmessages


Next, configure syslog to log all critical messages from all facilities to this log file by adding the following statement to your syslog.conf file.

One thing to note when editing /etc/syslog.conf: you can't use spaces, you must use tabs.


 *.crit                             /var/adm/critmessages


The final step is to mail out any messages that are written to the file. You can do this with a simple shell script; I've included an example below. Let's call it /usr/bin/syslogMailer:


 #!/bin/bash
 # syslogMailer: a script to read stdin and turn each line into an alert
 # email; typically this is used to read a file (or named pipe) written
 # to by syslog.
 #
 #   example usage: syslogMailer < /var/adm/critmessages
 #

 hostname=`hostname`
 alertRecipient="named@example.com"      # the mail recipient for alerts
 TMOUT=1                                 # don't wait > 1 second for input

 # process each line of input and produce an alert email
 while read line
 do
    # skip syslog's "message repeated" duplicates
    echo "${line}" | grep "message repeated" > /dev/null 2>&1
    if test $? -eq 1
    then
       # send the alert, then truncate the logfile so the same
       # message isn't mailed again on the next run
       echo "${line}" | mailx -s "Critical Error on syslog | $hostname" ${alertRecipient}
       cat /dev/null > /var/adm/critmessages
    fi
 done
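The de-duplication test at the heart of the loop is easy to verify in isolation. The snippet below feeds the filter two sample lines (invented for illustration) and shows that only the non-repeated one would trigger a mail:

```shell
# Two sample syslog lines (invented for illustration).
printf '%s\n' \
  'Jul 21 10:00:01 host su: [ID 810491 auth.crit] su root failed' \
  'Jul 21 10:00:05 host last message repeated 3 times' |
while read line; do
  # Same test the mailer uses: alert only when the line is NOT a repeat.
  if ! echo "$line" | grep -q "message repeated"; then
    echo "ALERT: $line"
  fi
done
```

Only the first line produces an ALERT; the "message repeated" suppression line is dropped.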

This allows you to schedule it in cron to run, say, once a minute, with an entry like:

 * * * * * /usr/bin/syslogMailer < /var/adm/critmessages > /dev/null

After changing the configuration, make sure that you restart syslogd; on Solaris 10 Update 1 or newer you'd use svcadm restart svc:/system/system-log:default.

Test it by simply running the following logger command.

 logger -p auth.crit "test"




syslog: Syslog Defined

When a program sends a message to syslog, it's given to the syslog daemon (syslogd) and then routed according to the /etc/syslog.conf
configuration file. By default, most of the useful output from syslog is placed into /var/adm/messages.

When messages are sent to syslog they are given a "facility" (what's sending the message) and a "level" (how important it is). If you look in /etc/syslog.conf you'll see a variety of facility.level pairs, and for each of these a destination for that type of message. The facility can be named (daemon, kern, ...) or can be a wildcard (*). The level must be named (alert, crit, notice, ...), but a given level also selects every level more severe than it, so if you use the "notice" level you'll get crits as well, for instance.
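That "a level also matches everything more severe" rule can be sketched with the standard syslog severity ordering:

```shell
# Severities from most to least severe (standard syslog ordering).
levels="emerg alert crit err warning notice info debug"

# A selector like "notice" matches its own level and everything
# before it in this list.
selector="notice"
for l in $levels; do
  echo "matched: $l"
  [ "$l" = "$selector" ] && break
done
```

So a "notice" selector matches emerg, alert, crit, err, warning, and notice, but not info or debug.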

Of the facilities and levels, note that two are special, the wildcard (*) facility and the "none" level. Using the wildcard for the facility means all of them. The "none" severity level isn't useful on its own, but can be handy when creating compound statements for several facilities and levels.
 
So lets look at some sample lines from /etc/syslog.conf:

                 *.err;kern.debug;daemon.notice;mail.crit                /var/adm/messages

So this line passes along anything at error level or more severe, all kernel messages (debug or more severe), daemon notices or more severe, and critical or more severe mail messages, and routes these into the /var/adm/messages file. If we wanted to put all mail messages into a log file named "/var/adm/maillog" we could use something like "mail.debug /var/adm/maillog".

Putting log messages into a file is handy, but we can also send those messages across the network to a centralized syslog server. Simply use "@mysyslogserver" instead of a filename. So, "auth.notice @logserver" will send all authorization notices (or more severe) to the syslog daemon running on a system called "logserver".

                 auth.notice                @logserver

Syslog on Solaris is set up by default to accept messages both locally and over the network. Regardless of how a message comes into the syslogd daemon, it's routed according to the syslog.conf configuration. So you could put "*.notice @syslogcentral" into the syslog.conf of each of your clients and "*.notice /var/adm/centralized_messages" in the syslog.conf of syslogcentral and wamo bamo, you'd have a centralized syslog infrastructure!

One thing to note when editing /etc/syslog.conf: you can't use spaces, you must use tabs. Also, after changing the configuration, make sure that you restart syslogd; on Solaris 10 Update 1 or newer you'd use svcadm restart svc:/system/system-log:default.



Tuesday, July 20, 2010

veritas: Basic VCS Commands

Here is a link to a Symantec education quick reference guide to Basic VCS Commands

Basic VCS Commands

syslog: Logging su Attempts & Failed Logins

From a security perspective when configuring system logging, I like to configure the syslogd daemon to monitor the following.

I like to log each time a user logs into my systems, as well as all attempts to su to another user. To log all su attempts, the file /var/adm/sulog can be created (in recent releases of Solaris, this file is created by default):

 # touch /var/adm/sulog

To log all successful and unsuccessful logins, you will first need to set the following variable in /etc/default/login:

 SYSLOG_FAILED_LOGINS=0

Once the variable is adjusted, you will need to create a log file to store the login attempts:

 # touch /var/adm/loginlog


After the log file is created, the auth priority needs to be added to /etc/syslog.conf:

 auth.debug /var/adm/loginlog
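The steps above can be sketched as one small helper. The file-editing function takes the path as a parameter so the logic can be tried on a scratch copy rather than the live /etc/default/login (paths otherwise per this post):

```shell
# Set SYSLOG_FAILED_LOGINS=0 in a login(1) defaults file; the path is an
# argument so the edit can be exercised safely on a copy.
enable_failed_login_logging() {
  f=$1
  if grep -q '^#*SYSLOG_FAILED_LOGINS=' "$f"; then
    sed 's/^#*SYSLOG_FAILED_LOGINS=.*/SYSLOG_FAILED_LOGINS=0/' "$f" > "$f.tmp" &&
      mv "$f.tmp" "$f"
  else
    echo 'SYSLOG_FAILED_LOGINS=0' >> "$f"
  fi
}

# Example against a scratch copy standing in for /etc/default/login.
tmp=$(mktemp)
echo '#SYSLOG_FAILED_LOGINS=5' > "$tmp"
enable_failed_login_logging "$tmp"
```

On the real system you would run it against /etc/default/login, then create /var/adm/loginlog and add the auth.debug line as shown above.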


With the loginlog and sulog files in place, it is relatively easy to see who accessed a given system at time X, and who tried to become the super user.

One thing to note when editing /etc/syslog.conf: you can't use spaces, you must use tabs. Also, after changing the configuration, make sure that you restart syslogd; on Solaris 10 Update 1 or newer you'd use svcadm restart svc:/system/system-log:default.



Saturday, July 17, 2010

crontab: Crontab Fields

How many times do admins forget the field order of the crontab file and reference the man pages over and over?
Make your life easy. Just put the field definitions in your crontab file
and comment the lines out (#) so cron ignores them.

 # minute (0-59),
 # |    hour (0-23),
 # |    |       day of the month (1-31),
 # |    |       |       month of the year (1-12),
 # |    |       |       |       day of the week (0-6 with 0=Sunday).
 # |    |       |       |       |       commands
   3    2       *       *       0,6     /some/command/to/run
   3    2       *       *       1-5     /another/command/to/run




crontab: Are my Cron Jobs Running Fine?

How can I tell if my cron jobs are running OK, since they don't produce any output?
Check the following files:

 /var/cron/log  : cron history information
 /var/cron/olog : cron moves the log file here if it exceeds the system ulimit

 The file looks something like this: 

 ! *** cron started *** pid = 260 Tue Jun 4 00:30:56 2002
 >  CMD: [ -x /usr/sbin/rtc ] && /usr/sbin/rtc -c > /dev/null 2>&1
 >  root 429 c Tue Jun  4 02:01:00 2002
 <  root 429 c Tue Jun  4 02:01:00 2002 rc=1
 >  CMD: /usr/sbin/logadm
 >  root 440 c Tue Jun  4 03:10:00 2002
 <  root 440 c Tue Jun  4 03:10:00 2002
 >  CMD: [ -x /usr/lib/gss/gsscred_clean ] && /usr/lib/gss/gsscred_clean
 >  root 452 c Tue Jun  4 03:30:00 2002
 <  root 452 c Tue Jun  4 03:30:00 2002



It provides the following:
  • The CMD line is the command that was run
  • The next entry (>) is the time the job started
  • The next (<) is the time the job finished
  • The rc is the return code from the job (shown when non-zero)
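Those conventions make the log easy to scan mechanically. The snippet below replays a few of the sample entries (via printf, so it runs anywhere) and flags any job whose completion line carries a non-zero rc:

```shell
# Replay of sample /var/cron/log entries, filtered for failed jobs.
# Completion lines start with '<' and carry rc=N only on failure.
printf '%s\n' \
  '>  root 429 c Tue Jun  4 02:01:00 2002' \
  '<  root 429 c Tue Jun  4 02:01:00 2002 rc=1' \
  '>  root 440 c Tue Jun  4 03:10:00 2002' \
  '<  root 440 c Tue Jun  4 03:10:00 2002' |
awk '/^</ && /rc=/ {print "FAILED: pid", $3, $NF}'
```

On a live system you would run the awk filter against /var/cron/log itself.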

Friday, July 16, 2010

scripts: PCP - Show Open Ports & PIDs on Solaris

PCP is a very useful security and administration script that can help you quickly find the processes (PIDs) that have a particular TCP port open, the TCP ports opened by a specific PID, or even list all the TCP ports open by all PIDs running on your system.

The PCP script works on Solaris 10/9/8 and can be downloaded from the following link. Click Here:

PIDs for TCP Port:

Run PCP with the “-p” option to show the PIDs of processes that have a given TCP port (say, port 22) open:


 # ./pcp.sh -p 22
 PID     Process Name and Port
 -------------------------------------------------
 5455    /usr/lib/ssh/sshd 22
 sockname: AF_INET6 :: port: 22
 -------------------------------------------------

The output above shows, for instance, that sshd (PID 5455) has TCP port 22 open.


TCP Ports open by PIDs:

Run PCP with the “-P” option to show the TCP ports opened by a specific PID.

For instance, here I try to find the TCP ports open by PID 18805

 # ./pcp.sh -P 18805
 PID     Process Name and Port
 -------------------------------------------------
 18805   /usr/lib/gnome-netstatus-applet
 sockname: AF_INET6 ::  port: 32809
 sockname: AF_INET 0.0.0.0  port: 32810
 sockname: AF_INET 127.0.0.1  port: 32823
 sockname: AF_INET 0.0.0.0  port: 0
 -------------------------------------------------


PIDs for all open TCP Ports:

Use the “-a” option to list all open TCP ports with their PIDs:

 # ./pcp.sh -a
 PID     Process Name and Port
 -------------------------------------------------
 39      /sbin/dhcpagent
 sockname: AF_INET 0.0.0.0  port: 68
 sockname: AF_INET6 ::  port: 546
 sockname: AF_INET 127.0.0.1  port: 4999
 sockname: AF_INET 127.0.0.1  port: 4999
 sockname: AF_INET 192.168.0.8  port: 68
 -------------------------------------------------
 73      /usr/lib/firefox/firefox-bin
 sockname: AF_INET 127.0.0.1  port: 4999
 -------------------------------------------------
 1219    /usr/lib/nfs/statd
 sockname: AF_INET 0.0.0.0  port: 0
 -------------------------------------------------
 3224    /usr/sadm/lib/smc/bin/smcboot
 sockname: AF_INET 127.0.0.1  port: 5987
 sockname: AF_INET 127.0.0.1  port: 898
 sockname: AF_INET 127.0.0.1  port: 5988
 -------------------------------------------------
 3225    /usr/sadm/lib/smc/bin/smcboot
 sockname: AF_INET 127.0.0.1  port: 32773
 -------------------------------------------------




Thursday, July 15, 2010

Wednesday, July 14, 2010

acct: Unix System Accounting Disabled

On various hardware platforms I have seen system accounting consume a large amount of system resources. If you are considering completely removing system accounting, this is my preferred and easy method.

How to Permanently Disable System Accounting:
  1. Become superuser.
  2. Edit the adm crontab file and delete the entries for the ckpacct, runacct, and monacct programs:

 # EDITOR=/usr/bin/vi; export EDITOR
 # crontab -e adm

  3. Edit the root crontab file and delete the entry for the dodisk program:

 # crontab -e

  4. Remove the startup script for run level 2:

 # unlink /etc/rc2.d/S22acct

  5. Remove the stop script for run level 0:

 # unlink /etc/rc0.d/K22acct

  6. Stop the accounting program:

 # /etc/init.d/acct stop
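The non-interactive commands above can be collected into one dry-run sketch. Every command is only echoed here (and the crontab edits are left manual), so the sequence can be reviewed anywhere before piping it to a shell on a real Solaris box:

```shell
# Dry-run sketch of the non-interactive steps; review the output, then
# pipe to sh on Solaris once satisfied. Crontab entries for ckpacct,
# runacct, monacct, and dodisk must still be removed by hand.
disable_acct() {
  echo 'unlink /etc/rc2.d/S22acct'
  echo 'unlink /etc/rc0.d/K22acct'
  echo '/etc/init.d/acct stop'
}
disable_acct
```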


acct: Unix System Accounting Enabled Pt 5

Project Accounting: 
If access to accounting source code can be obtained, another useful modification to the accounting system is to retain the group id (gid) component of the pacct data when it is summarized and converted to tacct data. In the PROCESS state, the acctprc command summarizes process accounting (pacct) data into total accounting (tacct) data. In this process the group id is stripped out of the data, which leads to a loss of information in the tacct file. A modification of acctdef.h, acctprc.c, and acctmerg.c to retain the gid in the tacct structure, with the corresponding summarizing of data by user and by group, would provide project accounting capabilities in Unix accounting. (The author is working with several vendors to adopt this project accounting modification.)
Billing and Security Auditing Capabilities: 
When Unix accounting is enabled, additional auditing capabilities are available if the pacct* and wtmpx files are preserved from deletion by the standard accounting process. If you have preserved the pacct files and a user complains about a specific charge he may have incurred by running on your system, a more detailed report about resource consumption can be generated with the acctcom command. This same information can be used as clues for a suspected security intrusion. Output 1 is a sample accounting report generated by acctcom from preserved Solaris pacct data.

If the wtmpx file is preserved you can use this information to provide additional clues as to how long a suspected intruder has been lurking about your system. If a security intrusion is detected on a specific user's account (victor, for example) and the host where the suspected intruder comes from remains constant (unknown.fake.edu, for example), you can use the last command and the preserved wtmpx files to determine how long the suspected intrusion has been occurring. Output 2 is a sample session that uses the preserved daily wtmpx records to generate a report of all the login activity for user victor.

acct: Unix System Accounting Enabled Pt 4

Useful runacct Modifications: 
runacct generates several default reports, summarizes accounting data into a few binary data files, and removes several raw data input files. Each of these processes should be reviewed by the system administrator for usefulness to the site. If a component of runacct is determined not to be useful to a site, it should be removed or commented out. If modifications are made to accounting scripts, it is recommended that a copy of the original script be retained for comparison and debugging purposes. Modifications can easily be maintained by a source code management system, such as RCS.

Preserve Raw Data: 
The most important and useful modification to runacct is the preservation of the raw pacct and wtmpx data files. These files provide the audit trail for both billing and security. It is standard procedure for runacct to remove all pacct files by the completion of the script. These files, if preserved from deletion and archived, can provide supporting documentation for billing inquiries and can provide clues for suspected security intrusions. It is recommended to keep and archive the raw input pacct* and wtmpx files for security and accounting system auditing purposes. The user exit capability (see the USEREXIT state) in the runacct script can be used to accomplish this archiving task. A sample user exit script, called runacct.local, is provided in Listing 1.

Mail Recipient Configuration: 
Another useful accounting modification is changing the recipient list for mail sent by the runacct script. This can easily be done by creating an environment variable called _maillist and calling mailx ${_maillist} instead of the default mailx root adm. Add the following line in the environment-variable section:

_maillist="root adm"

Replace all instances of

mailx root adm

with

mailx ${_maillist}

Of course, now change the mail recipient list to those users you want notified of an error in runacct.
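The substitution itself is a one-liner. The sketch below applies it to a scratch file standing in for runacct (the path is a parameter purely for safety, so the edit can be tried without touching the real script):

```shell
# Replace the hard-coded recipients with the ${_maillist} variable in a
# copy of the accounting script; takes the file path as an argument.
use_maillist() {
  sed 's/mailx root adm/mailx ${_maillist}/g' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example against a scratch file containing a typical mailx call.
tmp=$(mktemp)
echo 'echo "$_errmsg" | mailx root adm' > "$tmp"
use_maillist "$tmp"
```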
USEREXIT Environment: 
Additionally, the USEREXIT state has two deficiencies. First, the environment of the runacct script is not passed when the /usr/lib/acct/runacct.local script is called. Second, the exit status of the runacct.local script is not checked. The following runacct changes are recommended. Change

USEREXIT)
        # "any installation dependant accounting programs should be run here"
        [ -s /usr/lib/acct/runacct.local ] && /usr/lib/acct/runacct.local

        echo "CLEANUP" > ${_statefile}
        ;;

to

USEREXIT)
        # "any installation dependant accounting programs should be run here"
        if [ -s /usr/lib/acct/runacct.local ]
        then
                . /usr/lib/acct/runacct.local
                if [ ${?} -ne 0 ]
                then
                        _errmsg="\n\n***** Accounting error with runacct.local *******\n\n\n"
                        (date ; echo "$_errmsg" ) | logger -p daemon.err
                        echo "$_errmsg" | mailx adm root
                        echo "ERROR: problem with runacct.local, run aborted" >> ${_active}
                        rm -f ${_nite}/lock*
                        exit 1
                fi
        fi

        echo "CLEANUP" > ${_statefile}
        ;;

Listing 1 is an example runacct.local script that preserves the pacct* and wtmpx files in /var/adm/acct/sum/YYYYMMDD, where YYYYMMDD represents the four-digit year, month, and day of the accounting run. This new subdirectory is created by the script and helps with the organization and manageability of these files.
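In the same spirit as Listing 1 (which is not reproduced here), a minimal archiving user exit might look like the sketch below. The accounting root is passed as a parameter so the copy logic can be tried outside /var/adm; on Solaris the root would be /var/adm:

```shell
# Minimal runacct.local-style archiver: copy the raw pacct* and wtmpx
# files into <root>/acct/sum/YYYYMMDD. Root is a parameter for safe
# testing; on a real system it would be /var/adm.
archive_raw_acct() {
  root=$1
  dest="$root/acct/sum/$(date +%Y%m%d)"
  mkdir -p "$dest" || return 1
  for f in "$root"/pacct* "$root"/wtmpx; do
    [ -f "$f" ] && cp -p "$f" "$dest/"
  done
  return 0
}

# Example against a scratch directory with fake raw files.
tmp=$(mktemp -d)
touch "$tmp/pacct" "$tmp/pacct1" "$tmp/wtmpx"
archive_raw_acct "$tmp"
```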

acct: Unix System Accounting Enabled Pt 3

Upon the completion of a successful execution of runacct, several reports and binary data files are created and several raw data files are removed. The following is a brief description of the action performed in each runacct state. 

Table 1 provides the following information in tabular format. The CONNECT state creates lineuse, reboots, and ctacct files from the wtmp file.
The lineuse and reboots files provide the history of login connections and system reboots, respectively. 

The ctacct file contains this login and reboot information in total accounting (tacct) format.
The PROCESS state creates the ptacct total accounting file from the process accounting data (pacct* files). The total accounting file generated in this state is a summary by user of pacct data. 

The MERGE state combines the ptacct and ctacct files to create the daytacct total accounting file which contains all of the accounting data available in summarized form. (Files in tacct format can be processed by the acctmerg command.) If the system administrator has configured fee and disk accounting then these files will be processed into the daytacct total accounting file by the FEE and DISK states. 

The MERGETACCT state processes the current days total accounting data into another tacct file called /var/adm/acct/sum/tacct.MMDD (where MMDD is the month and day of the accounting run). 

This tacct file is a running total of all daily tacct files which will be used by the monthly accounting script monacct. The CMS state uses the acctcms command to summarize the process accounting data by command into cms files. One file is this data in binary format and the other is in ascii format. 

The CMS state also creates the last login report by using the lastlogin command which shows the last time a user logged in. 

The USEREXIT state will run the /usr/lib/acct/runacct.local script if it exists and has a size greater than zero. An example USEREXIT script is given in Listing 1. 

The CLEANUP state generates the final report using the prdaily report, removes all unnecessary data files, removes the lock files, reports the completion of the accounting run, and terminates. 

acct: Unix System Accounting Enabled Pt 2


Accounting Data Flow:
The Figure to the right provides an overview of the flow of data in the accounting subsystem available with Solaris. Accounting data starts out as process data contained in the Unix kernel data structures. If accounting is turned on, then upon termination of each process, the kernel writes a process accounting (pacct) record which is appended to /var/adm/pacct. The pacct file format is described by the acct structure contained in /usr/include/sys/acct.h. pacct files can be processed outside of the daily periodic accounting run with the acctcom command.

wtmp and wtmpx are valuable files used by the accounting subsystem process. wtmp and wtmpx keep track of information about interactive logins, their tty's and originating hosts. The wtmp format is described by the utmp structure contained in /usr/include/utmp.h. The wtmpx format is described by the utmpx structure contained in /usr/include/utmpx.h. The information contained in the wtmpx file can be processed outside of the daily periodic accounting run with the last command.

The ckpacct script should be executed by adm's cron; it checks both the size of the /var/adm/pacct file and the availability of space in the file system that contains /var/adm/pacct. If the pacct file is over the size threshold contained in the script, the pacct file is rolled over to /var/adm/pacctN, where N is the next unused number in sequence for the pacct file set, starting with 1. If the file system has less space than the threshold described in the script, ckpacct will turn the accounting system off by executing turnacct off. It is recommended to run ckpacct from adm's cron once per hour.
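The rollover naming described above (pacct becomes pacct1, then pacct2, ...) can be sketched as a tiny helper that picks the next unused suffix; the directory is a parameter so the logic can be exercised anywhere:

```shell
# Pick the next unused pacctN name in a directory, following the
# rollover numbering ckpacct uses (pacct1 is the first rolled file).
next_pacct() {
  dir=$1
  n=1
  while [ -f "$dir/pacct$n" ]; do
    n=$((n+1))
  done
  echo "$dir/pacct$n"
}

# Example: with pacct1 and pacct2 already present, the next name is pacct3.
tmp=$(mktemp -d)
touch "$tmp/pacct" "$tmp/pacct1" "$tmp/pacct2"
next_pacct "$tmp"
```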

The daily periodic accounting run is accomplished by executing the runacct script. The runacct script should be run daily by adm's cron. An example of the crontab entry for runacct was provided in the Quick Start section. runacct processes and generates reports of the data contained in the pacct and wtmp files. runacct is a bourne shell script made up of eleven restartable states. The states are

* SETUP
* WTMPFIX
* CONNECT
* PROCESS
* MERGE
* FEES
* DISK
* MERGETACCT
* CMS
* USEREXIT
* CLEANUP

As runacct executes, it logs its progress by writing descriptive messages to /var/adm/acct/nite/active. /usr/adm/acct/nite/statefile and /usr/adm/acct/nite/lastdate contain the last known state and date, respectively, of the runacct script. To prevent execution of runacct for the same accounting period lock files are used. The lock files are /var/adm/acct/nite/lock and /var/adm/acct/nite/lock1.

In the event an error is encountered during execution, a message is logged on the console, mail is sent to the administrators (root and adm), and runacct exits. If a state aborts due to an error, review both statefile and active to determine the location and cause of the error. The error can be investigated and repaired with accounting utility programs such as fwtmp and wtmpfix. Accounting can then be restarted at the state where the error was encountered, thereby reducing the overhead involved if accounting had to be restarted from the beginning. To restart runacct it must be invoked with an argument, for example, runacct 1117. The argument should correspond to the date when the error was encountered; this date is also recorded in the lastdate file. The state described by statefile will be the state where runacct restarts execution. This can be overridden by adding the state to the argument list, for example, runacct 1117 USEREXIT. Upon successful completion runacct will write COMPLETE into the statefile.


acct: Unix System Accounting Enabled Pt 1

Unix accounting, when enabled, can provide useful information about who is using your system and their overall resource consumption in basic terms. By implementing the suggested modifications you can magically transform the accounting system into a more useful billing and security auditing subsystem. Use of the user exit provided in the runacct script is quite possibly the most useful modification to the accounting system. The user exit script, runacct.local, should be created and could contain all necessary local file archives and manipulations. By using the information provided by the Unix accounting system with the recommended modifications, one can provide useful reports on system utilization and provide additional audit trails for billing and security inquiries.

Quick Start:
The Unix accounting system is made up of scripts and utility programs, each of which performs a specific function in creating, processing, or reporting accounting data. These are located in /usr/lib/acct. When maintaining the accounting system, the /usr/lib/acct directory should be placed in your path. Additionally, adding yourself to the adm group will provide access to accounting directories and data files without the need to access these by using superuser or adm user privileges (su - root or su - adm).

On a new system, or a system where accounting has been disabled, there are three easy steps to start the accounting system:

1. start process accounting with turnacct on
2. place ckpacct in adm's cron entry

0 * * * * /usr/lib/acct/ckpacct

3. place runacct in adm's cron entry

50 23 * * * /usr/lib/acct/runacct > /var/adm/acct/nite/fd2log 2>&1

It is recommended to perform the above as user adm. Performing them as root will work, but file permissions may not be set properly, which would later require superuser privileges to access accounting files and directories. The accounting system is designed to give adm the proper privileges to perform the accounting functions, including enabling or disabling process accounting in the kernel and setting proper permissions on accounting files, including those created by the runacct script.
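The three steps might be scripted as follows when logged in as adm (a sketch; the cron lines are exactly the ones shown above):

```shell
# 1. enable process accounting in the kernel
/usr/lib/acct/turnacct on

# 2 & 3. append the ckpacct and runacct entries to adm's crontab
TMP=/tmp/adm.cron.$$
crontab -l > "$TMP" 2>/dev/null
cat >> "$TMP" <<'EOF'
0 * * * * /usr/lib/acct/ckpacct
50 23 * * * /usr/lib/acct/runacct > /var/adm/acct/nite/fd2log 2>&1
EOF
crontab "$TMP"
rm -f "$TMP"
```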

To determine if turnacct was successful, check for the existence of the /var/adm/pacct file. As processes are created and complete, the kernel will write a process accounting record to /var/adm/pacct. As you issue commands you should see /var/adm/pacct grow in size. More details about the pacct file, the ckpacct and runacct scripts are discussed in subsequent paragraphs.
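A quick way to confirm process accounting is active, per the paragraph above:

```shell
# The pacct file should exist and grow as commands complete
ls -l /var/adm/pacct
sleep 5
ls -l /var/adm/pacct    # size should have increased

# acctcom prints the accumulated per-process records in readable form
acctcom /var/adm/pacct | tail
```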

If you plan to run Unix accounting on a large system or have requirements of keeping several months of accounting data online, you may want to consider creating a separate file system for storage of accounting files. /var/adm/acct is the top-level directory which contains accounting data files and reports, except for /var/adm/wtmp, /var/adm/wtmpx (these are sometimes found in /etc) and the current process accounting data located in the /var/adm/pacct* files. Creating a separate file system for /var/adm/acct is recommended if you enable Unix accounting.


Tuesday, July 13, 2010

zones: TTYsrch Issues Within Zones

Symptoms: process-related commands such as "ps", "ps -ef", "ptree", "pgrep" etc. were hanging and not completing. This had knock-on effects for utilities and applications which use the ps command, such as VCS, VxVM etc.
The problem occurred only when one or more non-global zones were running on the system.

Root Cause:

Among other things, the ps command traverses the system "device tree" in /dev, to link processes with terminal devices. Such searches are optimized using the /etc/ttysrch file, which is like a PATH file for the /dev directory. If the required device is not found under one of the directories listed in the /etc/ttysrch file, it must do a recursive search of the whole /dev directory, which can take a long time. This is what was happening in the above case.

When non-global zones are running, the relevant terminal device files reside under the /dev/zcons directory, but this entry was missing from the /etc/ttysrch file. Servers freshly installed with Solaris 10 appear to have this entry in their ttysrch file, but on servers upgraded from Solaris 8 to 10 via Live Upgrade the entry is missing. This would appear to be a bug in the Sun Live Upgrade process.
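A one-liner check can confirm whether the entry is missing before applying the fix (a sketch, using the file paths above):

```shell
# Report whether /dev/zcons is listed in the tty search path
if grep '^/dev/zcons' /etc/ttysrch >/dev/null 2>&1; then
    echo "/dev/zcons present in /etc/ttysrch"
else
    echo "/dev/zcons missing - ps will fall back to scanning all of /dev"
fi
```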

Fix:
 
Add the entry "/dev/zcons" to the /etc/ttysrch file. The last few lines of /etc/ttysrch should then look like this:

 #
 /dev/pts
 /dev/term
 /dev/xt
 /dev/zcons

logadm: Log Administration

The logadm utility can handle pretty much any application's log files for us. By using its copy, truncate, and compression features, we aren't forced to restart each application when we cycle the logs. This matters especially for apache, tomcat, and other web applications, whose log files tend to grow too large to maintain and need to be monitored at all times.
logadm ends up giving us a huge amount of control over our log files, without having to write and maintain shell scripts.
 

Every Solaris build already has a root crontab entry for the logadm utility, which reads its configuration from /etc/logadm.conf.

The configuration file ships with entries that monitor and maintain the existing /var/adm system administration log files, e.g. syslog, messages, etc.

The following command keeps 1 old copy of the log, cycles it when it reaches 1 GB, copies and truncates it (without restarting any service), compresses the rotated copy, and writes a matching entry into /etc/logadm.conf:


 logadm -C 1 -s 1g -c -z 0 -w /usr/local/apache2.2/logs/error_log
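After the -w run, the options are recorded in /etc/logadm.conf. The entry should look roughly like the comment below (a sketch; exact option ordering may vary between logadm versions), and the configuration can be checked and exercised:

```shell
# The recorded entry, roughly:
#   /usr/local/apache2.2/logs/error_log -C 1 -c -s 1g -z 0
grep error_log /etc/logadm.conf

# Validate every entry in the configuration file
logadm -V

# Force an immediate rotation of this entry to verify it works
logadm -p now /usr/local/apache2.2/logs/error_log
```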