Wednesday, October 21, 2009

Enable Tracing for WebSphere Plugin Related Issues

The web server plug-in writes a log, named http_plugin.log by default, under PLUGIN_HOME/logs/.
By default the plug-in writes only error messages to this log. The setting that controls this is the LogLevel attribute of the <Log> element in plugin-cfg.xml.
For example:
<Log LogLevel="Error" Name="/usr/IBM/WebSphere/Plugins/logs/http_plugin.log"/>

With the line above, all error messages are written to http_plugin.log.

To enable tracing, change LogLevel to Trace in plugin-cfg.xml:

<Log LogLevel="Trace" Name="/usr/IBM/WebSphere/Plugins/logs/http_plugin.log"/>
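If the change has to be made on several web servers, it can be scripted. The following is a minimal Python sketch, assuming the file contains a <Log> element with a LogLevel attribute as described above. The function name set_plugin_log_level and the SAMPLE document are made up for illustration. Note that ElementTree drops XML comments and the XML declaration when it re-serializes, so back up the real file first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down plugin-cfg.xml used only for illustration.
SAMPLE = """<Config>
  <Log LogLevel="Error" Name="/usr/IBM/WebSphere/Plugins/logs/http_plugin.log"/>
</Config>"""


def set_plugin_log_level(xml_text, level):
    """Return the XML text with every <Log> element's LogLevel set to `level`."""
    root = ET.fromstring(xml_text)
    for log in root.iter("Log"):
        log.set("LogLevel", level)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_plugin_log_level(SAMPLE, "Trace"))
```

Remember to set the level back to Error once the problem has been captured, since the trace log grows quickly.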

From the InfoCenter -
Plug-in Problem Determination Steps
The plug-in provides very readable tracing which can be beneficial in helping to figure out the problem. By setting the LogLevel attribute in the config/plugin-cfg.xml file to Trace, you can follow the request processing to see what is going wrong.
Note: If you are using a Veritas File System with large file support enabled, file sizes up to two terabytes are allowed. In this case, if you set the LogLevel attribute in the plugin-cfg.xml file to LogLevel=Trace, then the http_plugin.log file might grow quickly and consume all available space on your file system. Therefore, you should set the value of the LogLevel attribute to ERROR or DEBUG to prevent high CPU utilization.
At a high level, complete these steps.
The plug-in gets a request.
The plug-in checks the routes defined in the plugin-cfg.xml file.
It finds the server group.
It finds the server.
It picks the transport protocol, HTTP or HTTPS.
It sends the request.
It reads the response.
It writes it back to the client.

Wednesday, October 14, 2009

Weblogic 8.1 debug flags

Enable these WebLogic system-level debug flags with the following JVM option:

-Dweblogic.debug=debug-parm1,debug-parm2,..


Example: For JDBC ConnectionPool debugging use

-Dweblogic.debug=weblogic.JDBCConn,weblogic.JDBCConnStackTrace
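When many flags from the lists below are combined, it is easy to mistype the comma-separated value. A small Python sketch that builds the option string; weblogic_debug_option is a hypothetical helper name, not part of WebLogic:

```python
def weblogic_debug_option(flags):
    """Build a -Dweblogic.debug=... JVM option from a list of flag names."""
    # Strip whitespace, drop empty entries, and remove duplicates in order.
    cleaned = list(dict.fromkeys(f.strip() for f in flags if f.strip()))
    if not cleaned:
        raise ValueError("at least one debug flag is required")
    return "-Dweblogic.debug=" + ",".join(cleaned)


if __name__ == "__main__":
    print(weblogic_debug_option(["weblogic.JDBCConn", "weblogic.JDBCConnStackTrace"]))
```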


IIOP

weblogic.iiop.ots
weblogic.iiop.transport
weblogic.iiop.marshal
weblogic.iiop.startup

JDBC/Data Sources
weblogic.JDBCConn
weblogic.JDBCSQL
weblogic.JDBCConnStackTrace

JMX/Deployment
weblogic.commoAdmin
weblogic.commoProxy
weblogic.deployerRuntime
weblogic.MasterDeployer
weblogic.deployTask
weblogic.deployHelper
weblogic.OamDelta
weblogic.OamVersion
weblogic.slaveDeployer.semaphore
weblogic.slaveDeployer
weblogic.ConfigMBean
weblogic.ConfigMBeanEncrypt
weblogic.ConfigMBeanSetAttribute
weblogic.management.DynamicMBeanImpl
weblogic.management.DynamicMBeanImpl.setget
weblogic.mbeanProxyCache
weblogic.mbeanDelete
weblogic.mbeanQuery
weblogic.MBeanInteropList
weblogic.mbeanProxy
weblogic.registerMBean
weblogic.getMBeanInfo
weblogic.getMBeanAttributes
weblogic.addDependenciesRecursively
weblogic.MBeanListener
weblogic.application
weblogic.deployer
weblogic.appPoller
weblogic.appManager
weblogic.BootstrapServlet
weblogic.fileDistributionServlet

Application Deployment
weblogic.J2EEApplication
weblogic.application
weblogic.appPoller
weblogic.appManager

JTA
weblogic.JTAGateway
weblogic.JTAGatewayStackTrace
weblogic.JTA2PC
weblogic.JTA2PCStackTrace
weblogic.JTAHealth
weblogic.JTAPropagate
weblogic.JTARecovery
weblogic.JTAXA
weblogic.JTAXAStackTrace
weblogic.JTAResourceHealth
weblogic.JTAMigration
weblogic.JTARecoveryStackTrace
weblogic.JTANaming
weblogic.JTATLOG
weblogic.JTALifecycle

EJB
weblogic.ejb.cache.debug
weblogic.ejb.cache.verbose
ejb.enableCacheDump
weblogic.ejb20.cmp.rdbms.debug
weblogic.ejb20.cmp.rdbms.verbose
weblogic.ejb20.persistence.debug
weblogic.ejb20.persistence.verbose
weblogic.ejb20.compliance.debug
weblogic.ejb20.compliance.verbose
weblogic.ejb.deployment.debug
weblogic.ejb.deployment.verbose
weblogic.ejb20.dd.xml
weblogic.ejb.deployer.debug
weblogic.ejb.deployer.verbose
weblogic.ejb.verbose.deployment
weblogic.ejb20.ejbc.debug
weblogic.ejb20.ejbc.verbose
weblogic.ejb.runtime.debug
weblogic.ejb.runtime.verbose
weblogic.ejb20.jms.poll.debug
weblogic.ejb20.jms.poll.verbose
weblogic.ejb20.security.debug
weblogic.ejb20.security.verbose
weblogic.ejb.locks.debug
weblogic.ejb.locks.verbose
weblogic.ejb.bean.manager.debug
weblogic.ejb.bean.manager.verbose
weblogic.ejb.pool.InstancePool.debug
weblogic.ejb.pool.InstancePool.verbose
weblogic.ejb.swap.debug
weblogic.ejb.swap.verbose
weblogic.j2ee.dd.xml

General
weblogic.debug
weblogic.kernel.debug
weblogic.debug.DebugConnection
weblogic.debug.DebugRouting
weblogic.debug.DebugMessaging
weblogic.debug.isLogRemoteExceptionsEnabled
weblogic.StdoutDebugEnabled

WLI
wlc.debug.signature
wli.bpm.client.security.debug
wli.bpm.studio.timeprocessor.debug
wli.bpm.studio.debug
wli.bpm.server.common.timedevent.debug
wli.bpm.server.common.xmltemplate.debug
wli.bpm.server.eventprocessor.addrmsgdebug
wli.bpm.server.eventprocessor.debug
wli.bpm.server.jms.debug
wli.bpm.server.plugin.debug
wli.bpm.server.workflow.debug
wli.bpm.server.businesscalendar.debug
wli.bpm.server.busop.debug
wli.bpm.server.workflow.action.taskduedate.debug
wli.bpm.server.workflow.timedevent.debug
wli.bpm.server.xml.debug
wli.bpm.server.xslt.debug
wli.bpm.server.workflow.start.debug
wli.bpm.server.workflowprocessor.debug
wli.common.server.errorlistener.debug

Messaging Bridge
-Dweblogic.debug.DebugMessagingBridgeStartup=true
-Dweblogic.debug.DebugMessagingBridgeRuntime=true
Two further options, for stdout and stderr:
-Dweblogic.Stdout=
-Dweblogic.Stderr=

SSL
-Dweblogic.security.SSL.verbose=true -Dssl.debug=true

9.x: debug flags are available via the console, plus
web services: -Dweblogic.wsee.verbose=*

6.x,7.x
http://wldj.sys-con.com/read/42733.htm

RMI debug flags
java.rmi.server.logCalls=true
sun.rmi.loader.logLevel=[BRIEF|VERBOSE]
sun.rmi.server.exceptionTrace
sun.rmi.server.logLevel=[BRIEF|VERBOSE]

Sunday, October 11, 2009

ZFS Cheat sheet



$ man zpool
$ man zfs
Get familiar with command structure and options
$ su
Password:
# cd /
# mkfile 100m disk1 disk2 disk3 disk5
# mkfile 50m disk4
# ls -l disk*
-rw------T 1 root root 104857600 Sep 11 12:15 disk1
-rw------T 1 root root 104857600 Sep 11 12:15 disk2
-rw------T 1 root root 104857600 Sep 11 12:15 disk3
-rw------T 1 root root 52428800 Sep 11 12:15 disk4
-rw------T 1 root root 104857600 Sep 11 12:15 disk5
Create some “virtual devices” or vdevs as described in the zpool documentation. These can also be real disk slices if you have them available.
# zpool create myzfs /disk1 /disk2
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 191M 94K 191M 0% ONLINE -
Create a storage pool and check the size and usage.
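For scripted health checks, the zpool list output can be split into fields. A minimal Python sketch, assuming the whitespace-separated column layout shown above; parse_zpool_list and unhealthy_pools are hypothetical helper names.

```python
SAMPLE = """NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 191M 94K 191M 0% ONLINE -"""


def parse_zpool_list(output):
    """Turn `zpool list` output into a list of dicts keyed by column name."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def unhealthy_pools(pools):
    """Return the names of pools whose HEALTH column is not ONLINE."""
    return [p["NAME"] for p in pools if p.get("HEALTH") != "ONLINE"]


if __name__ == "__main__":
    pools = parse_zpool_list(SAMPLE)
    print(pools, unhealthy_pools(pools))
```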
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Get more detailed status of the zfs storage pool.
# zpool destroy myzfs
# zpool list
no pools available
Destroy a zfs storage pool
# zpool create myzfs mirror /disk1 /disk4
invalid vdev specification
use '-f' to override the following errors:
mirror contains devices of different sizes
An attempt to create a mirrored pool from vdevs of different sizes fails. The -f option forces creation, but the mirror only uses the space available on the smallest device.
# zpool create myzfs mirror /disk1 /disk2 /disk3
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 95.5M 112K 95.4M 0% ONLINE -
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
/disk3 ONLINE 0 0 0

errors: No known data errors
Create a mirrored storage pool. In this case, a 3 way mirrored storage pool.
# zpool detach myzfs /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Detach a device from a mirrored pool.
# zpool attach myzfs /disk1 /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:31:49 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
/disk3 ONLINE 0 0 0

errors: No known data errors
Attach a device to a pool. This creates a two-way mirror if the existing device is not already part of a mirror; otherwise it adds the new device to the existing mirror, in this case making it a 3-way mirror.
# zpool remove myzfs /disk3
cannot remove /disk3: only inactive hot spares can be removed
# zpool detach myzfs /disk3
Attempt to remove a device from a pool. In this case it’s a mirror, so we must use “zpool detach”.
# zpool add myzfs spare /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
spares
/disk3 AVAIL

errors: No known data errors
Add a hot spare to a storage pool.
# zpool remove myzfs /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Remove a hot spare from a pool.
# zpool offline myzfs /disk1
# zpool status -v
pool: myzfs
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning
in a degraded state.
action: Online the device using 'zpool online' or replace the device
with 'zpool replace'.
scrub: resilver completed with 0 errors on Tue Sep 11 13:39:25 2007
config:

NAME STATE READ WRITE CKSUM
myzfs DEGRADED 0 0 0
mirror DEGRADED 0 0 0
/disk1 OFFLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Take the specified device offline. No attempt to read or write to the device will take place until it’s brought back online. Use the -t option to temporarily offline a device. A reboot will bring the device back online.
# zpool online myzfs /disk1
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:47:14 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Bring the specified device online.
# zpool replace myzfs /disk1 /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:25:48 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk3 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Replace a disk in a pool with another disk, for example when a disk fails.
# zpool scrub myzfs
Perform a scrub of the storage pool to verify that it checksums correctly. On mirror or raidz pools, ZFS will automatically repair any damage.
WARNING: scrubbing is I/O intensive.
# zpool export myzfs
# zpool list
no pools available
Export a pool from the system for importing on another system.
# zpool import -d / myzfs
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 95.5M 114K 95.4M 0% ONLINE -
Import a previously exported storage pool. If -d is not specified, this command searches /dev/dsk. As we’re using files in this example, we need to specify the directory of the files used by the storage pool.
# zpool upgrade
This system is currently running ZFS pool version 8.

All pools are formatted using this version.
# zpool upgrade -v
This system is currently running ZFS pool version 8.

The following versions are supported:

VER DESCRIPTION
--- --------------------------------------------------------
1 Initial ZFS version
2 Ditto blocks (replicated metadata)
3 Hot spares and double parity RAID-Z
4 zpool history
5 Compression using the gzip algorithm
6 pool properties
7 Separate intent log devices
8 Delegated administration
For more information on a particular version, including supported
releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.
Display the pool format version. The -v flag shows the features supported by the current version. Use the -a flag to upgrade all pools to the latest on-disk version. Pools that are upgraded will no longer be accessible to any systems running older versions.
# zpool iostat 5
capacity operations bandwidth
pool used avail read write read write
---------- ----- ----- ----- ----- ----- -----
myzfs 112K 95.4M 0 4 26 11.4K
myzfs 112K 95.4M 0 0 0 0
myzfs 112K 95.4M 0 0 0 0
Get I/O statistics for the pool
# zfs create myzfs/colin
# df -h
Filesystem kbytes used avail capacity Mounted on
...
myzfs/colin 64M 18K 63M 1% /myzfs/colin
Create a file system and check it with the standard df -h command. By default, file systems are mounted automatically at a path matching the dataset name, here /myzfs/colin. See the Mountpoints section of the zfs man page for more details.
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 139K 63.4M 19K /myzfs
myzfs/colin 18K 63.4M 18K /myzfs/colin
List current zfs file systems.
# zpool add myzfs /disk1
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses mirror and new vdev is file
An attempt to add a single vdev to a mirrored set fails.
# zpool add myzfs mirror /disk1 /disk5
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk3 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk5 ONLINE 0 0 0

errors: No known data errors
Add a mirrored set of vdevs
# zfs create myzfs/colin2
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 172K 159M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin2 18K 159M 18K /myzfs/colin2
Create a second file system. Note that both file systems show 159M available because no quotas are set. Each “could” grow to fill the pool.
# zfs set reservation=20m myzfs/colin
# zfs list -o reservation
RESERV
none
20M
none
Reserve a specified amount of space for a file system ensuring that other users don’t take up all the space.
# zfs set quota=20m myzfs/colin2
# zfs list -o quota myzfs/colin myzfs/colin2
QUOTA
none
20M
Set and view quotas
# zfs set compression=on myzfs/colin2
# zfs list -o compression
COMPRESS
off
off
on
Turn on and verify compression
# zfs set sharenfs=on myzfs/colin2
# zfs get sharenfs myzfs/colin2
NAME PROPERTY VALUE SOURCE
myzfs/colin2 sharenfs on local
Share a filesystem over NFS. There is no need to modify /etc/dfs/dfstab, as the filesystem will be shared automatically on boot.
# zfs set sharesmb=on myzfs/colin2
# zfs get sharesmb myzfs/colin2
NAME PROPERTY VALUE SOURCE
myzfs/colin2 sharesmb on local
Share a filesystem over CIFS/SMB. This will make your ZFS filesystem accessible to Windows users.
# zfs snapshot myzfs/colin@test
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.2M 139M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin2 18K 20.0M 18K /myzfs/colin2
Create a snapshot called test.
# zfs rollback myzfs/colin@test
Rollback to a snapshot.
# zfs clone myzfs/colin@test myzfs/colin3
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.2M 139M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin2 18K 20.0M 18K /myzfs/colin2
myzfs/colin3 0 139M 18K /myzfs/colin3
A snapshot is not directly addressable. A clone must be made. The target dataset can be located anywhere in the ZFS hierarchy, and will be created as the same type as the original.
# zfs destroy myzfs/colin2
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.1M 139M 22K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin3 0 139M 18K /myzfs/colin3
Destroy a filesystem
# zfs destroy myzfs/colin
cannot destroy 'myzfs/colin': filesystem has children
use '-r' to destroy the following datasets:
myzfs/colin@test
Attempt to destroy a filesystem that has a child, in this case the snapshot. We must either remove the snapshot, or make a clone and promote the clone.
# zfs promote myzfs/colin3
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.1M 139M 21K /myzfs
myzfs/colin 0 159M 18K /myzfs/colin
myzfs/colin3 18K 139M 18K /myzfs/colin3
myzfs/colin3@test 0 - 18K -
# zfs destroy myzfs/colin
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 147K 159M 21K /myzfs
myzfs/colin3 18K 159M 18K /myzfs/colin3
myzfs/colin3@test 0 - 18K -
Promote a clone filesystem so that it no longer depends on its “origin” snapshot. This makes the snapshot a child of the cloned filesystem, so we can then delete the original filesystem.
# zfs rename myzfs/colin3 myzfs/bob
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 153K 159M 21K /myzfs
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@test 0 - 18K -
# zfs rename myzfs/bob@test myzfs/bob@newtest
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 146K 159M 20K /myzfs
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@newtest 0 - 18K -
Rename a filesystem, and separately rename the snapshot.
# zfs get all
NAME PROPERTY VALUE SOURCE
myzfs type filesystem -
myzfs creation Tue Sep 11 14:21 2007 -
myzfs used 146K -
myzfs available 159M -
myzfs referenced 20K -
[...]
Display properties for the given datasets. This can be refined further using options.
# zpool destroy myzfs
cannot destroy 'myzfs': pool is not empty
use '-f' to force destruction anyway
Can’t destroy a pool with active filesystems.
# zfs unmount myzfs/bob
# df -h
myzfs 159M 20K 159M 1% /myzfs
Unmount a ZFS file system
# zfs mount myzfs/bob
# df -h
myzfs 159M 20K 159M 1% /myzfs
myzfs/bob 159M 18K 159M 1% /myzfs/bob
Mount a ZFS filesystem. This is usually automatically done on boot.
# zfs send myzfs/bob@newtest | ssh localhost zfs receive myzfs/backup
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 172K 159M 20K /myzfs
myzfs/backup 18K 159M 18K /myzfs/backup
myzfs/backup@newtest 0 - 18K -
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@newtest 0 - 18K -
Create a stream representation of the snapshot and redirect it to zfs receive. In this example I’ve redirected to the localhost for illustration purposes. This can be used to back up to a remote host, or even to a local file.
# zpool history
History for 'myzfs':
2007-09-11.15:35:50 zpool create myzfs mirror /disk1 /disk2 /disk3
2007-09-11.15:36:00 zpool detach myzfs /disk3
2007-09-11.15:36:10 zpool attach myzfs /disk1 /disk3
2007-09-11.15:36:53 zpool detach myzfs /disk3
2007-09-11.15:36:59 zpool add myzfs spare /disk3
2007-09-11.15:37:09 zpool remove myzfs /disk3
2007-09-11.15:37:18 zpool offline myzfs /disk1
2007-09-11.15:37:27 zpool online myzfs /disk1
2007-09-11.15:37:37 zpool replace myzfs /disk1 /disk3
2007-09-11.15:37:47 zpool scrub myzfs
2007-09-11.15:37:57 zpool export myzfs
2007-09-11.15:38:05 zpool import -d / myzfs
2007-09-11.15:38:52 zfs create myzfs/colin
2007-09-11.15:39:27 zpool add myzfs mirror /disk1 /disk5
2007-09-11.15:39:38 zfs create myzfs/colin2
2007-09-11.15:39:50 zfs set reservation=20m myzfs/colin
2007-09-11.15:40:18 zfs set quota=20m myzfs/colin2
2007-09-11.15:40:35 zfs set compression=on myzfs/colin2
2007-09-11.15:40:48 zfs snapshot myzfs/colin@test
2007-09-11.15:40:59 zfs rollback myzfs/colin@test
2007-09-11.15:41:11 zfs clone myzfs/colin@test myzfs/colin3
2007-09-11.15:41:25 zfs destroy myzfs/colin2
2007-09-11.15:42:12 zfs promote myzfs/colin3
2007-09-11.15:42:26 zfs rename myzfs/colin3 myzfs/bob
2007-09-11.15:42:57 zfs destroy myzfs/colin
2007-09-11.15:43:23 zfs rename myzfs/bob@test myzfs/bob@newtest
2007-09-11.15:44:30 zfs receive myzfs/backup
Display the command history of all storage pools. This can be limited to a single pool by specifying its name on the command line. The history is only stored for existing pools. Once you’ve destroyed the pool, you’ll no longer have access to its history.
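The history lines have a fixed timestamp format, so they are easy to process for auditing. A minimal Python sketch, assuming the "YYYY-MM-DD.HH:MM:SS command" layout shown above; parse_zpool_history is a hypothetical helper name.

```python
from datetime import datetime

SAMPLE = """History for 'myzfs':
2007-09-11.15:35:50 zpool create myzfs mirror /disk1 /disk2 /disk3
2007-09-11.15:36:00 zpool detach myzfs /disk3"""


def parse_zpool_history(output):
    """Return (datetime, command) tuples for each timestamped history line."""
    entries = []
    for line in output.splitlines():
        stamp, _, command = line.strip().partition(" ")
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d.%H:%M:%S")
        except ValueError:
            continue  # skip the "History for ..." header and blank lines
        entries.append((when, command))
    return entries


if __name__ == "__main__":
    for when, command in parse_zpool_history(SAMPLE):
        print(when.isoformat(), command)
```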
# zpool destroy -f myzfs
# zpool status -v
no pools available
Use the -f option to destroy a pool that still contains file systems.

Wednesday, October 7, 2009

Solaris performance monitoring commands


iostat
vmstat
netstat

iostat

syntax:

iostat [options] interval count

  • options – select the device class to report on: disk, CPU or terminal (-d, -c, -t, or combined as -tdc). The -x option gives extended statistics.
  • interval – the time in seconds between two samples. iostat 4 reports data every 4 seconds.
  • count – the number of samples to report. iostat 4 5 reports data at 4-second intervals, 5 times.
  • example:

     $ iostat -xtc 5 2
    extended disk statistics tty cpu
    disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id
    sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
    sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
    sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
    sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31

    The fields have the following meanings:
    disk name of the disk
    r/s reads per second
    w/s writes per second
    Kr/s kilobytes read per second
    Kw/s kilobytes written per second
    wait average number of transactions waiting for service (Q length)
    actv average number of transactions actively being serviced (removed from the queue but not yet completed)
    svc_t average service time, in milliseconds
    %w percent of time there are transactions waiting for service (queue non-empty)
    %b percent of time the disk is busy (transactions in progress)

    The values to look at in the iostat output are:

  • Reads/writes per second (r/s, w/s)
  • Percentage busy (%b) (%b > 5 is bad)
  • Service time (svc_t) (svc_t > 30ms is bad)

    If a disk shows consistently high reads/writes, a percentage busy (%b) greater than 5 percent, and an average service time (svc_t) greater than 30 milliseconds, then one of the following actions needs to be taken:

    1. Tune the application to use disk I/O more efficiently by modifying its disk queries and using the cache facilities available in application servers.
    2. Spread the file system across two or more disks using the disk striping feature of a volume manager such as DiskSuite.
    3. Increase the inode cache parameter, ufs_ninode, which is the number of inodes held in memory. Inodes are cached globally (for UFS), not on a per-file-system basis.
    4. Move the file system to a faster disk/controller, or replace the existing disk/controller with a faster one.
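    The busy and service-time thresholds above can be checked automatically. This is a minimal Python sketch, assuming the iostat -x column order shown in the example (disk, r/s, w/s, Kr/s, Kw/s, wait, actv, svc_t, %w, %b). The name slow_disks and the two constants are made up for illustration, and the limits are the rules of thumb from this post, not values defined by Solaris.

```python
BUSY_PCT_LIMIT = 5.0    # %b threshold from the text
SVC_T_LIMIT_MS = 30.0   # svc_t threshold from the text

SAMPLE = """extended disk statistics tty cpu
disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31"""


def slow_disks(iostat_output):
    """Return disks whose %b and svc_t both exceed the limits."""
    flagged = []
    for row in iostat_output.splitlines():
        fields = row.split()
        if len(fields) < 10:
            continue  # title line or malformed row
        try:
            values = [float(f) for f in fields[1:10]]
        except ValueError:
            continue  # column header line
        svc_t, busy = values[6], values[8]
        if busy > BUSY_PCT_LIMIT and svc_t > SVC_T_LIMIT_MS:
            flagged.append(fields[0])
    return flagged


if __name__ == "__main__":
    print(slow_disks(SAMPLE))
```

    A single sample only shows a moment in time; the text asks for consistently high values, so run the check over several intervals before acting.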

    vmstat

    syntax:

    vmstat [options] interval count
  • options – select the type of information needed, such as paging (-p), cache (-c) or interrupts (-i). If no option is specified, information about processes, memory, paging, disk, interrupts and CPU is displayed.
  • interval – the time in seconds between two samples. vmstat 4 reports data every 4 seconds.
  • count – the number of samples to report. vmstat 4 5 reports data at 4-second intervals, 5 times.
  • example:

    $ vmstat 5
    procs memory page disk faults cpu
    r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
    0 0 0 11456 4120 1 41 19 1 3 0 2 0 4 0 0 48 112 130 4 14 82
    0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62
    0 0 1 10132 4616 0 0 20 0 0 0 0 0 19 0 0 150 172 146 3 33 64
    0 0 1 10132 5292 0 0 9 0 0 0 0 0 21 0 0 165 105 130 1 21 78

    procs
    r in run queue
    b blocked for resources I/O, paging etc.
    w swapped

    memory (in Kbytes)
    swap - amount of swap space currently available
    free - size of the free list

    page (in units per second).
    re page reclaims - see -S option for how this field is modified.
    mf minor faults - see -S option for how this field is modified.
    pi kilobytes paged in
    po kilobytes paged out
    fr kilobytes freed
    de anticipated short-term memory shortfall (Kbytes)
    sr pages scanned by clock algorithm

    disk (operations per second).
    There are slots for up to four disks, labeled with a single letter and number.
    The letter indicates the type of disk (s = SCSI, i = IPI, etc). The number is
    the logical unit number.

    faults
    in (non clock) device interrupts
    sy system calls
    cs CPU context switches

    cpu breakdown of percentage usage of CPU time. On multiprocessors this is an average across all processors.
    us user time
    sy system time
    id idle time

    CPU issues:

    The following columns should be watched to determine whether there is a CPU issue:
    1. Processes in the run queue (procs r)
    2. User time (cpu us)
    3. System time (cpu sy)
    4. Idle time (cpu id)
         procs      cpu
    r b w us sy id
    0 0 0 4 14 82
    0 0 1 3 35 62
    0 0 1 3 33 64
    0 0 1 1 21 78
    Problem symptoms:
    1. If the number of processes in the run queue (procs r) is consistently greater than the number of CPUs, the system will slow down because there are more runnable processes than available CPUs.
    2. If this number is more than four times the number of available CPUs, the system is short of CPU power and processes will slow down greatly.
    3. If the idle time (cpu id) is consistently 0 and the system time (cpu sy) is double the user time (cpu us), the system is short of CPU resources.

    Resolving these kinds of issues involves tuning application procedures to use the CPU efficiently and, as a last resort, upgrading the CPUs or adding more CPUs to the system.
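    The three symptoms can be expressed as simple checks over a series of samples. This is a minimal Python sketch: cpu_symptoms is a hypothetical helper, each sample is an (r, us, sy, id) tuple taken from the vmstat columns above, and "consistently" is interpreted here as "in every sample", which is an assumption.

```python
def cpu_symptoms(samples, ncpus):
    """Return the CPU problem symptoms present in every (r, us, sy, id) sample."""
    if not samples:
        return []
    symptoms = []
    if all(r > ncpus for r, us, sy, idle in samples):
        symptoms.append("run queue exceeds CPU count")
    if all(r > 4 * ncpus for r, us, sy, idle in samples):
        symptoms.append("run queue exceeds four times CPU count")
    if all(idle == 0 and sy >= 2 * us for r, us, sy, idle in samples):
        symptoms.append("no idle time and system time at least double user time")
    return symptoms


if __name__ == "__main__":
    # The four samples from the example output above, on a single-CPU machine.
    doc_samples = [(0, 4, 14, 82), (0, 3, 35, 62), (0, 3, 33, 64), (0, 1, 21, 78)]
    print(cpu_symptoms(doc_samples, ncpus=1))
```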

    Memory Issues:

    Memory bottlenecks are indicated by the scan rate (sr), the number of pages scanned by the clock algorithm per second. If the scan rate is continuously over 200 pages per second, there is a memory shortage.
    Resolution:
    1. Tune the applications & servers to make efficient use of memory and cache.
    2. Increase system memory.
    3. Implement priority paging on pre-Solaris 8 versions by adding the line “set priority_paging=1” to /etc/system. Remove this line if upgrading from Solaris 7 to 8 and retaining the old /etc/system file.
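    The scan rate check can be scripted against captured vmstat output. This is a minimal Python sketch, assuming the column header line contains an sr column as in the example above. The name memory_shortage and the constant are made up for illustration, and "continuously" is interpreted as "in every sample".

```python
SCAN_RATE_LIMIT = 200  # pages per second, the rule of thumb from the text

SAMPLE = """procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 11456 4120 1 41 19 1 3 0 2 0 4 0 0 48 112 130 4 14 82
0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62"""


def memory_shortage(vmstat_output, limit=SCAN_RATE_LIMIT):
    """Return True if the sr column exceeds `limit` in every data row."""
    sr_index = None
    rates = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "sr" in fields:
            sr_index = fields.index("sr")  # locate the column from the header
            continue
        if sr_index is None or len(fields) <= sr_index:
            continue
        try:
            rates.append(int(fields[sr_index]))
        except ValueError:
            continue  # not a data row
    return bool(rates) and all(rate > limit for rate in rates)


if __name__ == "__main__":
    print(memory_shortage(SAMPLE))
```

    The first data line of vmstat reports averages since boot, so it is usually worth discarding it before applying the check.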

    netstat

    syntax:

    netstat [option/s]
    Options
    -a - displays the state of all sockets.
    -r - shows the system routing tables
    -i - gives statistics on a per-interface basis.
    -m - displays information from the network memory buffers. On Solaris, this shows statistics for streams
    -p [proto] - retrieves statistics for the specified protocol
    -s - shows per-protocol statistics. (some implementations allow -ss to remove fields with a value of 0 (zero) from the display.)
    -D - display the status of DHCP configured interfaces.
    -n - do not lookup hostnames, display only IP addresses.
    -d - (with -i) displays dropped packets per interface.
    -I [interface] - retrieve information about only the specified interface.
    -v - be verbose
    interval - number of seconds between updates, for continuous display of statistics.

    example:

    $ netstat -rn

    Routing Table: IPv4
    Destination Gateway Flags Ref Use Interface
    -------------------- -------------------- ----- ----- ------ ---------
    192.168.1.0 192.168.1.11 U 1 1444 le0
    224.0.0.0 192.168.1.11 U 1 0 le0
    default 192.168.1.1 UG 1 68276
    127.0.0.1 127.0.0.1 UH 1 10497 lo0
    This shows the output on a Solaris machine whose IP address is 192.168.1.11, with a default router at 192.168.1.1.
    Network availability
    The command above is mostly useful for troubleshooting network accessibility issues. When the outside network is not accessible from a machine, check the following:
    1. The default router IP address is correct.
    2. You can ping the router from your machine.
    3. If the router address is incorrect, it can be changed with the route add command. See man route for more info.
      route command examples:
      $ route add default [hostname]
      $ route add 192.0.2.32 [gateway_name]
    If the router address is correct but you still can’t ping it, there may be a network cable, hub or switch problem, and you have to find and eliminate the faulty component.
    Network Response
    $ netstat -i
    Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
    lo0 8232 loopback localhost 77814 0 77814 0 0 0
    hme0 1500 server1 server1 10658566 3 4832511 0 279257 0
    This option is used to diagnose network problems when there is connectivity but the response is slow.
    Values to look at:
  • Collisions (Collis)
  • Output packets (Opkts)
  • Input errors (Ierrs)
  • Input packets (Ipkts)

    These values are used to work out the following rates.

    Network collision rate:

    Network collision rate = Output collision counts / Output packets
    A network-wide collision rate greater than 10 percent indicates one of:
  • Overloaded network,
  • Poorly configured network,
  • Hardware problems.

    Input packet error rate:

    Input Packet Error Rate = Ierrs / Ipkts
    If the input error rate is high (over 0.25 percent), the host is dropping packets. Hubs, switches, cables etc. need to be checked for potential problems.
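    Both rates can be computed directly from one line of netstat -i output. This is a minimal Python sketch, assuming the column order shown in the example (Name, Mtu, Net/Dest, Address, Ipkts, Ierrs, Opkts, Oerrs, Collis, Queue). The name interface_rates is made up for illustration.

```python
def interface_rates(netstat_i_line):
    """Compute collision and input error rates (in percent) for one interface line."""
    fields = netstat_i_line.split()
    ipkts, ierrs, opkts, oerrs, collis = (int(f) for f in fields[4:9])
    return {
        "name": fields[0],
        # Collision rate = Collis / Opkts, expressed as a percentage.
        "collision_pct": 100.0 * collis / opkts if opkts else 0.0,
        # Input error rate = Ierrs / Ipkts, expressed as a percentage.
        "input_error_pct": 100.0 * ierrs / ipkts if ipkts else 0.0,
    }


if __name__ == "__main__":
    line = "hme0 1500 server1 server1 10658566 3 4832511 0 279257 0"
    rates = interface_rates(line)
    print(rates)
    print("collision problem:", rates["collision_pct"] > 10)
    print("input error problem:", rates["input_error_pct"] > 0.25)
```

    For the hme0 line in the example, the collision rate works out to a little under 6 percent and the input error rate is far below 0.25 percent, so neither threshold is crossed.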

    Network socket & TCP connection state

    netstat gives important information about network socket and TCP state. This is very useful for finding open, closed and waiting TCP connections.
    The connection states returned by netstat are the following:
    LISTEN ---- Listening for incoming connections.
    SYN_SENT ---- Actively trying to establish connection.
    SYN_RECEIVED ---- Initial synchronization of the connection under way.
    ESTABLISHED ---- Connection has been established.
    FIN_WAIT_1 ---- Socket closed; shutting down connection.
    FIN_WAIT_2 ---- Socket closed; waiting for shutdown from remote.
    CLOSE_WAIT ---- Remote shut down; waiting for the socket to close.
    CLOSING ---- Closed, then remote shutdown; awaiting acknowledgement.
    CLOSED ---- Closed. The socket is not being used.
    LAST_ACK ---- Remote shut down, then closed; awaiting acknowledgement.
    TIME_WAIT ---- Wait after close for remote shutdown retransmission.
    $ netstat -a
    Local Address Remote Address Swind Send-Q Rwind Recv-Q State
    *.* *.* 0 0 24576 0 IDLE
    *.22 *.* 0 0 24576 0 LISTEN
    *.22 *.* 0 0 24576 0 LISTEN
    *.* *.* 0 0 24576 0 IDLE
    *.32771 *.* 0 0 24576 0 LISTEN
    *.4045 *.* 0 0 24576 0 LISTEN
    *.25 *.* 0 0 24576 0 LISTEN
    *.5987 *.* 0 0 24576 0 LISTEN
    *.898 *.* 0 0 24576 0 LISTEN
    *.32772 *.* 0 0 24576 0 LISTEN
    *.32775 *.* 0 0 24576 0 LISTEN
    *.32776 *.* 0 0 24576 0 LISTEN
    *.* *.* 0 0 24576 0 IDLE
    192.168.1.184.22 192.168.1.186.50457 41992 0 24616 0 ESTABLISHED
    192.168.1.184.22 192.168.1.186.56806 38912 0 24616 0 ESTABLISHED
    192.168.1.184.22 192.168.1.183.58672 18048 0 24616 0 ESTABLISHED
    If you see a lot of connections in the FIN_WAIT state, the TCP/IP parameters have to be tuned, because the connections are not being closed and keep accumulating. After some time the system may run out of resources. TCP parameters can be tuned to define a timeout so that connections are released and can be reused by new connections.
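    To spot a build-up of FIN_WAIT connections, the states in the last column can be tallied. This is a minimal Python sketch, assuming the state is the last field on each line as in the output above and using the state names listed in this post. count_tcp_states and fin_wait_total are hypothetical helper names.

```python
from collections import Counter

# State names as listed in the text above.
TCP_STATES = {
    "LISTEN", "SYN_SENT", "SYN_RECEIVED", "ESTABLISHED", "FIN_WAIT_1",
    "FIN_WAIT_2", "CLOSE_WAIT", "CLOSING", "CLOSED", "LAST_ACK", "TIME_WAIT",
}

SAMPLE = """Local Address Remote Address Swind Send-Q Rwind Recv-Q State
*.* *.* 0 0 24576 0 IDLE
*.22 *.* 0 0 24576 0 LISTEN
*.25 *.* 0 0 24576 0 LISTEN
192.168.1.184.22 192.168.1.186.50457 41992 0 24616 0 ESTABLISHED"""


def count_tcp_states(netstat_output):
    """Count how many lines end in each known TCP state."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


def fin_wait_total(counts):
    """Total number of connections in FIN_WAIT_1 or FIN_WAIT_2."""
    return counts["FIN_WAIT_1"] + counts["FIN_WAIT_2"]


if __name__ == "__main__":
    counts = count_tcp_states(SAMPLE)
    print(dict(counts), fin_wait_total(counts))
```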