Thursday, November 12, 2009

Steps to Troubleshoot Typical Java Problems

This section lists some typical Java problems together with the actions we need to perform and the tools we can use to collect more information and/or to analyse and debug the problem.


Hung, Deadlocked, or Looping Process

  • Print thread stack for all Java threads.
    • Control-\
  • Print thread trace for a process.
    • kill -QUIT pid
    • jstack -F pid (with -F option if pid does not respond)
  • Detect deadlocks.
    • JConsole tool, Threads tab
    • Print information on deadlocked threads: Control-\
    • Print lock information for a process: jstack -l pid
  • Get a heap histogram for a process.
    • Start Java process with -XX:+PrintClassHistogram, then Control-\
    • Start Java process with -XX:+PrintClassHistogram, then kill -QUIT pid
    • jmap -histo pid (with -F option if pid does not respond)
  • Dump Java heap for a process in binary format to file.
    • jmap -dump:format=b,file=filename pid (with -F option if pid does not respond)
  • Print shared object mappings for a process.
    • jmap pid
  • Print heap summary for a process.
    • jmap -heap pid
  • Print finalization information for a process.
    • jmap -finalizerinfo pid
  • Attach the command-line debugger to a process.
    • jdb -connect sun.jvm.hotspot.jdi.SAPIDAttachingConnector:pid=pid
  • Attach the command-line debugger to a core file on the same machine.
    • jdb -connect sun.jvm.hotspot.jdi.SACoreAttachingConnector:javaExecutable=path,core=corefile
  • Attach the command-line debugger to a core file on a different machine.
    • On the machine with the core file: jsadebugd path corefile
      and on the machine with the debugger: jdb -connect sun.jvm.hotspot.jdi.SADebugServerAttachingConnector:debugServerName=machine
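
A minimal sketch of putting a few of these together, assuming the JDK bin directory is on the PATH and $pid holds the id of the suspect process:

  # take three thread dumps roughly ten seconds apart; comparing them helps
  # spot threads stuck on the same lock or looping in the same method
  for i in 1 2 3; do
    jstack -l $pid >> /tmp/threads_$pid.txt
    sleep 10
  done

  # heap histogram, to see which classes dominate the heap
  jmap -histo $pid > /tmp/histo_$pid.txt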

Back to top

Post-mortem Diagnostics

  • Examine the fatal error log file. Default file name is hs_err_pid<pid>.log in the working directory.
  • Create a heap dump.
    • HPROF: java -agentlib:hprof=file=file,format=b application; then Control-\
    • HPROF: java -agentlib:hprof=heap=dump application
    • JConsole tool, MBeans tab
    • Start VM with -XX:+HeapDumpOnOutOfMemoryError; if OutOfMemoryError is thrown, VM generates a heap dump.
  • Browse Java heap dump.
    • jhat heap-dump-file
  • Dump Java heap from core file in binary format to a file.
    • jmap -dump:format=b,file=filename corefile
  • Get a heap histogram from a core file.
    • jmap -histo corefile
  • Print shared object mappings from a core file.
    • jmap corefile
  • Print heap summary from a core file.
    • jmap -heap corefile
  • Print finalization information from a core file.
    • jmap -finalizerinfo corefile
  • Print Java configuration information from a core file.
    • jinfo corefile
  • Print thread trace from a core file.
    • jstack corefile
  • Print lock information from a core file.
    • jstack -l corefile
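
As a small example of the OutOfMemoryError route (a sketch; MyApp and the dump path are placeholders):

  java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps MyApp
  # after an OutOfMemoryError a file such as /tmp/dumps/java_pid<pid>.hprof is written
  jhat /tmp/dumps/java_pid12345.hprof
  # jhat then serves the dump over HTTP (port 7000 by default) for browsing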

Back to top

Monitoring

  • Print statistics on the class loader.
    • jstat -class vmID
  • Print statistics on the compiler.
    • Compiler behavior: jstat -compiler vmID
    • Compilation method statistics: jstat -printcompilation vmID
  • Print statistics on garbage collection.
    • Summary of statistics: jstat -gcutil vmID
    • Summary of statistics, with causes: jstat -gccause vmID
    • Behavior of the gc heap: jstat -gc vmID
    • Capacities of all the generations: jstat -gccapacity vmID
    • Behavior of the new generation: jstat -gcnew vmID
    • Capacity of the new generation: jstat -gcnewcapacity vmID
    • Behavior of the old and permanent generations: jstat -gcold vmID
    • Capacity of the old generation: jstat -gcoldcapacity vmID
    • Capacity of the permanent generation: jstat -gcpermcapacity vmID
  • Monitor objects awaiting finalization
    • JConsole tool, Summary tab
    • jmap -finalizerinfo pid
    • getObjectPendingFinalizationCount method in java.lang.management.MemoryMXBean class
  • Monitor memory
    • Heap allocation profiles via HPROF: java -agentlib:hprof=heap=sites
    • JConsole tool, Memory tab
    • Control-\ prints generation information.
  • Monitor CPU usage.
    • By thread stack: java -agentlib:hprof=cpu=samples application
    • By method: java -agentlib:hprof=cpu=times application
    • JConsole tool, Overview and Summary tabs
  • Monitor thread activity: JConsole tool, Threads tab
  • Monitor class activity: JConsole tool, Classes tab
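
For example, garbage collection behaviour of a running JVM can be sampled with jstat (a sketch, where $pid is the target process id):

  # GC utilization every 5 seconds, 12 samples (about a minute of data)
  jstat -gcutil $pid 5000 12
  # the same summary, plus the cause of the last and current GC
  jstat -gccause $pid 5000 12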

Back to top

Actions on a Remote Debug Server

First, attach the debug daemon jsadebugd, then execute the command.

  • Dump Java heap in binary format to a file: jmap -dump:format=b,file=filename hostID
  • Print shared object mappings: jmap hostID
  • Print heap summary : jmap -heap hostID
  • Print finalization information : jmap -finalizerinfo hostID
  • Print lock information : jstack -l hostID
  • Print thread trace : jstack hostID
  • Print Java configuration information: jinfo hostID
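
A rough sketch of the flow (host names are placeholders; see the jsadebugd documentation for the exact hostID syntax on your JDK):

  # on the remote machine: attach the Serviceability Agent debug daemon to the target JVM
  jsadebugd $pid

  # on the local machine: point the tools at the remote host instead of a pid
  jstack remotehost
  jmap -heap remotehost
  jinfo remotehost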

Back to top

Other Functions

  • Interface with the instrumented Java virtual machines.
    • Monitor for the creation and termination of instrumented VMs: jstatd daemon
    • List the instrumented VMs: jps
    • Provide interface between remote monitoring tools and local VMs: jstatd daemon
  • Print Java configuration information from a running process.
    • jinfo pid
  • Dynamically set, unset, or change the value of certain Java VM flags for a process.
    • jinfo -flag [+|-]flag pid or jinfo -flag flag=value pid (see the example after this list)
  • Pass a Java VM flag to the virtual machine.
    • jhat -Jflag ...
    • jmap -Jflag ...
  • Print statistics of permanent generation of Java heap, by class loader.
    • jmap -permstat pid
  • Report on monitor contention.
    • java -agentlib:hprof=monitor=y application
  • Evaluate or execute a script in interactive or batch mode.
    • jrunscript
  • Interface dynamically with an MBean, via JConsole tool, MBean tab:
    • Show tree structure.
    • Set an attribute value.
    • Invoke an operation.
    • Subscribe to notification.
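
For instance, a manageable boolean VM flag can be inspected and toggled at run time with jinfo (a sketch; only flags marked manageable can be changed, and $pid is the target process id):

  jinfo -flag PrintGCDetails $pid     # print the current value
  jinfo -flag +PrintGCDetails $pid    # turn the flag on
  jinfo -flag -PrintGCDetails $pid    # turn it off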

Find Swap usage on Solaris

In Solaris, swap and physical memory are interchangeable: physical memory can be reserved for swap instead of using the swap device.

1. swap -lh
2. swap -sh
3. prtconf | grep size
4. ps -A -o vsz,rss | awk 'BEGIN {size = 0; rss = 0;} {size += $1; rss += $2} END {printf("Size = %d kb, RSS = %d kb\n", size, rss);}'

The first shows how much of the physical swap device has been used. The
second looks at how much swap is used and reserved, versus available.
The third is just a sanity check for the physical memory. The fourth
looks at all of the processes in the system and allows us to see the
total difference between the vsize and the rss.


Now to figure out per process usage of swap space, we can do the following


1) bash-3.00$ top -b -o size
load averages: 2.99, 2.94, 3.02; up 199+21:07:21 12:06:32
103 processes: 99 sleeping, 4 on cpu
CPU states: 88.7% idle, 9.0% user, 2.3% kernel, 0.0% iowait, 0.0% swap
Memory: 64G phys mem, 4834M free mem, 16G total swap, 16G free swap

PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
12510 uxbea 57 33 4 2313M 963M sleep 199:53 0.08% java
10790 uxbea 208 59 4 2309M 1258M sleep 33.8H 0.01% java
17942 uxbea 153 22 0 2273M 1098M sleep 27:32 0.04% java
26082 uxbea 130 50 0 1981M 729M sleep 426:29 0.07% java
23365 uxbea 130 59 0 1976M 723M cpu/4 332:06 0.13% java
24662 uxbea 129 56 0 1963M 751M sleep 186:11 0.03% java
9353 uxbea 150 24 4 1778M 1365M sleep 182:53 1.81% java
18747 uxbea 130 1 0 1486M 610M sleep 84.2H 0.03% java
19334 uxbea 130 1 0 1441M 1063M cpu/20 33:17 3.18% java
25622 uxbea 125 53 0 1431M 583M sleep 183:50 0.04% java
7565 uxbea 97 54 0 1274M 673M sleep 38:27 0.02% java
24810 uxbea 285 53 0 1226M 1091M sleep 27.6H 0.09% java
17895 uxbea 230 1 4 1098M 820M sleep 25.0H 0.06% java
10901 uxbea 164 55 4 1066M 693M sleep 21:10 0.03% java
12056 uxbea 227 43 4 1061M 606M sleep 24.4H 0.08% java


It looks like PID 12510 is using the most virtual memory. To figure out how much of it is swap, use pmap:

2) bash-3.00$ pmap -S 12510 | tail -1
total Kb 2368032 2022672

PID 12510 is using 2.02G of swap.
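
To rank every process by its swap reservation in one pass, a loop like the following can be used (a rough sketch; the last column of pmap -S's "total" line is the swap figure in KB):

  for pid in $(ps -e -o pid=); do
    swapkb=$(pmap -S $pid 2>/dev/null | tail -1 | awk '{print $NF}')
    [ -n "$swapkb" ] && echo "$pid $swapkb"
  done | sort -n -k2 | tail -10    # top 10 swap consumers, in KB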






Friday, November 6, 2009

Built-in Variables for Shell Scripts

When writing shell scripts, for example in bash, there are a few built-in variables we can use to simplify our scripts. Below are a few useful ones:

  • $$ = The PID number of the process executing the shell.
  • $? = Exit status variable.
  • $0 = The name of the command you used to call a program.
  • $1 = The first argument on the command line.
  • $2 = The second argument on the command line.
  • $n = The nth argument on the command line. n = 1-9
  • $* = All the arguments on the command line.
  • $# = The number of command line arguments.
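
A small illustration (a sketch, using a hypothetical script name args.sh):

  #!/bin/sh
  # args.sh - print a few of the built-in variables
  echo "Script name      : $0"
  echo "PID of the shell : $$"
  echo "Argument count   : $#"
  echo "First argument   : $1"
  echo "All arguments    : $*"
  grep root /etc/passwd > /dev/null
  echo "Exit status of the last command: $?"

Running ./args.sh one two prints the script name, the shell's PID, an argument count of 2, the first argument "one", and finally the exit status of the grep (0 if /etc/passwd contains "root").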

Tuesday, November 3, 2009

Solaris System Level Debugging using truss

What is truss?

From the Solaris man pages, "the truss utility executes the specified command and produces a trace of the system calls it performs, the signals it receives, and the machine faults it incurs. Each line of the trace output reports either the fault or signal name or the system call name with its arguments and return value(s)".

We will use truss with the following switches

  • -o produces an output file
  • -f follows all children created by fork() and vfork() and includes their signals, faults and system calls in the trace output. Normally, only the first-level command or process is traced. When -f is specified, the process id is included with each line of the trace output to indicate which process executed the system call or received the signal
  • -p indicates the pid which is being traced

To trace a PID, issue the following command:
bash# truss -vall -fall -p $pid -o /tmp/mailcheck.out

To trace a command, issue the following command:
bash# truss -vall -fall -o /tmp/commandtrace.out telnet


In the truss output the code ENOENT is shown very frequently; its meaning can be found in the system include file errno.h

bash#grep ENOENT /usr/include/sys/*
/usr/include/sys/errno.h:#define ENOENT 2 /* No such file or directory */

We therefore know that in the truss output ENOENT means "no such file or directory".
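
For example, to see which files a traced process tried and failed to open, the trace output can simply be grepped for that error code (a sketch against the output file produced above):

  grep ENOENT /tmp/mailcheck.out | head -20
  # each matching line shows the failing system call and its arguments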

Sunday, November 1, 2009

Steps for Creating a Vertical WebSphere Portal Cluster

IBM recommends using vertical clustering, i.e. running more than one portal JVM on a single machine, to make full use of the machine's resources. Setting up a vertical cluster also turns out to be much easier than setting up a horizontal cluster.

I set up a vertical cluster on my VMware machine by following these steps:

Start the Deployment Manager if it is not already started.
Log in to the WAS Admin Console of the Deployment Manager by going to the https://localhost:9043/ibm/console URL.
In the WAS Admin Console go to Server -> Clusters and select the cluster to which you want to add the new member.


Click on Cluster members; it will list the members of that cluster. Click on New. It will open the Create additional cluster member page.


On this page enter the name of the new cluster member and select the node where this vertical member should be added. Don't forget to check the Generate Unique HTTP Ports check box. Click through the next couple of pages; when you click Finish it will create a set of configuration files for the new server.
Important note: You must update the virtual host entries for the new port created when adding a cluster member. You can do this by updating the default_host virtual host in the administrative console and adding a new alias entry for the port number (use an asterisk [*] wildcard character for the host name).

Once the new vertical cluster member is created, you can verify it by going to the Cluster topology view.


The next step is to configure the dynamic cache so that cache entries created on this newly created server get copied to the other members of the cluster. You can do that by going to Application Servers -> newly created server -> Dynamic cache service.
On this page check the Enable cache replication check box, select the name of your cluster as the Full group replication domain, and set the Replication type to pull only.


Next go to the wp_profile/ConfigEngine directory on the target machine and execute the ./ConfigEngine.sh cluster-node-config-vertical-cluster-setup -DServerName=WebSphere_Portal_V1 task to clean up the server-scoped resources, caches, and resource providers.
Restart the newly added vertical cluster member.
In the Admin Console change the values of the WCM_HOST and WCM_PORT WebSphere variables scoped at the server level to point to the web server that will be used to serve WCM content.
Resynchronize the changes from the Deployment Manager to the cluster members.
Regenerate the web server plugin to include the newly added vertical cluster member, copy the newly generated plugin-cfg.xml file to the web server, and restart the web server so that it starts forwarding requests to the newly added vertical cluster member.

No Downtime Production redeployment in WebLogic

Upgrading a running application in a J2EE production environment isn't easy. You either have to undeploy the old version of the application and deploy the new one—causing a temporary outage—or you may have to set up a redundant server/cluster to route the new requests.

BEA WebLogic Server 9.0 supports a production redeployment feature that provides a way to seamlessly upgrade an application in a production environment without affecting application availability. After redeploying a new version of the application, all new client connection requests go to the new application. The existing client connections continue to use the old application that will be undeployed/retired after all the existing connections are closed. The two application versions are completely isolated from each other and do not share any resources. Alternatively, the old version of the application can be retired by specifying a retire timeout for the application.

This article uses a sample application to demonstrate this functionality.

Requirements

Currently, WebLogic Server 9.0 supports this production redeployment feature only for Web application (WAR) modules and enterprise applications (EARs). All other types of archives (EJB JAR, JCA RAR, WebServices archives, JMS, or JDBC standalone modules) are not supported. EARs can contain all supported module types, except WebServices archives. Production redeployment only supports HTTP clients; Java clients are not supported. Attempting to perform production redeployment with an unsupported archive type will result in an error. To redeploy such modules, remove their version identifiers and explicitly redeploy the modules.

In addition, only versioned applications can be redeployed using this feature. A versioned application is an application that has an application archive version specified in the manifest of the application archive.

A deployed application must specify a version number before you can perform subsequent production redeployment operations on the application. In other words, you cannot deploy a non-versioned application and later perform production redeployment with a new version of the application.

WebLogic Server 9.0 can host a maximum of two different versions of an application at any one time. Also, when you redeploy a new version of an application, you cannot change the application's deployment targets, security model, or persistent store settings. To change any of the above features, you must first undeploy the active version of the application.

Application Version Information

The application version information can be specified in the MANIFEST.MF file's WebLogic-Application-Version property. The manifest is a special file that can contain information about the files packaged in a JAR file. By tailoring this "meta" information that the manifest contains, you enable the JAR file to serve a variety of purposes.

For example, an application archive whose application archive version is "v1" could have the following manifest content:

Manifest-Version: 1.0
Created-By: 1.4.1_05-b01 (Sun Microsystems Inc.)
WebLogic-Application-Version: v1

The application archive version is a string that can only contain the following characters: alphanumeric ("A"-"Z," "a"-"z," "0"-"9"), period ("."), underscore ("_"), and hyphen ("-"). The length of the application archive version should be less than 215 characters. Additionally, the application archive version string cannot be "." or "..". You can either specify a version for an application using the manifest file, or assign one when using the Deployer tool's -appversion option. The value specified in MANIFEST.MF file will take precedence over the -appversion value.

The version number is important when considering the redeployment of versionable applications. If a new application archive version is specified, WebLogic Server will perform production redeployment with version isolation. If the same application archive version is specified, WebLogic Server will perform in-place redeployment.

The Sample Versioned Application

The attached sample application (VersionedApp1) contains a Web application, which has three JSP files:

  • The versionedjsp.jsp file contains a simple print statement.
  • The invalidatesession.jsp contains a session.invalidate() command to invalidate all the sessions.
  • The timeoutsession.jsp file contains code to set the timeout value for the session.

The other application, VersionedApp2, contains the same set of files, but the versionedjsp.jsp file will print a different message.

We'll use these applications to demonstrate the versioning process.

Deploying the Application

The sample applications provided along with this article do not have version information in their manifest files. If you want to use production redeployment with an application that does not include a version string in the manifest file, the Deployer tool allows you to manually specify a unique version string using the -appversion option when deploying or redeploying an application. Run this command to deploy the application with a version of version1:

java weblogic.Deployer -adminurl http://localhost:8802
     -username weblogic -password weblogic -name VersionedApp
     -targets adminServer
     -deploy -source C:/tmp/VersionedApp1 -appversion version1

Deployer is a Java-based deployment tool that provides a command-line interface to the WebLogic Server deployment API. Deployer is intended for administrators and developers who want to perform interactive, command line-based deployment operations.

Note that the version string specified with -appversion is applied only when the deployment source files do not specify a version string in MANIFEST.MF. For applications with version information in the manifest files, you need not provide the -appversion option.

You can also display version information for deployed applications from the command line using the Deployer -listapps command. So for example, after deploying the above application you can run this command to list the application:

java weblogic.Deployer -adminurl http://localhost:8802
     -user weblogic -password weblogic -listapps

Redeploying a New Version of the Application

Now that we've deployed the application, let's look at redeploying it. Since our deployment files do not contain version information in the manifest files, we perform redeploy with the -appversion option as mentioned above:

java weblogic.Deployer -adminurl http://localhost:8802
     -username weblogic -password weblogic -name VersionedApp
     -targets adminServer -redeploy
     -source C:/tmp/VersionedApp2 -appversion version2

If you want to specify a fixed time period after which the older version of the application is undeployed (regardless of whether clients finish their work), use the -retiretimeout option with the -redeploy command. (-retiretimeout specifies the number of seconds after which the older version of the application is retired):

java weblogic.Deployer -adminurl http://localhost:8802
     -username weblogic -password weblogic -name VersionedApp
     -targets adminServer -redeploy
     -source C:/tmp/VersionedApp2 -appversion version2 -retiretimeout 300

If WebLogic Server has not yet retired an application version, you can immediately undeploy the application version without waiting for retirement to complete. This may be necessary if, for example, an application remains in the retiring state with only one or two long-running client sessions that you do not want to preserve. To force the undeployment of a retiring version of an application, use the -undeploy command and specify the application version:

java weblogic.Deployer -adminurl http://localhost:8802
     -username weblogic -password weblogic -name VersionedApp
     -targets adminServer -undeploy -name VersionedApp
     -appversion version1

If you do not explicitly specify an application version with the -appversion option, WebLogic Server undeploys the active version and all retired versions of the application.

Verifying the Deployment

After deploying the first version of the application, open a browser and invoke the versionedjsp.jsp:

http://localhost:8802/VersionedApp/versionedjsp.jsp

That should establish an HTTP session to the VersionedApp1 application. In the browser window we should see the "Output from VersionedApp1 JSP" message. After deploying the VersionedApp2 application, open another browser window and invoke versionedjsp.jsp. Now we should see the "Output from VersionedApp2 JSP" message from the VersionedApp2 application. At this time, both versions of the application should be active.

Now invoke the invalidatesession.jsp from the first browser window:

http://localhost:8802/VersionedApp/invalidatesession.jsp

This will invalidate all the established sessions to the VersionedApp1 application. Take a look at the server console window. The retirement process should have started now. Wait a few moments for the retirement process to complete and invoke versionedjsp.jsp from the first browser window:

http://localhost:8802/VersionedApp/versionedjsp.jsp

This time you should see the "Output from VersionedApp2 JSP" message from the VersionedApp2 application.

Rolling Back the Production Redeployment Process

Reversing the production redeployment process switches the state of the active and retiring applications and redirects new client connection requests accordingly. Reverting the production redeployment process may be necessary if you detect a problem with a newly deployed version of an application, and you want to stop clients from accessing it.

To roll back the production redeployment process, issue a second -redeploy command and specify the deployment source files for the older version:

java weblogic.Deployer -adminurl http://localhost:8802
     -user weblogic -password weblogic -redeploy
     -name VersionedApp C:/tmp/VersionedApp1
     -retiretimeout 300

Conclusion

Production redeployment is a very powerful functionality. With this functionality, customers get the ability to roll out application upgrades in a production environment transparently, without disruption to clients. Production redeployment not only requires fewer hardware resources but also provides more flexibility and control of application availability. Administrators should definitely consider using this functionality in production environments, which will not only make their tasks easier, but also provide minimal disruption to the end user.

Wednesday, October 21, 2009

Enable Tracing for WebSphere Plugin Related Issues

The web server plugin writes a log, by default named http_plugin.log and placed under PLUGIN_HOME/logs/.
The plugin writes Error messages into this log. The element that controls this is the Log element in plugin-cfg.xml.
For example:
<Log LogLevel="Error" Name="/usr/IBM/WebSphere/Plugins/logs/http_plugin.log"/>

According to the above line, all Error messages will be written into http_plugin.log.

How do you enable tracing in plugin-cfg.xml? Simply change the LogLevel attribute to Trace:

<Log LogLevel="Trace" Name="/usr/IBM/WebSphere/Plugins/logs/http_plugin.log"/>

From the InfoCenter -
Plug-in Problem Determination Steps
The plug-in provides very readable tracing which can be beneficial in helping to figure out the problem. By setting the LogLevel attribute in the config/plugin-cfg.xml file to Trace, you can follow the request processing to see what is going wrong.
Note: If you are using a Veritas File System with large file support enabled, file sizes up to two terabytes are allowed. In this case, if you set the LogLevel attribute in the plugin-cfg.xml file to LogLevel=Trace, then the http_plugin.log file might grow quickly and consume all available space on your file system. Therefore, you should set the value of the LogLevel attribute to ERROR or DEBUG to prevent high CPU utilization.
At a high level, the plug-in completes these steps:
1. The plug-in gets a request.
2. The plug-in checks the routes defined in the plugin-cfg.xml file.
3. It finds the server group.
4. It finds the server.
5. It picks the transport protocol, HTTP or HTTPS.
6. It sends the request.
7. It reads the response.
8. It writes the response back to the client.

Wednesday, October 14, 2009

Weblogic 8.1 debug flags

Enable these WebLogic system-level debug flags using the following JVM argument:

-Dweblogic.debug=debug-param1,debug-param2,...


Example: For JDBC ConnectionPool debugging use

-Dweblogic.debug=weblogic.JDBCConn,weblogic.JDBCConnStackTrace


IIOP

weblogic.iiop.ots
weblogic.iiop.transport
weblogic.iiop.marshal
weblogic.iiop.startup

JDBC/Data Sources
weblogic.JDBCConn
weblogic.JDBCSQL
weblogic.JDBCConnStackTrace

JMX/Deployment
weblogic.commoAdmin
weblogic.commoProxy
weblogic.deployerRuntime
weblogic.MasterDeployer
weblogic.deployTask
weblogic.deployHelper
weblogic.MasterDeployer
weblogic.OamDelta
weblogic.OamVersion
weblogic.slaveDeployer.semaphore
weblogic.slaveDeployer
weblogic.ConfigMBean
weblogic.ConfigMBeanEncrypt
weblogic.ConfigMBeanSetAttribute
weblogic.management.DynamicMBeanImpl
weblogic.management.DynamicMBeanImpl.setget
weblogic.mbeanProxyCache
weblogic.mbeanDelete
weblogic.mbeanQuery
weblogic.MBeanInteropList
weblogic.mbeanProxy
weblogic.registerMBean
weblogic.getMBeanInfo
weblogic.getMBeanAttributes
weblogic.addDependenciesRecursively
weblogic.MBeanListener
weblogic.application
weblogic.deployer
weblogic.appPoller
weblogic.appManager
weblogic.BootstrapServlet
weblogic.fileDistributionServlet

Application Deployment
weblogic.J2EEApplication
weblogic.application
weblogic.appPoller
weblogic.appManager

JTA
weblogic.JTAGateway
weblogic.JTAGatewayStackTrace
weblogic.JTA2PC
weblogic.JTA2PCStackTrace
weblogic.JTAHealth
weblogic.JTAPropagate
weblogic.JTARecovery
weblogic.JTAXA
weblogic.JTAXAStackTrace
weblogic.JTAResourceHealth
weblogic.JTAMigration
weblogic.JTARecoveryStackTrace
weblogic.JTANaming
weblogic.JTATLOG
weblogic.JTALifecycle

EJB
weblogic.ejb.cache.debug
weblogic.ejb.cache.verbose
ejb.enableCacheDump
weblogic.ejb20.cmp.rdbms.debug
weblogic.ejb20.cmp.rdbms.verbose
weblogic.ejb20.persistence.debug
weblogic.ejb20.persistence.verbose
weblogic.ejb20.compliance.debug
weblogic.ejb20.compliance.verbose
weblogic.ejb.deployment.debug
weblogic.ejb.deployment.verbose
weblogic.ejb20.dd.xml
weblogic.ejb.deployer.debug
weblogic.ejb.deployer.verbose
weblogic.ejb.verbose.deployment
weblogic.ejb20.ejbc.debug
weblogic.ejb20.ejbc.verbose
weblogic.ejb.runtime.debug
weblogic.ejb.runtime.verbose
weblogic.ejb20.jms.poll.debug
weblogic.ejb20.jms.poll.verbose
weblogic.ejb20.security.debug
weblogic.ejb20.security.verbose
weblogic.ejb.locks.debug
weblogic.ejb.locks.verbose
weblogic.ejb.bean.manager.debug
weblogic.ejb.bean.manager.verbose
weblogic.ejb.pool.InstancePool.debug
weblogic.ejb.pool.InstancePool.verbose
weblogic.ejb.swap.debug
weblogic.ejb.swap.verbose
weblogic.j2ee.dd.xml

General
weblogic.debug
weblogic.kernel.debug
weblogic.debug.DebugConnection
weblogic.debug.DebugRouting
weblogic.debug.DebugMessaging
weblogic.debug.isLogRemoteExceptionsEnabled
weblogic.StdoutDebugEnabled

WLI
wlc.debug.signature
wli.bpm.client.security.debug
wli.bpm.studio.timeprocessor.debug
wli.bpm.studio.debug
wli.bpm.server.common.timedevent.debug
wli.bpm.server.common.xmltemplate.debug
wli.bpm.server.eventprocessor.addrmsgdebug
wli.bpm.server.eventprocessor.debug
wli.bpm.server.jms.debug
wli.bpm.server.plugin.debug
wli.bpm.server.workflow.debug
wli.bpm.server.businesscalendar.debug
wli.bpm.server.busop.debug
wli.bpm.server.workflow.action.taskduedate.debug
wli.bpm.server.workflow.timedevent.debug
wli.bpm.server.xml.debug
wli.bpm.server.xslt.debug
wli.bpm.server.workflow.start.debug
wli.bpm.server.workflowprocessor.debug
wli.common.server.errorlistener.debug

Messaging Bridge
-Dweblogic.debug.DebugMessagingBridgeStartup=true
-Dweblogic.debug.DebugMessagingBridgeRuntime=true
And two others for stdout and stderr: -Dweblogic.Stdout= -Dweblogic.Stderr=

SSL
-Dweblogic.security.SSL.verbose=true -Dssl.debug=true

9.x
Debug flags are available via the console, plus for web services: -Dweblogic.wsee.verbose=*

6.x,7.x
http://wldj.sys-con.com/read/42733.htm

RMI debug flags
java.rmi.server.logCalls=true
sun.rmi.loader.logLevel=[BRIEF|VERBOSE]
sun.rmi.server.exceptionTrace
sun.rmi.server.logLevel=[BRIEF|VERBOSE]

Sunday, October 11, 2009

ZFS Cheat sheet



$ man zpool
$ man zfs
Get familiar with command structure and options
$ su
Password:
# cd /
# mkfile 100m disk1 disk2 disk3 disk5
# mkfile 50m disk4
# ls -l disk*
-rw------T 1 root root 104857600 Sep 11 12:15 disk1
-rw------T 1 root root 104857600 Sep 11 12:15 disk2
-rw------T 1 root root 104857600 Sep 11 12:15 disk3
-rw------T 1 root root 52428800 Sep 11 12:15 disk4
-rw------T 1 root root 104857600 Sep 11 12:15 disk5
Create some “virtual devices” or vdevs as described in the zpool documentation. These can also be real disk slices if you have them available.
# zpool create myzfs /disk1 /disk2
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 191M 94K 191M 0% ONLINE -
Create a storage pool and check the size and usage.
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Get more detailed status of the zfs storage pool.
# zpool destroy myzfs
# zpool list
no pools available
Destroy a zfs storage pool
# zpool create myzfs mirror /disk1 /disk4
invalid vdev specification
use '-f' to override the following errors:
mirror contains devices of different sizes
Attempting to create a zfs pool with different-size vdevs fails. Using the -f option forces it, but only the space allowed by the smallest device is used.
# zpool create myzfs mirror /disk1 /disk2 /disk3
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 95.5M 112K 95.4M 0% ONLINE -
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
/disk3 ONLINE 0 0 0

errors: No known data errors
Create a mirrored storage pool. In this case, a 3 way mirrored storage pool.
# zpool detach myzfs /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Detach a device from a mirrored pool.
# zpool attach myzfs /disk1 /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:31:49 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
/disk3 ONLINE 0 0 0

errors: No known data errors
Attach a device to a pool. This creates a two-way mirror if the pool is not already a mirror; otherwise it adds another device to the existing mirror, in this case making it a 3-way mirror.
# zpool remove myzfs /disk3
cannot remove /disk3: only inactive hot spares can be removed
# zpool detach myzfs /disk3
Attempt to remove a device from a pool. In this case it’s a mirror, so we must use “zpool detach”.
# zpool add myzfs spare /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
spares
/disk3 AVAIL

errors: No known data errors
Add a hot spare to a storage pool.
# zpool remove myzfs /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Remove a hot spare from a pool.
# zpool offline myzfs /disk1
# zpool status -v
pool: myzfs
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning
in a degraded state.
action: Online the device using 'zpool online' or replace the device
with 'zpool replace'.
scrub: resilver completed with 0 errors on Tue Sep 11 13:39:25 2007
config:

NAME STATE READ WRITE CKSUM
myzfs DEGRADED 0 0 0
mirror DEGRADED 0 0 0
/disk1 OFFLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Take the specified device offline. No attempt to read or write to the device will take place until it’s brought back online. Use the -t option to temporarily offline a device. A reboot will bring the device back online.
# zpool online myzfs /disk1
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:47:14 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Bring the specified device online.
# zpool replace myzfs /disk1 /disk3
# zpool status -v
pool: myzfs
state: ONLINE
scrub: resilver completed with 0 errors on Tue Sep 11 13:25:48 2007
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk3 ONLINE 0 0 0
/disk2 ONLINE 0 0 0

errors: No known data errors
Replace a disk in a pool with another disk, for example when a disk fails
# zpool scrub myzfs
Perform a scrub of the storage pool to verify that it checksums correctly. On mirror or raidz pools, ZFS will automatically repair any damage.
WARNING: scrubbing is I/O intensive.
# zpool export myzfs
# zpool list
no pools available
Export a pool from the system for importing on another system.
# zpool import -d / myzfs
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
myzfs 95.5M 114K 95.4M 0% ONLINE -
Import a previously exported storage pool. If -d is not specified, this command searches /dev/dsk. As we’re using files in this example, we need to specify the directory of the files used by the storage pool.
# zpool upgrade
This system is currently running ZFS pool version 8.

All pools are formatted using this version.
# zpool upgrade -v
This system is currently running ZFS pool version 8.

The following versions are supported:

VER DESCRIPTION
--- --------------------------------------------------------
1 Initial ZFS version
2 Ditto blocks (replicated metadata)
3 Hot spares and double parity RAID-Z
4 zpool history
5 Compression using the gzip algorithm
6 pool properties
7 Separate intent log devices
8 Delegated administration
For more information on a particular version, including supported
releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.
Display pools format version. The -v flag shows the features supported by the current version. Use the -a flag to upgrade all pools to the latest on-disk version. Pools that are upgraded will no longer be accessible to any systems running older versions.
# zpool iostat 5
capacity operations bandwidth
pool used avail read write read write
---------- ----- ----- ----- ----- ----- -----
myzfs 112K 95.4M 0 4 26 11.4K
myzfs 112K 95.4M 0 0 0 0
myzfs 112K 95.4M 0 0 0 0
Get I/O statistics for the pool
# zfs create myzfs/colin
# df -h
Filesystem kbytes used avail capacity Mounted on
...
myzfs/colin 64M 18K 63M 1% /myzfs/colin
Create a file system and check it with the standard df -h command. File systems are automatically mounted by default under the pool's mountpoint (here /myzfs). See the Mountpoints section of the zfs man page for more details.
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 139K 63.4M 19K /myzfs
myzfs/colin 18K 63.4M 18K /myzfs/colin
List current zfs file systems.
# zpool add myzfs /disk1
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses mirror and new vdev is file
Attempt to add a single vdev to a mirrored set fails
# zpool add myzfs mirror /disk1 /disk5
# zpool status -v
pool: myzfs
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myzfs ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk3 ONLINE 0 0 0
/disk2 ONLINE 0 0 0
mirror ONLINE 0 0 0
/disk1 ONLINE 0 0 0
/disk5 ONLINE 0 0 0

errors: No known data errors
Add a mirrored set of vdevs
# zfs create myzfs/colin2
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 172K 159M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin2 18K 159M 18K /myzfs/colin2
Create a second file system. Note that both file systems show 159M available because no quotas are set. Each "could" grow to fill the pool.
# zfs set reservation=20m myzfs/colin
# zfs list -o reservation
RESERV
none
20M
none
Reserve a specified amount of space for a file system ensuring that other users don’t take up all the space.
# zfs set quota=20m myzfs/colin2
# zfs list -o quota myzfs/colin myzfs/colin2
QUOTA
none
20M
Set and view quotas
# zfs set compression=on myzfs/colin2
# zfs list -o compression
COMPRESS
off
off
on
Turn on and verify compression
# zfs set sharenfs=on myzfs/colin2
# zfs get sharenfs myzfs/colin2
NAME PROPERTY VALUE SOURCE
myzfs/colin2 sharenfs on local
Share a filesystem over NFS. There is no need to modify /etc/dfs/dfstab as the filesystem will be shared automatically on boot.
# zfs set sharesmb=on myzfs/colin2
# zfs get sharesmb myzfs/colin2
NAME PROPERTY VALUE SOURCE
myzfs/colin2 sharesmb on local
Share a filesystem over CIFS/SMB. This will make your ZFS filesystem accessible to Windows users.
# zfs snapshot myzfs/colin@test
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.2M 139M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin2 18K 20.0M 18K /myzfs/colin2
Create a snapshot called test.
# zfs rollback myzfs/colin@test
Rollback to a snapshot.
# zfs clone myzfs/colin@test myzfs/colin3
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.2M 139M 21K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin2 18K 20.0M 18K /myzfs/colin2
myzfs/colin3 0 139M 18K /myzfs/colin3
A snapshot is read-only and cannot be used directly; a clone must be made to get a writable copy. The target dataset can be located anywhere in the ZFS hierarchy, and will be created as the same type as the original.
# zfs destroy myzfs/colin2
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.1M 139M 22K /myzfs
myzfs/colin 18K 159M 18K /myzfs/colin
myzfs/colin@test 0 - 18K -
myzfs/colin3 0 139M 18K /myzfs/colin3
Destroy a filesystem
# zfs destroy myzfs/colin
cannot destroy 'myzfs/colin': filesystem has children
use '-r' to destroy the following datasets:
myzfs/colin@test
Attempt to destroy a filesystem that has a child, in this case the snapshot. We must either remove the snapshot, or make a clone and promote the clone.
# zfs promote myzfs/colin3
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 20.1M 139M 21K /myzfs
myzfs/colin 0 159M 18K /myzfs/colin
myzfs/colin3 18K 139M 18K /myzfs/colin3
myzfs/colin3@test 0 - 18K -
# zfs destroy myzfs/colin
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 147K 159M 21K /myzfs
myzfs/colin3 18K 159M 18K /myzfs/colin3
myzfs/colin3@test 0 - 18K -
Promote a clone filesystem so that it is no longer dependent on its "origin" snapshot. This makes the snapshot a child of the cloned filesystem, and we can then delete the original filesystem.
# zfs rename myzfs/colin3 myzfs/bob
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 153K 159M 21K /myzfs
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@test 0 - 18K -
# zfs rename myzfs/bob@test myzfs/bob@newtest
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 146K 159M 20K /myzfs
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@newtest 0 - 18K -
Rename a filesystem, and separately rename the snapshot.
# zfs get all
NAME PROPERTY VALUE SOURCE
myzfs type filesystem -
myzfs creation Tue Sep 11 14:21 2007 -
myzfs used 146K -
myzfs available 159M -
myzfs referenced 20K -
[...]
Display properties for the given datasets. This can be refined further using options.
# zpool destroy myzfs
cannot destroy 'myzfs': pool is not empty
use '-f' to force destruction anyway
Can’t destroy a pool with active filesystems.
# zfs unmount myzfs/bob
# df -h
myzfs 159M 20K 159M 1% /myzfs
Unmount a ZFS file system
# zfs mount myzfs/bob
# df -h
myzfs 159M 20K 159M 1% /myzfs
myzfs/bob 159M 18K 159M 1% /myzfs/bob
Mount a ZFS filesystem. This is usually automatically done on boot.
# zfs send myzfs/bob@newtest | ssh localhost zfs receive myzfs/backup
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myzfs 172K 159M 20K /myzfs
myzfs/backup 18K 159M 18K /myzfs/backup
myzfs/backup@newtest 0 - 18K -
myzfs/bob 18K 159M 18K /myzfs/bob
myzfs/bob@newtest 0 - 18K -
Create a stream representation of the snapshot and redirect it to zfs receive. In this example I’ve redirected to the localhost for illustration purposes. This can be used to backup to a remote host, or even to a local file.
# zpool history
History for 'myzfs':
2007-09-11.15:35:50 zpool create myzfs mirror /disk1 /disk2 /disk3
2007-09-11.15:36:00 zpool detach myzfs /disk3
2007-09-11.15:36:10 zpool attach myzfs /disk1 /disk3
2007-09-11.15:36:53 zpool detach myzfs /disk3
2007-09-11.15:36:59 zpool add myzfs spare /disk3
2007-09-11.15:37:09 zpool remove myzfs /disk3
2007-09-11.15:37:18 zpool offline myzfs /disk1
2007-09-11.15:37:27 zpool online myzfs /disk1
2007-09-11.15:37:37 zpool replace myzfs /disk1 /disk3
2007-09-11.15:37:47 zpool scrub myzfs
2007-09-11.15:37:57 zpool export myzfs
2007-09-11.15:38:05 zpool import -d / myzfs
2007-09-11.15:38:52 zfs create myzfs/colin
2007-09-11.15:39:27 zpool add myzfs mirror /disk1 /disk5
2007-09-11.15:39:38 zfs create myzfs/colin2
2007-09-11.15:39:50 zfs set reservation=20m myzfs/colin
2007-09-11.15:40:18 zfs set quota=20m myzfs/colin2
2007-09-11.15:40:35 zfs set compression=on myzfs/colin2
2007-09-11.15:40:48 zfs snapshot myzfs/colin@test
2007-09-11.15:40:59 zfs rollback myzfs/colin@test
2007-09-11.15:41:11 zfs clone myzfs/colin@test myzfs/colin3
2007-09-11.15:41:25 zfs destroy myzfs/colin2
2007-09-11.15:42:12 zfs promote myzfs/colin3
2007-09-11.15:42:26 zfs rename myzfs/colin3 myzfs/bob
2007-09-11.15:42:57 zfs destroy myzfs/colin
2007-09-11.15:43:23 zfs rename myzfs/bob@test myzfs/bob@newtest
2007-09-11.15:44:30 zfs receive myzfs/backup
Display the command history of all storage pools. This can be limited to a single pool by specifying its name on the command line. The history is only stored for existing pools. Once you've destroyed the pool, you'll no longer have access to its history.
# zpool destroy -f myzfs
# zpool status -v
no pools available
Use the -f option to destroy a pool with file systems created.

Wednesday, October 7, 2009

Solaris performance monitoring commands


iostat
vmstat
netstat

iostat

syntax:

iostat [options] interval count

  • option – lets you specify the device for which information is needed, such as disk, cpu or terminal (-d, -c, -t or -tdc). The -x option gives extended statistics.
  • interval – is the time period in seconds between two samples. iostat 4 will give data at 4-second intervals.
  • count – is the number of times the data is needed. iostat 4 5 will give data at 4-second intervals 5 times.
  • example:

     $ iostat -xtc 5 2
    extended disk statistics tty cpu
    disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id
    sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
    sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
    sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
    sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31

    The fields have the following meanings:
    disk name of the disk
    r/s reads per second
    w/s writes per second
    Kr/s kilobytes read per second
    Kw/s kilobytes written per second
    wait average number of transactions waiting for service (Q length)
    actv average number of transactions actively being serviced (removed from the queue but not yet completed)
    %w percent of time there are transactions waiting for service (queue non-empty)
    %b percent of time the disk is busy (transactions in progress)

    The values to look at in the iostat output are:

  • Reads/writes per second (r/s, w/s)
  • Percentage busy (%b) (%b > 5 is bad)
  • Service time (svc_t) (svc_t > 30ms is bad)
  • If a disk shows consistently high reads/writes, the percentage busy (%b) of the disk is greater than 5 percent, and the average service time (svc_t) is greater than 30 milliseconds, then one of the following actions needs to be taken:

    1. Tune the application to use disk i/o more efficiently by modifying the disk queries and using available cache facilities of application servers.
    2. Spread the file system of the disk on to two or more disk using disk striping feature of volume manager /disksuite etc.
    3. Increase the system parameter values for inode cache, ufs_ninode, which is Number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file system basis.
    4. Move the file system to another faster disk /controller or replace existing disk/controller to a faster one.

    vmstat

    syntax:

    vmstat [options] interval count
  • option – lets you specify the type of information needed, such as paging (-p), cache (-c), interrupts (-i) etc. If no option is specified, information about processes, memory, paging, disk, interrupts and cpu is displayed.
  • interval – is the time period in seconds between two samples. vmstat 4 will give data at 4-second intervals.
  • count – is the number of times the data is needed. vmstat 4 5 will give data at 4-second intervals 5 times.
  • example:

    $vmstat 5
    procs memory page disk faults cpu
    r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
    0 0 0 11456 4120 1 41 19 1 3 0 2 0 4 0 0 48 112 130 4 14 82
    0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62
    0 0 1 10132 4616 0 0 20 0 0 0 0 0 19 0 0 150 172 146 3 33 64
    0 0 1 10132 5292 0 0 9 0 0 0 0 0 21 0 0 165 105 130 1 21 78

    procs
    r in run queue
    b blocked for resources I/O, paging etc.
    w swapped

    memory (in Kbytes)
    swap - amount of swap space currently available
    free - size of the free list

    page (in units per second).
    re page reclaims - see -S option for how this field is modified.
    mf minor faults - see -S option for how this field is modified.
    pi kilobytes paged in
    po kilobytes paged out
    fr kilobytes freed
    de anticipated short-term memory shortfall (Kbytes)
    sr pages scanned by clock algorithm

    disk (operations per second).
    There are slots for up to four disks, labeled with a single letter and number.
    The letter indicates the type of disk (s = SCSI, i = IPI, etc). The number is
    the logical unit number.

    faults
    in (non clock) device interrupts
    sy system calls
    cs CPU context switches

    cpu breakdown of percentage usage of CPU time. On multiprocessors this is an average across all processors.
    us user time
    sy system time
    id idle time

    CPU issues:

    Following columns has to be watched to determine if there is any cpu issue:
    1. Processes in the run queue (procs r)
    2. User time (cpu us)
    3. System time (cpu sy)
    4. Idle time (cpu id)
         procs      cpu
    r b w us sy id
    0 0 0 4 14 82
    0 0 1 3 35 62
    0 0 1 3 33 64
    0 0 1 1 21 78
    Problem symptoms:
    1. If the number of processes in the run queue (procs r) is consistently greater than the number of CPUs on the system, the system will slow down as there are more processes than available CPUs.
    2. If this number is more than four times the number of available CPUs, the system is facing a shortage of CPU power and processes on the system will slow down greatly.
    3. If the idle time (cpu id) is consistently 0 and the system time (cpu sy) is double the user time (cpu us), the system is facing a shortage of CPU resources.

    Resolution of these kinds of issues involves tuning the application to make efficient use of the CPU and, as a last resort, increasing CPU power or adding more CPUs to the system.
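
    As a quick first check, the run queue can be compared against the number of CPUs reported by psrinfo (a sketch):

      psrinfo | wc -l     # number of CPUs on the system
      vmstat 5 5          # compare the "procs r" column against that CPU count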

    Memory Issues:

    Memory bottlenecks are determined by the scan rate (sr). The scan rate is the number of pages scanned by the clock algorithm per second. If the scan rate is continuously over 200 pages per second, there is a memory shortage.
    Resolution:
    1. Tune the applications & servers to make efficient use of memory and cache.
    2. Increase system memory.
    3. Implement priority paging in pre-Solaris 8 versions by adding the line "set priority_paging=1" in /etc/system. Remove this line if upgrading from Solaris 7 to 8 and retaining the old /etc/system file.

    netstat

    syntax:

    netstat [option/s]
    Options
    -a - displays the state of all sockets.
    -r - shows the system routing tables
    -i - gives statistics on a per-interface basis.
    -m - displays information from the network memory buffers. On Solaris, this shows statistics for streams
    -p [proto] - retrieves statistics for the specified protocol
    -s - shows per-protocol statistics. (some implementations allow -ss to remove fields with a value of 0 (zero) from the display.)
    -D - display the status of DHCP configured interfaces.
    -n - do not lookup hostnames, display only IP addresses.
    -d - (with -i) displays dropped packets per interface.
    -I [interface] - retrieve information about only the specified interface.
    -v - be verbose
    interval - number for continuous display of statistics.

    example:

    $netstat -rn

    Routing Table: IPv4
    Destination Gateway Flags Ref Use Interface
    -------------------- -------------------- ----- ----- ------ ---------
    192.168.1.0 192.168.1.11 U 1 1444 le0
    224.0.0.0 192.168.1.11 U 1 0 le0
    default 192.168.1.1 UG 1 68276
    127.0.0.1 127.0.0.1 UH 1 10497 lo0
    This shows the output on a Solaris machine whose IP address is 192.168.1.11, with a default router at 192.168.1.1.
    Network availability
    The command above is mostly useful in troubleshooting network accessibility issues. When the outside network is not accessible from a machine, check the following:
    1. Whether the default router IP address is correct.
    2. Whether you can ping it from your machine.
    3. If the router address is incorrect, it can be changed with the route add command. See man route for more info.
      route command examples:
      $route add default [hostname]
      $route add 192.0.2.32 [gateway_name]
    If the router address is correct but you still can't ping it, there may be a network cable/hub/switch problem, and you have to try to eliminate the faulty component.
    Network Response
    $ netstat -i
    Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
    lo0 8232 loopback localhost 77814 0 77814 0 0 0
    hme0 1500 server1 server1 10658566 3 4832511 0 279257 0
    This option is used to diagnose network problems when connectivity exists but the response is slow.
    Values to look at:
  • Collisions (Collis)
  • Output packets (Opkts)
  • Input errors (Ierrs)
  • Input packets (Ipkts)
  • The above values give the information needed to work out the following:

    Network collision rate as follows:

    Network collision rate = Output collision counts / Output packets
    Network-wide collision rate greater than 10 percent will indicate
  • Overloaded network,
  • Poorly configured network,
  • Hardware problems.
  • Input packet error rate as follows:

    Input Packet Error Rate = Ierrs / Ipkts
    If the input error rate is high (over 0.25 percent), the host is dropping packets. Hubs, switches, cables etc. need to be checked for potential problems.
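
    Both ratios can be pulled straight out of netstat -i with a little awk (a rough sketch; the column positions match the sample output above):

      netstat -i | awk 'NR>1 && $5>0 && $7>0 {
          printf("%s: collision rate %.2f%%, input error rate %.2f%%\n",
                 $1, $9/$7*100, $6/$5*100) }'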

    Network socket & TCP connection state

    netstat gives important information about network socket and TCP state. This is very useful in finding out open, closed and waiting TCP connections.
    The network states returned by netstat are the following:
    LISTEN ---- Listening for incoming connections.
    SYN_SENT ---- Actively trying to establish connection.
    SYN_RECEIVED ---- Initial synchronization of the connection under way.
    ESTABLISHED ---- Connection has been established.
    FIN_WAIT_1 ---- Socket closed; shutting down connection.
    FIN_WAIT_2 ---- Socket closed; waiting for shutdown from remote.
    CLOSE_WAIT ---- Remote shut down; waiting for the socket to close.
    CLOSING ---- Closed, then remote shutdown; awaiting acknowledgement.
    CLOSED ---- Closed. The socket is not being used.
    LAST_ACK ---- Remote shut down, then closed; awaiting acknowledgement.
    TIME_WAIT ---- Wait after close for remote shutdown retransmission.
    $netstat -a
    Local Address Remote Address Swind Send-Q Rwind Recv-Q State
    *.* *.* 0 0 24576 0 IDLE
    *.22 *.* 0 0 24576 0 LISTEN
    *.22 *.* 0 0 24576 0 LISTEN
    *.* *.* 0 0 24576 0 IDLE
    *.32771 *.* 0 0 24576 0 LISTEN
    *.4045 *.* 0 0 24576 0 LISTEN
    *.25 *.* 0 0 24576 0 LISTEN
    *.5987 *.* 0 0 24576 0 LISTEN
    *.898 *.* 0 0 24576 0 LISTEN
    *.32772 *.* 0 0 24576 0 LISTEN
    *.32775 *.* 0 0 24576 0 LISTEN
    *.32776 *.* 0 0 24576 0 LISTEN
    *.* *.* 0 0 24576 0 IDLE
    192.168.1.184.22 192.168.1.186.50457 41992 0 24616 0 ESTABLISHED
    192.168.1.184.22 192.168.1.186.56806 38912 0 24616 0 ESTABLISHED
    192.168.1.184.22 192.168.1.183.58672 18048 0 24616 0 ESTABLISHED
    If you see a lot of connections in the FIN_WAIT state, the TCP/IP parameters have to be tuned, because the connections are not being closed and they keep accumulating. After some time the system may run out of resources. The TCP parameters can be tuned to define a timeout so that connections can be released and reused by new connections.