Stopping Resin: -stop, -shutdown or -kill?
Regardless of the operating system you’re running, Resin includes 3 distinct commands for stopping the application server. In this tip of the month I’ll explain each command in detail, discuss how they differ, and offer some advice on when it is appropriate to use each.
Before explaining the different stop commands, I need to start with a short review of how Resin is started and the 2 separate processes that make up a running Resin instance. When started from the command line, bin/resin.sh uses lib/resin.jar to run the Java class com.caucho.boot.ResinBoot. In most cases, this class simply parses command line arguments, starts the Watchdog, and then exits.
In any discussion involving starting or stopping Resin, it’s important to understand the Watchdog and its relationship to the running application server. The Watchdog is an entirely separate process from the Resin process, designed to ensure reliability and security. It serves to monitor, maintain, and communicate with one or more Resin processes. It monitors the health of the application server and restarts Resin automatically if it stops unexpectedly. You can distinguish the processes by examining the main class they are running. The Watchdog process is com.caucho.boot.WatchdogManager. The application server process is com.caucho.server.resin.Resin. (This is true even if you are using Resin Pro.) Refer to the Resin documentation for additional information on the Watchdog and what is involved in running multiple Resin servers on the same machine.
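One way to tell the two processes apart from a script is to look for those main class names in the ps output. The following is a minimal sketch, assuming a Unix-like system where ps -ef prints the PID in the second column. The function names resin_role and list_resin_processes are made up for this example and are not part of Resin.

```shell
# Classify one line of ps output by the Java main class it contains.
# The two class names come from the text above; the function name is hypothetical.
resin_role() {
  case "$1" in
    *com.caucho.boot.WatchdogManager*) echo "watchdog" ;;
    *com.caucho.server.resin.Resin*)   echo "resin" ;;
    *)                                 echo "other" ;;
  esac
}

# Print the role and PID of every Watchdog or Resin server process.
# Assumes the PID is the second column, as it is in `ps -ef` output.
list_resin_processes() {
  ps -ef | while read -r line; do
    role=$(resin_role "$line")
    if [ "$role" != "other" ]; then
      pid=$(printf '%s\n' "$line" | awk '{print $2}')
      echo "$role pid=$pid"
    fi
  done
}
```

Calling list_resin_processes on a machine with one running server should print one watchdog line and one resin line, each with its PID.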
The Watchdog starts the Resin process, knows its PID, and is connected to it over a persistent socket connection. In general, Resin will not run without the Watchdog, and the Watchdog will not run without a Resin server to monitor. However, a single Watchdog process can monitor multiple Resin servers. In most cases, the Watchdog reads the resin.xml, configures itself automatically, and runs silently. When more than one Resin instance is started from the same configuration on the same machine, a new Watchdog is generally NOT created. Resin commands are automatically directed to the Watchdog, and then to the appropriate Resin server. This is why most of the time you don’t need to pay attention to the Watchdog at all.
At this point you should have a good idea of why stopping Resin inherently involves the Watchdog. Consider the following running Resin server:
501 574 1 0 0:00.46 ttys000 0:04.01 /Library/Java/1.6.0/bin/java -Dresin.watchdog=a -Djava.util.logging.manager=com.caucho.log.LogManagerImpl -Djavax.management.builder.initial=com.caucho.jmx.MBeanServerBuilderImpl -Djava.awt.headless=true -Dresin.home=/opt/resin/ -Dresin.root=/opt/resin/ -Xrs -Xss256k -Xmx32m -d64 -server com.caucho.boot.WatchdogManager -server a start --log-directory /opt/resin/log
501 585 574 0 0:00.14 ttys000 0:02.07 /Library/Java/1.6.0/bin/java -Dresin.server=2 -Djava.util.logging.manager=com.caucho.log.LogManagerImpl -Djava.system.class.loader=com.caucho.loader.SystemClassLoader -Djava.endorsed.dirs=/Library/Java/1.6.0/lib/endorsed:/opt/resin//endorsed -Djavax.management.builder.initial=com.caucho.jmx.MBeanServerBuilderImpl -Djava.awt.headless=true -Dresin.home=/opt/resin/ -Xss1m -Xmx256m -Dresin.watchdog=a -Djava.util.logging.manager=com.caucho.log.LogManagerImpl -Djavax.management.builder.initial=com.caucho.jmx.MBeanServerBuilderImpl -Djava.awt.headless=true -Dresin.home=/opt/resin/ -Dresin.root=/opt/resin/ -d64 -server -Dresin.watchdog=a -Djava.util.logging.manager=com.caucho.log.LogManagerImpl -Djavax.management.builder.initial=com.caucho.jmx.MBeanServerBuilderImpl -Djava.awt.headless=true -Dresin.home=/opt/resin/ -Dresin.root=/opt/resin/ com.caucho.server.resin.Resin --root-directory /opt/resin/ -conf /opt/resin/conf/resin.xml -socketwait 52892 -server a start --log-directory /opt/resin/log
The Watchdog is com.caucho.boot.WatchdogManager, pid 574. The Resin server is com.caucho.server.resin.Resin, pid 585, and is a child of pid 574. What happens if you issue an operating system kill command, such as kill 585? The Watchdog interprets this as an abnormal termination of the Resin server, and immediately starts it back up again. You can see this behavior by examining the Watchdog’s log file, log/watchdog-manager.log. You should see something similar to this:
[2011/08/29 11:31:06.315] {watchdog-a} Watchdog detected close of Resin[a,pid=585]
exit reason: SIGTERM (signal=15)
[2011/08/29 11:31:06.317] {watchdog-a} WatchdogChild[a] starting
[2011/08/29 11:31:06.324] {watchdog-a} Watchdog starting Resin[a]
So now we see that killing the Resin process results in only a messy restart. Resin is resilient and will recover, but shutdown and cleanup tasks will not have a chance to run.
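If you suspect a server is being restarted this way, the Watchdog log can be searched for the message shown above. This is a small sketch that assumes the log wording matches the sample in this post; the function name count_unexpected_closes is hypothetical.

```shell
# Count how many times the Watchdog logged an unexpected close of a Resin server.
# The message text is taken from the sample log above and may differ between versions.
# Usage: count_unexpected_closes /opt/resin/log/watchdog-manager.log
count_unexpected_closes() {
  grep -c "Watchdog detected close of Resin" "$1"
}
```

A count that keeps growing suggests something outside of Resin is killing the server process, or that the server is exiting on its own.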
What happens if we kill the Watchdog with kill 574? Both the Watchdog and the Resin server exit. Resin immediately detects that the Watchdog died because the persistent socket connection is broken, and initiates a clean shutdown of its own, so shutdown and cleanup tasks will have a chance to run. The shutdown in this case is less messy than killing the Resin process directly, since the Watchdog itself is simpler and has few shutdown tasks.
-stop
-stop is the proper way to stop Resin under normal circumstances.
./bin/resin.sh -server a stop
After reading to this point, the sequence of calls this initiates should not come as a surprise:
- resin.sh executes com.caucho.boot.ResinBoot
- ResinBoot parses the command line and calls into the Watchdog
- The Watchdog finds the appropriate Resin server and tells it to stop
The calls are made using HMTP, which we have discussed a number of times previously in this blog.
Does the Watchdog also exit upon a stop command? In this case, if server “a” was the only running Resin instance, then the answer is yes. But it depends entirely on whether any other Resin server instances are running. The Watchdog will not bother to run after the last Resin instance is shut down.
You need to pass -server when there is more than one <server> defined in resin.xml, or the server id is anything other than empty, which we refer to as the “default” server. Following is a snippet of the default resin.xml that ships with Resin:
<!-- define the servers in the cluster -->
<server id="" address="127.0.0.1" port="6800">
</server>
Notice the id attribute is empty. So in this case the following command will work:
./bin/resin.sh stop
Named servers are required in a Resin Pro clustered configuration, so -server is more commonly needed with Resin Pro than with Resin Open Source.
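The rule for when to pass -server can be captured in a small wrapper. This is only a sketch: the function name resin_command is invented for this example, and the ./bin/resin.sh path is assumed to be relative to the Resin home directory, as in the commands above.

```shell
# Print the resin.sh command line for an action, adding -server only when
# a non-empty server id is given (the "default" server has an empty id).
# Usage: resin_command ACTION [SERVER_ID]
resin_command() {
  action=$1
  server_id=$2
  if [ -n "$server_id" ]; then
    echo "./bin/resin.sh -server $server_id $action"
  else
    echo "./bin/resin.sh $action"
  fi
}
```

For example, resin_command stop a prints the named-server form shown earlier, while resin_command stop prints the short form that works with the default resin.xml.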
The other 2 commands initiate a somewhat less graceful shutdown, similar to an operating system kill as discussed above.
-shutdown
The -shutdown command stops the Watchdog process, which in turn stops all Resin servers. Shutdown does NOT require a -server parameter. This can be slightly confusing to new users, since in the default configuration with only one server, -stop and -shutdown appear to produce the same results. The difference, however, is apparent in a clustered configuration with multiple Resin servers running from the same configuration: -shutdown will result in ALL servers going down, whereas -stop will bring down only the intended Resin server. So -shutdown is the equivalent of killing the Watchdog process. -shutdown will result in a clean shutdown of Resin.
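To make the difference in scope concrete, here is a toy model of which servers go down for each command. It is an illustration of the behavior described in this post, not Resin code, and the function name servers_going_down and the server ids are made up.

```shell
# Toy model: print the servers that go down for a given command.
# Usage: servers_going_down ACTION TARGET RUNNING_SERVER...
servers_going_down() {
  action=$1
  target=$2
  shift 2
  case "$action" in
    shutdown)
      # The Watchdog stops, so every server it manages stops too.
      echo "$*"
      ;;
    stop|kill)
      # Only the server named by -server is affected.
      for s in "$@"; do
        if [ "$s" = "$target" ]; then
          echo "$s"
        fi
      done
      ;;
  esac
}
```

With three running servers a, b and c, servers_going_down stop b a b c prints only b, while servers_going_down shutdown "" a b c prints all three.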
-kill
The -kill command is similar to -stop, in that it works on a single Resin server, and requires the -server parameter in a clustered configuration but not in the default configuration. The Watchdog knows the pid of each Resin server and destroys the process at the operating system level. However, unlike calling kill on the Resin pid, the Watchdog will NOT restart the server. So -kill is the proper way to issue a kill command to a Resin server without it being automatically restarted. -kill is a messy way to shut down, since cleanup tasks will not have a chance to run. It is intended only for unusual cases where Resin is not responding properly to -stop.
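Putting the advice together, a script can try -stop first and fall back to -kill only if the server is still running after a grace period. The following is a sketch under several assumptions: RESIN_SH, the 30 second default timeout, and the is_running check are all choices made for this example, and is_running does not distinguish between servers, so it is only suitable for a machine with a single Resin server.

```shell
# Path to the Resin launcher; override by setting RESIN_SH (assumed default).
RESIN_SH=${RESIN_SH:-./bin/resin.sh}

# Hypothetical check: is any Resin server process still present?
# Looks for the server main class named earlier in this post.
is_running() {
  ps -ef | grep "com.caucho.server.resin.Resin" | grep -v grep >/dev/null
}

# Try a graceful stop, wait up to TIMEOUT seconds, then kill if still running.
# Usage: stop_or_kill SERVER_ID [TIMEOUT_SECONDS]
stop_or_kill() {
  server_id=$1
  timeout=${2:-30}
  "$RESIN_SH" -server "$server_id" stop
  waited=0
  while is_running && [ "$waited" -lt "$timeout" ]; do
    sleep 1
    waited=$((waited + 1))
  done
  if is_running; then
    # Last resort: cleanup tasks will not get a chance to run.
    "$RESIN_SH" -server "$server_id" kill
  fi
}
```

Calling stop_or_kill a 60 would give server “a” a minute to stop cleanly before resorting to kill.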
