In this post I’ll discuss zero-downtime deployments using unicorn and supervisord. There’s a lot more to zero-downtime deployments than just keeping your website available. Listen to Ruby Rogues Ep. 71 or search Google for a broader discussion of the problems involved.
When running a web application in production you should strive for 100% reachability. Downtime is normally perceived as an error in your application, and rightfully so. If you deploy often, your users might stop using your app because of the 502 errors they encounter.
Since I like to use supervisord in my production setup, the most widely used unicorn setup for zero-downtime deployments does not work out of the box.
Supervisord requires the unicorn process not to daemonize. In addition, a restart via SIGUSR2 replaces the old unicorn master with a new one, so the old master exits.
Since supervisord watches the old master, it will consider the application to have exited, even though it is still running under a new process id.
Finally, supervisord will try to restart the application and fail, because all sockets are in use by the new unicorn master.
Luckily, there’s a utility called unicornherder. Unicornherder does not daemonize itself and keeps an eye on the unicorn pid file to check whether unicorn is still alive. All signals sent to unicornherder are forwarded to the unicorn process. If unicorn quits, unicornherder quits too.
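To make the pid file watching a bit more concrete, here is a minimal Ruby sketch of the kind of liveness check described above. It is not unicornherder's actual code (unicornherder is a Python tool); the helper name `pid_alive?` is made up for illustration.

```ruby
# Hypothetical helper (not part of unicornherder): returns true if the pid
# stored in pid_file belongs to a running process.
def pid_alive?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  # Signal 0 only checks that the process exists; nothing is delivered.
  Process.kill(0, pid)
  true
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  # Missing pid file, no such process, or the file did not contain a number.
  false
rescue Errno::EPERM
  # The process exists but belongs to another user.
  true
end
```

A supervisor-friendly wrapper could call something like this in a loop and exit as soon as it returns false, which is roughly the behaviour supervisord needs to see.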
So, in order to use SIGUSR2 and preload_app for zero-downtime deployments, we need to install unicornherder.
# assuming you are running Ubuntu:
$ sudo apt-get install python-dev
$ pip install unicornherder
$ which unicornherder # => /usr/local/bin/unicornherder
Unicornherder itself does not require an additional configuration file. All required arguments are passed on the command line.
Next we need to configure supervisord:
Supervisord
Supervisord watches unicornherder, and unicornherder starts unicorn as a daemon. So all we need to do is to properly start unicornherder and make sure it keeps running.
Here’s a sample supervisord configuration file I generated using foreman export:
[program:myapp-unicornherder-1]
command=/home/webapp/.rvm/bin/app_bundle exec unicornherder -u unicorn -p tmp/pids/unicorn.pid -- -c config/unicorn.rb
autostart=true
autorestart=true
stopsignal=QUIT
stdout_logfile=/home/webapp/shared/log/unicornherder-1.log
stderr_logfile=/home/webapp/shared/log/unicornherder-1.error.log
user=webapp
directory=/home/webapp/current
environment=RAILS_ENV="production",APP_PATH="/home/webapp/current",SHARED_PATH="/home/webapp/shared",TEMP_PATH="/home/webapp/shared/tmp",PORT="8619"
[group:myapp]
programs=myapp-unicornherder-1
The details:
- unicornherder is passed the path to the unicorn pidfile using the -p flag
- supervisord will send the QUIT signal to unicornherder if we want to stop unicorn
- unicorn is executed in an RVM-managed environment, and I’m using an RVM wrapper to load the correct ruby version and gemset
- basic unicorn configuration settings are exported into the environment
Unicorn
The unicorn configuration follows:
worker_processes ((ENV['RAILS_ENV'] == 'development') ? 2 : 8)
working_directory ENV["APP_PATH"]
listen ENV["PORT"].to_i, :tcp_nopush => true
timeout 30
pid (ENV["TEMP_PATH"] + "/pids/unicorn.pid")
stderr_path ENV["SHARED_PATH"] + "/log/unicorn.stderr.log"
stdout_path ENV["SHARED_PATH"] + "/log/unicorn.stdout.log"
preload_app true
before_fork do |server, worker|
  # the master does not need a database connection; workers reconnect below
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
  end
  # after USR2, the old master's pid file is renamed to *.oldbin
  old_pid = ENV["TEMP_PATH"] + '/pids/unicorn.pid.oldbin'
  # File.exist? replaces File.exists?, which was removed in Ruby 3.2
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end
after_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end
end
The important point here is that we close any connections to external resources in the master, as it has no use for them; each worker opens its own connection after forking. Also note that we kill the old master as soon as the preloading is done.
Mina
If we deploy using Mina, we can use the following configuration to perform a zero-downtime deploy:
desc "Deploys the current version to the server."
task :deploy => :environment do
  deploy do
    # omitted
    to :launch do
      queue %[kill -s USR2 $(sudo supervisorctl status | grep unicornherder | cut -d' ' -f7 | cut -d',' -f1)]
    end
  end
end
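The `cut -d' ' -f7` in the launch step depends on the exact number of spaces in the `supervisorctl status` output, so it can break when the program name changes length. As a sketch of a less brittle way to find the pid, the following Ruby helper parses the status text with a regular expression. The helper name `herder_pid` and the sample output (including the `myapp-worker-1` line) are assumptions for illustration; check them against what your supervisord version actually prints.

```ruby
# Hypothetical helper: extract the pid of the first RUNNING program whose
# name contains `name` from the text printed by `supervisorctl status`.
def herder_pid(status_output, name = "unicornherder")
  status_output.each_line do |line|
    next unless line.include?(name) && line.include?("RUNNING")
    match = line.match(/\bpid (\d+),/)
    return Integer(match[1]) if match
  end
  nil # no matching running program found
end

# Assumed sample output; real output may differ in spacing and columns.
sample = <<~STATUS
  myapp:myapp-unicornherder-1      RUNNING   pid 4242, uptime 1:02:03
  myapp:myapp-worker-1             STOPPED   Not started
STATUS

puts herder_pid(sample) # prints 4242
```

Returning nil when nothing matches makes it easy to abort the deploy instead of sending USR2 to an empty pid.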
Starting and stopping unicorn is handled by supervisord:
desc "stop the application"
task :down do
  queue "sudo supervisorctl stop myapp:*"
end
desc "start the application"
task :up do
  queue "sudo supervisorctl start myapp:*"
end
Verify we got a zero-downtime deployment
Now it’s time to verify our setup is actually working.
Running ab -c 2 -n 100 http://www.example.com/ while restarting our application should not result in ANY dropped connections. Note that this largely depends on how long your application needs to start up.
We could further amplify the effects by adding fake calls to sleep in our application.rb.
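If ab is not at hand, a small probe script can serve the same purpose. The following is only a sketch: the method name `probe` and the result categories are made up, and unlike the ab run above it sends requests sequentially rather than with a concurrency of 2. The `fetch` parameter is injectable so the tallying logic can be tried without a network.

```ruby
require "net/http"
require "uri"

# Hypothetical probe: issue `count` GET requests against `url` and tally
# successes, 5xx responses, and connection-level failures.
def probe(url, count, fetch: ->(uri) { Net::HTTP.get_response(uri).code.to_i })
  uri = URI(url)
  tally = Hash.new(0)
  count.times do
    begin
      code = fetch.call(uri)
      tally[code < 500 ? :ok : :server_error] += 1
    rescue SystemCallError, SocketError, IOError, Timeout::Error
      # refused or reset connections, DNS failures, timeouts
      tally[:connection_error] += 1
    end
  end
  tally
end

# Example (performs real HTTP requests, so it is left commented out):
# p probe("http://www.example.com/", 100)
```

During a restart, anything other than a tally consisting solely of `:ok` would indicate that requests were dropped or answered with an error such as a 502.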
Anyway, here it goes:
With restarts
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking example.com (be patient).....done
Server Software: nginx/1.2.4
Server Hostname: example.com
Server Port: 80
Document Path: /
Document Length: 22527 bytes
Concurrency Level: 2
Time taken for tests: 10.947 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 2319600 bytes
HTML transferred: 2252700 bytes
Requests per second: 9.13 [#/sec] (mean)
Time per request: 218.949 [ms] (mean)
Time per request: 109.475 [ms] (mean, across all concurrent requests)
Transfer rate: 206.92 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 55 58 1.8 57 69
Processing: 137 160 27.4 145 263
Waiting: 69 82 21.0 74 148
Total: 193 218 27.4 204 320
Percentage of the requests served within a certain time (ms)
50% 204
66% 215
75% 242
80% 249
90% 265
95% 271
98% 274
99% 320
100% 320 (longest request)
Without restarts
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking example.com (be patient).....done
Server Software: nginx/1.2.4
Server Hostname: example.com
Server Port: 80
Document Path: /
Document Length: 22527 bytes
Concurrency Level: 2
Time taken for tests: 10.584 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 2319600 bytes
HTML transferred: 2252700 bytes
Requests per second: 9.45 [#/sec] (mean)
Time per request: 211.686 [ms] (mean)
Time per request: 105.843 [ms] (mean, across all concurrent requests)
Transfer rate: 214.02 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 55 58 1.5 58 65
Processing: 137 153 18.3 145 207
Waiting: 68 76 5.8 75 102
Total: 195 211 18.4 202 265
Percentage of the requests served within a certain time (ms)
50% 202
66% 204
75% 215
80% 219
90% 248
95% 251
98% 252
99% 265
100% 265 (longest request)
No failed requests. It works! And the response times with multiple restarts are only slightly worse. Great!
I hope this blog post helped clarify how to use unicorn and supervisord together for zero-downtime deployments, so that your app server keeps serving requests while you deploy.
Wrapping up:
- unicorn requires unicornherder for zero-downtime deployments if you are using supervisord
- unicorn spawns a second master when sent SIGUSR2, which means that during restarts you’ll be running twice as many workers as you specified
That’s it! Happy hacking!