In my original post about zero downtime deployments I wanted to use puma. At the time of writing, puma did not support zero downtime restarts: while the connection was kept alive, all workers were killed at once, so no requests could be served until the new workers had fully started up.
This changed as of puma v2.0.0.b6. Now you can send SIGUSR1 to the puma master process, and puma will phase out old workers while starting new workers one at a time.
Note that this process takes longer than unicorn’s SIGUSR2 + preload_app restarts, because unicorn spawns all of your new workers at the same time while puma replaces them one after another. With n workers, a phased restart therefore takes roughly n times your app’s launch time to complete.
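To get a feeling for what this means in practice, here is a rough back-of-the-envelope sketch in Ruby. The method name and the 10 second boot time are made up for illustration; measure your own app’s launch time.

```ruby
# Rough estimate only: assumes workers are replaced strictly one after another
# and that every worker takes about the same time to boot.
def phased_restart_seconds(workers, boot_seconds)
  workers * boot_seconds
end

# Hypothetical numbers: 3 workers, 10 seconds boot time each.
puts phased_restart_seconds(3, 10)
```

So with three workers and an app that takes about ten seconds to boot, you should expect the whole restart to take around half a minute.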
During the relaunch you’ll end up with workers running old and new code at the same time. Just make sure you don’t break your old workers by running incompatible migrations immediately :)
If you are using supervisord, foreman and mina, here’s a short description of how I got it working:
Setup
First, you’ll need to be running puma in clustered mode. In this example I’ll spawn one master process and three worker processes:
# Procfile
app: puma -p 8619 --workers 3
The good thing about puma is that we do not need a wrapper like unicornherder to handle changes in PID, since the master always stays around.
Running foreman export supervisord will leave us with something like this:
# /etc/supervisor/conf.d/app.conf
[program:app-1]
command=bundle exec puma -p 8619 --workers 3 --dir /home/app/current
autostart=true
autorestart=true
stdout_logfile=/home/app/shared/log/website-1.log
stderr_logfile=/home/app/shared/log/website-1.error.log
user=app
directory=/home/app/current
environment=RAILS_ENV="production"
[group:app]
programs=app-1
As you can see I just generated a supervisord configuration which directly starts puma in clustered mode.
Note: It’s important that you add the --dir /path/to/current option, since puma won’t pick up changes to your code base otherwise.
Deployment with mina
Assuming this is not our very first deployment, we need to restart the workers using mina’s to :launch directive.
To issue a phased restart we need to do the following:
- find puma’s master pid by:
  - listing all ruby processes via ps -C ruby -F
  - grepping for /puma (only the process spawned by supervisor will contain this string)
  - using awk to get the process id via awk {'print $2'}
- send SIGUSR1 to the puma master process to initiate a rolling restart
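If you’d rather not rely on a shell pipeline, the lookup steps above can be sketched in plain Ruby. This is only an illustration: the method name and the sample ps output are made up (the paths in it are hypothetical), and it assumes the same things the pipeline does, namely that the PID is the second column and that only the master’s command line contains /puma.

```ruby
# Hypothetical `ps -C ruby -F` output, for illustration only.
SAMPLE_PS_OUTPUT = <<~PS
  UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
  app      38012     1  0 50000 40000   0 10:00 ?        00:00:01 ruby /home/app/current/bin/puma -p 8619 --workers 3
  app      38500     1  0 30000 20000   1 10:00 ?        00:00:02 ruby script/some_other_job.rb
PS

# Returns the PID of the first process whose line contains "/puma",
# or nil if there is no such line. The PID is the second column.
def puma_master_pid(ps_output)
  line = ps_output.each_line.find { |l| l.include?('/puma') }
  line && Integer(line.split[1])
end

pid = puma_master_pid(SAMPLE_PS_OUTPUT)
puts pid

# On a real server you would feed it live output and signal the master:
#   pid = puma_master_pid(`ps -C ruby -F`)
#   Process.kill('USR1', pid) if pid
```

The nil check matters: if no puma master is running, you don’t want to send a signal to whatever an empty lookup happens to produce.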
Mina’s deploy task looks like this:
task :deploy => :environment do
  deploy do
    invoke :'git:clone'
    invoke :'deploy:link_shared_paths'
    invoke :'bundle:install'
    invoke :'rails:assets_precompile'
    to :launch do
      queue %[kill -s SIGUSR1 $(ps -C ruby -F | grep '/puma' | awk {'print $2'})]
    end
  end
end
That’s it.
Let’s try out our setup using ab:
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking blog.nicolai86.eu (be patient).....done
Server Software: nginx/1.2.4
Server Hostname: blog.nicolai86.eu
Server Port: 80
Document Path: /
Document Length: 10817 bytes
Concurrency Level: 6
Time taken for tests: 3.430 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 1163452 bytes
HTML transferred: 1096866 bytes
Requests per second: 29.15 [#/sec] (mean)
Time per request: 205.801 [ms] (mean)
Time per request: 34.300 [ms] (mean, across all concurrent requests)
Transfer rate: 331.25 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 56 60 3.1 58 73
Processing: 130 142 9.5 140 173
Waiting: 70 81 9.0 78 111
Total: 189 202 10.4 200 237
Percentage of the requests served within a certain time (ms)
50% 200
66% 203
75% 206
80% 209
90% 218
95% 225
98% 230
99% 237
100% 237 (longest request)
As you can see, it works: ab reports zero failed requests.
Also note the log output when sending SIGUSR1 to puma:
started with pid 38012
[38012] Puma 2.0.0.b6 starting in cluster mode...
[38012] * Process workers: 3
[38012] * Min threads: 0, max threads: 16
[38012] * Environment: development
[38012] * Listening on tcp://0.0.0.0:8619
[38012] Use Ctrl-C to stop
[38012] - Worker 38016 booted, phase: 0
[38012] - Worker 38015 booted, phase: 0
[38012] - Worker 38017 booted, phase: 0
[38012] - Starting phased worker restart, phase: 1
[38012] - Stopping 38015 for phased upgrade...
[38012] - Worker 38119 booted, phase: 1
[38012] - Stopping 38016 for phased upgrade...
[38012] - Worker 38132 booted, phase: 1
[38012] - Stopping 38017 for phased upgrade...
[38012] - Worker 38146 booted, phase: 1
If you see similar output when sending SIGUSR1, your phased restarts using puma are working as expected!
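If you want to check this from a script instead of eyeballing the log, here is a minimal sketch. It assumes the log lines look exactly like the output above; the method name is made up, and other puma versions may well format these lines differently.

```ruby
# Excerpt of the log output shown above.
SAMPLE_LOG = <<~LOG
  [38012] - Worker 38016 booted, phase: 0
  [38012] - Worker 38015 booted, phase: 0
  [38012] - Worker 38017 booted, phase: 0
  [38012] - Starting phased worker restart, phase: 1
  [38012] - Stopping 38015 for phased upgrade...
  [38012] - Worker 38119 booted, phase: 1
  [38012] - Stopping 38016 for phased upgrade...
  [38012] - Worker 38132 booted, phase: 1
  [38012] - Stopping 38017 for phased upgrade...
  [38012] - Worker 38146 booted, phase: 1
LOG

# Counts how many workers reported "booted" for each phase.
def booted_per_phase(log)
  log.scan(/Worker \d+ booted, phase: (\d+)/).each_with_object(Hash.new(0)) do |(phase), counts|
    counts[phase.to_i] += 1
  end
end

counts = booted_per_phase(SAMPLE_LOG)
puts "phase 1 complete" if counts[1] == 3 # 3 = number of workers in this setup
```

Once the newest phase has as many booted workers as you configured, every worker has been replaced.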
Wrapping up:
- puma supports phased restarts in clustered mode since v2.0.0.b6
- using phased restarts we can achieve zero downtime deployments with puma
- phased restarts work by replacing workers one by one, which takes some time to complete if you are running many workers. On the other hand, this consumes less memory than temporarily running twice the number of workers
- if a worker fails to start up, the puma master tries to restart it. Only if the new worker starts up successfully will puma replace the old worker.
That’s it! Happy hacking!