Using Packer, systemd and a deployment script to build a Rails application into an AMI and launch it into production as a Spot Instance, saving up to 90% of the on-demand price.
This post is the next installment in the series on Immutable Servers. In the previous post we looked at the infrastructure needed to launch web applications into production as spot instances. In this post we'll bake a Rails application into an AMI and write a deployment script, so that our build process can produce an AMI and launch it into production.
In the first post in the series we used Packer to build our base server images. We'll take the rails-base image and build a new image on top of it that includes our application and a systemd service to launch the web application automatically on boot.
Packaging with Packer and systemd
I'll assume that you have an existing Rails application that you want to bundle as an immutable server, and that you've been following the previous posts in the series.
Baking the Rails App
I'll also assume the root of your application has a structure that looks a little like this:
.
├── app
├── bin
├── config
├── config.ru
├── db
├── Gemfile
├── Gemfile.lock
├── lib
├── package.json
├── public
├── Rakefile
├── README.md
├── test
└── vendor
In the root of your application we'll create a build folder. In this folder we'll put everything that we need for Packer, along with the script that will deploy our application. All the files we create from here on will live in the build folder.
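For reference, by the end of this post the build folder will end up looking something like this (the manifest.json you'll see later is generated by the build itself rather than checked in):
build
├── build.sh
├── deploy.rb
├── deploy.sh
├── Gemfile
├── packer.json
├── packer-configure.sh
├── packer-init.sh
├── run.sh
├── system.d
│   └── demowebapp.service
└── user-data.sh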
Let's first create the outline of our Packer template (packer.json), which will use the rails-base image we produced previously.
{
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "eu-west-1",
      "instance_type": "t3.small",
      "ssh_username": "ubuntu",
      "source_ami_filter": {
        "filters": {
          "name": "rails-base*"
        },
        "owners": ["self"],
        "most_recent": true
      },
      "ami_name": "demo-web-app {{timestamp}}",
      "associate_public_ip_address": true,
      "tags": {
        "Name": "demo-web-app",
        "Project": "demo-web-app",
        "Commit": "{{user `git_commit`}}"
      }
    }
  ],
  "provisioners": [
    ...
  ],
  "post-processors": [
    {
      "output": "manifest.json",
      "strip_path": true,
      "type": "manifest"
    }
  ]
}
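It can be useful to check that the template parses as you build it up, and it's a habit worth keeping as the template grows. A minimal check, assuming Packer is installed and the placeholder sections and variables have been filled in:
packer validate -var "git_commit=$(git rev-parse HEAD)" packer.json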
The first thing we'll want to do is create a folder for our web app to live in. To do this we'll create packer-init.sh:
#!/bin/bash
set -e
mkdir -p /srv/demowebapp
chown ubuntu: /srv/demowebapp
We can now update the provisioners in our packer.json template to run packer-init.sh and copy our entire application into this folder:
{
  "builders": [
    ...
  ],
  "provisioners": [
    {
      "type": "shell",
      "execute_command": "echo 'vagrant' | {{.Vars}} sudo -S -E bash '{{.Path}}'",
      "scripts": [
        "packer-init.sh"
      ]
    },
    {
      "type": "file",
      "source": "../",
      "destination": "/srv/demowebapp"
    },
    ...
  ],
  "post-processors": [
    [
      {
        "output": "manifest.json",
        "strip_path": true,
        "type": "manifest"
      }
    ]
  ]
}
The file provisioner will copy the entire root of our application into /srv/demowebapp.
Installing dependencies, testing and running with systemd
Copying our code into our server image isn't enough to get it to run:
- We'll need to install our dependencies
- We should run our tests to ensure that the image will do what it's supposed to do when we launch it into production
- We need to set up systemd to run the application on boot
We'll add another script (packer-configure.sh) which will run after we've copied the code into the server image. This script will take care of the dependency installation, test execution and systemd setup.
Create a packer-configure.sh script and add another shell provisioner:
{
  "builders": [
    ...
  ],
  "provisioners": [
    {
      "type": "shell",
      "execute_command": "echo 'vagrant' | {{.Vars}} sudo -S -E bash '{{.Path}}'",
      "scripts": [
        "packer-init.sh"
      ]
    },
    {
      "type": "file",
      "source": "../",
      "destination": "/srv/demowebapp"
    },
    {
      "type": "shell",
      "execute_command": "echo 'vagrant' | {{.Vars}} sudo -S -E bash '{{.Path}}'",
      "scripts": [
        "packer-configure.sh"
      ]
    }
  ],
  "post-processors": [
    [
      {
        "output": "manifest.json",
        "strip_path": true,
        "type": "manifest"
      }
    ]
  ]
}
Installing Dependencies
The first thing we'll do in our packer-configure.sh script is install our dependencies. When Packer copied our code over using the file provisioner, the files in the server image were created with root ownership; we'll change this so that they are owned by our webapp user.
#!/bin/bash
set -e
gem install bundler
sudo chown -R webapp:webapp /srv/demowebapp
(
cd /srv/demowebapp
sudo -u webapp bundle install --path /srv/demowebapp/.bundle
)
Running Tests
Running tests in our server image as we build it will help us reduce the number of potential environmental issues and give us a high degree of confidence that our application will function as we'd expect.
Once the dependencies have been installed, we'll run the tests in a separate shell executed as the webapp user, which keeps any environment variables used for testing isolated from the rest of the build:
#!/bin/bash
set -e
gem install bundler
sudo chown -R webapp:webapp /srv/demowebapp
(
cd /srv/demowebapp
sudo -u webapp bundle install --path /srv/demowebapp/.bundle
)
sudo -u webapp bash <<"EOF"
# Fail the build if any of these commands (including the test run) fail
set -e
cd /srv/demowebapp
export RAILS_ENV=test
bundle exec rails db:environment:set
bundle exec rake db:drop db:create db:migrate
bundle exec rspec --format documentation --format RspecJunitFormatter --out rspec.xml
git rev-parse HEAD > REVISION
EOF
If any of our tests fail during the build process, the failure will fail the entire build, preventing us from shipping broken code.
Systemd
Systemd will be used to run our application as a service and will automatically launch the service whenever the server image is booted. This will require us to create a service manifest and a run script that systemd can invoke.
Let's first create a run script (run.sh) that will be responsible for pre-compiling assets, running migrations and starting Puma. We'll put this script in the build folder.
#!/bin/bash
cd "$(dirname "$0")/.." || exit
./bin/bundle exec rails assets:precompile
./bin/bundle exec rake db:migrate
./bin/bundle exec puma -C config/puma.rb
Within the build folder, create a system.d folder that will contain our service manifest (demowebapp.service):
[Unit]
Description=Demo Web App
Requires=network.target
[Service]
Type=simple
User=webapp
Group=webapp
Environment=RAILS_ENV=production
WorkingDirectory=/srv/demowebapp
ExecStart=/bin/bash -lc '/srv/demowebapp/build/run.sh'
TimeoutSec=30
RestartSec=15s
Restart=always
[Install]
WantedBy=multi-user.target
Note that the ExecStart command uses bash to invoke our run script at the path where it will live in the server image.
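If you want to sanity-check the unit before baking it into an image, the standard systemd tooling applies. A hypothetical session on a test instance, with the unit file and application already in place:
# Start the service and check it stays up
sudo systemctl start demowebapp
systemctl status demowebapp
# Follow the service logs (asset precompilation and migrations appear here)
journalctl -u demowebapp -f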
The final step of the packer-configure.sh script should be to add the demowebapp service to systemd:
#!/bin/bash
set -e
gem install bundler
sudo chown -R webapp:webapp /srv/demowebapp
(
cd /srv/demowebapp
sudo -u webapp bundle install --path /srv/demowebapp/.bundle
)
sudo -u webapp bash <<"EOF"
# Fail the build if any of these commands (including the test run) fail
set -e
cd /srv/demowebapp
export RAILS_ENV=test
bundle exec rails db:environment:set
bundle exec rake db:drop db:create db:migrate
bundle exec rspec --format documentation --format RspecJunitFormatter --out rspec.xml
git rev-parse HEAD > REVISION
EOF
sudo mkdir -p /usr/lib/systemd/system
cp /srv/demowebapp/build/system.d/demowebapp.service /usr/lib/systemd/system/demowebapp.service
cp /srv/demowebapp/build/user-data.sh /etc/rc.local
chmod +x /etc/rc.local
systemctl enable demowebapp.service
You'll notice that in the final step we also install a user-data.sh script as /etc/rc.local. This is an optional step where I'm using /etc/rc.local (which runs on boot) to set the hostname to a standard format for this specific server image:
#!/bin/bash
set -e
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
HOSTNAME="demo-web-app-${INSTANCE_ID}"
# Hostname
echo -n "${HOSTNAME}" > /etc/hostname
hostname -F /etc/hostname
Whenever I launch an instance of this server image, the hostname will match the pattern demo-web-app-*, e.g.:
demo-web-app-i-0804f202925fd084a
Build Process
Now that we have a fully configured Packer template, we can put together a simple build.sh in our build folder that will invoke Packer for us.
There is little value in baking logs, code coverage output or other transient files into the image, so the build script will clean these up before invoking Packer.
When we looked at building our base server images with Packer, we baked the git commit hash into the AMI as a tag. Knowing exactly what code went into any kind of build is always useful, so we'll do this again. This time round we'll do it without using jq, by using Packer variables instead.
You will notice that when we created our Packer template this time, the Commit tag was set to {{user `git_commit`}}, which pulls in the user variable git_commit that can be passed to Packer via command-line arguments:
{
  "builders": [
    {
      ...
      "tags": {
        "Name": "demo-web-app",
        "Project": "demo-web-app",
        "Commit": "{{user `git_commit`}}"
      }
    }
  ],
  ...
}
Our build.sh will look as follows:
#!/bin/bash
set -e
cd "$(dirname "$0")" || exit
rm -f ../REVISION
rm -rf ../coverage/
rm -rf ../log/*.log
rm -f manifest.json
packer build -var "git_commit=$(git rev-parse HEAD)" packer.json
We also use a manifest post-processor in our Packer template, which produces a manifest.json that will look something like this:
{
  "builds": [
    {
      "name": "amazon-ebs",
      "builder_type": "amazon-ebs",
      "build_time": 1553103589,
      "files": null,
      "artifact_id": "eu-west-1:ami-12e0bca147d8846e3",
      "packer_run_uuid": "fa5e8a22-0f35-6c83-7381-a04b15f8917b"
    }
  ],
  "last_run_uuid": "fa5e8a22-0f35-6c83-7381-a04b15f8917b"
}
The manifest.json file will be important when we put together our deploy script, as we'll be able to extract the AMI ID of the image we've just built from the artifact_id attribute.
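If you just want to eyeball the AMI ID from the shell and happen to have jq installed, something like this performs the same extraction that our deploy script will do in Ruby:
jq -r '.builds[0].artifact_id' manifest.json | cut -d':' -f2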
We now have a fully fledged build process that can package our application as an immutable server.
Deployment Script
As I'm writing an example Rails app to demonstrate this concept, I'll stick with Ruby for my deployment script.
When we defined our immutable infrastructure, we created a Target Group on an Application Load Balancer that we can attach our spot instances to. Our script will discover the Target Group and other relevant infrastructure, and deploy our application's AMI using a spot fleet request.
Whenever our script needs to reference our infrastructure (such as target groups and security groups), we'll determine the IDs of the resources from their names. For example, we'll look for the target group named "website" rather than the target group with a specific ARN. This decouples our deployment script from the specific instance of a resource: in the future we may need to go back to our Terraform infrastructure and change a resource attribute that causes Terraform to re-create the resource with a new ID. If we hardcoded IDs everywhere, that change would break our script, so instead we'll derive IDs and ARNs from resource names.
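As an illustration, the same name-based lookup can be done from the AWS CLI; the deploy script below does the equivalent through the Ruby SDK:
aws elbv2 describe-target-groups --names website \
  --query 'TargetGroups[0].TargetGroupArn' --output text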
At a high level, our deployment script will:
- Identify existing spot fleet requests that are being used to host the current in-production version of our application
- Launch a new spot fleet request against the Target Group using the new AMI that we want to deploy
- If the AMI deploys successfully, we'll retire any old versions of the application that were identified in Step 1. Alternatively, if the deploy fails we'll just cancel the new spot fleet request we tried to launch
We'll place our deploy.rb script in our build folder and declare a separate Gemfile specifically for the deploy process.
source 'https://rubygems.org'
git_source(:github) do |repo_name|
  repo_name = "#{repo_name}/#{repo_name}" unless repo_name.include?('/')
  "https://github.com/#{repo_name}.git"
end
gem 'aws-sdk'
We'll invoke the deploy.rb script we're about to write with a lightweight deploy.sh wrapper:
#!/bin/bash
cd "$(dirname "$0")" || exit
bundle install --path ./bundle
bundle exec ruby deploy.rb
Now let's make a start on the deploy.rb script:
require 'aws-sdk'
require 'logger'
require 'json'
require 'net/http'
$stdout.sync = true
logger = Logger.new($stdout)
aws_region = begin
  JSON.parse(Net::HTTP.get(URI("http://169.254.169.254/latest/dynamic/instance-identity/document")))["region"]
rescue Errno::EHOSTUNREACH
  logger.info("No route to host for AWS meta-data (169.254.169.254), assuming running as localhost and defaulting to eu-west-1 region")
  'eu-west-1'
end
In this first snippet, I've done some very basic setup. We declare the dependencies we need, which are largely built-in libraries, with the exception of the AWS SDK. I've set up a logger and used $stdout.sync to force the stdout buffer to flush immediately whenever it receives any new data. This can be really helpful when running the script through a CI/CD tool, as these tools typically require the buffer to flush before they can show you logs.
Using a rescue block, I attempt to determine the region of the EC2 instance from the AWS instance metadata. I've done this in case you end up running your deployment process on a CI server within your AWS account. In a future blog post I plan to cover how you can set up an entire Jenkins environment running on spot instances. As a fallback, if the GET request fails we'll assume we're in the eu-west-1 region, which is useful when you're testing or running the deploy script locally or outside of AWS.
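For reference, the instance identity document is a small JSON payload served by the metadata endpoint; a truncated example with placeholder values:
curl -s http://169.254.169.254/latest/dynamic/instance-identity/document
# {
#   "region" : "eu-west-1",
#   "instanceId" : "i-0123456789abcdef0",
#   ...
# }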
Next, we'll create a load of API clients, which will be important for determining the IDs and ARNs of resources. We will also get the AMI ID from the manifest.json that's created as part of our build process.
ec2 = Aws::EC2::Client.new(region: aws_region)
elbv2 = Aws::ElasticLoadBalancingV2::Client.new(region: aws_region)
iam = Aws::IAM::Client.new(region: aws_region)
packer_manifest = JSON.parse(File.read('manifest.json'))
ami_id = packer_manifest['builds'][0]['artifact_id'].split(':')[1]
logger.info("AMI ID: #{ami_id}")
WEBSITE_TARGET_GROUP_ARN = elbv2.describe_target_groups({names: ['website']}).target_groups[0].target_group_arn
logger.info("Using ELB target group: #{WEBSITE_TARGET_GROUP_ARN}")
iam_fleet_role = iam.get_role({role_name: 'aws-ec2-spot-fleet-tagging-role'}).role.arn
default_sg_id = ec2.describe_security_groups({
  filters: [
    {
      name: "description",
      values: ["default VPC security group"],
    },
  ],
}).security_groups[0].group_id
rails_app_sg_id = ec2.describe_security_groups({
  filters: [
    {
      name: "tag:Name",
      values: ["Rails App"],
    },
  ],
}).security_groups[0].group_id
logger.info("IAM Fleet Role ARN: #{iam_fleet_role}")
logger.info("Default Security Group: #{default_sg_id}")
logger.info("Rails App Security Group: #{rails_app_sg_id}")
Before we start deploying our application we should use the EC2 API to determine the IDs of the spot fleet requests that are running the current in-production version of our application.
existing_website_spot_fleet_request_ids = []
ec2.describe_spot_fleet_requests.each do |resps|
  resps.spot_fleet_request_configs.each do |fleet_request|
    if fleet_request.spot_fleet_request_state == 'active' || fleet_request.spot_fleet_request_state == 'modifying'
      target_groups_config = fleet_request.spot_fleet_request_config.load_balancers_config.target_groups_config
      if target_groups_config.target_groups.all? {|tg| tg.arn == WEBSITE_TARGET_GROUP_ARN}
        existing_website_spot_fleet_request_ids << fleet_request.spot_fleet_request_id
      end
    end
  end
end
logger.info("Existing website fleet requests: #{existing_website_spot_fleet_request_ids}")
We can then create our new spot fleet request that will launch our newly built application AMI.
response = ec2.request_spot_fleet({
  spot_fleet_request_config: {
    allocation_strategy: 'lowestPrice',
    on_demand_allocation_strategy: "lowestPrice",
    excess_capacity_termination_policy: "noTermination",
    fulfilled_capacity: 1.0,
    on_demand_fulfilled_capacity: 1.0,
    iam_fleet_role: iam_fleet_role,
    launch_specifications: [
      {
        security_groups: [
          {group_id: default_sg_id},
          {group_id: rails_app_sg_id}
        ],
        iam_instance_profile: {
          name: "website",
        },
        image_id: ami_id,
        instance_type: "t3.micro",
        key_name: "demo",
        tag_specifications: [
          {
            resource_type: "instance",
            tags: [
              {key: "Name", value: "demo-web-app"},
              {key: "Project", value: "demo-web-app"},
            ],
          }
        ],
      },
    ],
    target_capacity: 2,
    type: 'maintain',
    valid_from: Time.now,
    replace_unhealthy_instances: false,
    instance_interruption_behavior: 'terminate',
    load_balancers_config: {
      target_groups_config: {
        target_groups: [
          {arn: WEBSITE_TARGET_GROUP_ARN},
        ],
      },
    },
  },
})
logger.info("Launching spot instance request: '#{response.spot_fleet_request_id}'")
We will then want to wait for the spot fleet request to be provisioned, and for the instances to become available.
spot_provisioned = false
begin
  ec2.describe_spot_fleet_requests({spot_fleet_request_ids: [response.spot_fleet_request_id]}).each do |resps|
    resps.spot_fleet_request_configs.each do |fleet_request|
      if fleet_request.activity_status == 'fulfilled'
        spot_provisioned = true
      end
      if fleet_request.activity_status == 'error'
        logger.error("Provisioning spot instance request '#{response.spot_fleet_request_id}' has failed!")
        exit 1
      end
      logger.info("Spot instance request '#{response.spot_fleet_request_id}' has activity status: '#{fleet_request.activity_status}'")
      sleep 10
    end
  end
end until spot_provisioned
logger.info("Launched spot instance request: '#{response.spot_fleet_request_id}' !")
sleep 10
When the spot fleet request launches our instances, they will initially have a state of initial in the target group; once the load balancer's health checks have called out to the new instances, they'll transition to healthy or unhealthy.
The next stage of our script will wait for all instances to become healthy, or abort if any instance moves to an unhealthy state. Because we're using a spot fleet request, cleaning up instances is really easy: we just cancel the spot fleet request and AWS takes care of terminating the instances and removing them from the target group.
target_group_resp = elbv2.describe_target_health({target_group_arn: WEBSITE_TARGET_GROUP_ARN})
until target_group_resp.target_health_descriptions.all? {|thd| thd.target_health.state == 'healthy'}
  total_instances = target_group_resp.target_health_descriptions.size
  healthy_instances = target_group_resp.target_health_descriptions.count {|thd| thd.target_health.state == 'healthy'}
  unhealthy_instances = target_group_resp.target_health_descriptions.count {|thd| thd.target_health.state == 'unhealthy'}
  logger.info("#{total_instances} total instances in target group. #{healthy_instances} healthy instances...")
  if unhealthy_instances > 0
    logger.error("#{unhealthy_instances} unhealthy instances! aborting...")
    ec2.cancel_spot_fleet_requests(spot_fleet_request_ids: [response.spot_fleet_request_id], terminate_instances: true)
    logger.error("Cancelled new fleet request (id: #{response.spot_fleet_request_id})")
    # Abort the deploy here - cancelling the fleet request terminates the new instances,
    # and we don't want to fall through and retire the old, working fleet requests below.
    exit 1
  end
  if total_instances != healthy_instances
    sleep 10
    target_group_resp = elbv2.describe_target_health({target_group_arn: WEBSITE_TARGET_GROUP_ARN})
    total_instances = target_group_resp.target_health_descriptions.size
    healthy_instances = target_group_resp.target_health_descriptions.count {|thd| thd.target_health.state == 'healthy'}
    unhealthy_instances = target_group_resp.target_health_descriptions.count {|thd| thd.target_health.state == 'unhealthy'}
    logger.info("#{total_instances} total instances in target group. #{healthy_instances} healthy instances...")
  end
end
sleep 10
If our new AMI instances launched into the load balancer successfully with a healthy state, we can now cancel any spot fleet requests that were serving up older versions of our application:
unless existing_website_spot_fleet_request_ids.empty?
  logger.info("Cancelling old spot instances: #{existing_website_spot_fleet_request_ids}")
  ec2.cancel_spot_fleet_requests(spot_fleet_request_ids: existing_website_spot_fleet_request_ids, terminate_instances: true)
end
logger.info("Deployed!")
And that's it! We have a deploy script that will replace existing in-production instances with our new AMI.
Producing a new build of our application is as simple as:
./build/build.sh
We can then release it with:
./build/deploy.sh
The deployment and build scripts used here are fairly basic. You could certainly improve the process to support more complex development practices, such as deploying branched builds to development and test environments. In this post, as with my other posts in the immutable series, I've focused on the simplest example to prove and explain the concept.
Conclusion
Over the course of my series on Immutable Servers we've looked at:
- how an existing configuration management setup (e.g. SaltStack) can be incorporated into machine images built by Packer
- how infrastructure can be structured to allow applications to be launched through immutable and ephemeral spot instances
- how Packer can be used to bake an application into a base image
EC2 Spot Instances can provide a cost saving of up to 90% off the on-demand price. Transitioning to an immutable model has its challenges, but with some well-placed tooling and architecture design you can unlock additional benefits and avoid common issues such as configuration drift.