April 20, 2024
September 19, 2018

Creating an AMI bakery to make time for actual baking

Time is more valuable than money, but cake trumps both. With automated (pre)baking of your AMIs you can have your cake — and eat it too.

Karol Junde

I don’t remember exactly when, but I once read that one way to reduce lead time is to eliminate ‘waste’, and ‘waiting time’ is waste. Humans are a good example: a wave of frustration hits us whenever we’re forced to wait for something. We might as well have a banner written on our foreheads: “don’t come any closer”. Moreover, while working with customers we noticed that the most common deployment strategy was to provision new nodes from top to bottom as they were being launched. As you’re probably aware, running system updates, downloading packages and setting up configurations can take quite a long time. Everything looks fine until something goes wrong while provisioning package updates or a new release of your app, which in rolling-update scenarios (canary and blue/green are a different story) might prolong the whole process dramatically. In Chaos Team we hire believers in automation and the ‘everything fails’ principle, so we pay close attention to simplicity and remove ourselves from every repeatable task.

With all this in mind, we considered a different approach for one of our latest customers.

Here are some thoughts…

Immutable = do it once and do it right

According to the 12-Factor App methodology, immutable components are replaced for every deployment rather than being updated in-place. Following this method, you treat your instances as deployable, ephemeral artifacts. They should be standalone and self-executable among environments.

NOTE: Do not tag images per environment. In accordance with the 10th factor of the 12-Factor App, ‘Dev/Prod parity’, keep images consistent across your dev, qa, staging, prod or any other cloud environment. Believe me, you’re going to save a lot of time.

We all agreed that the application instances should have no dependency on external services which could be package repositories or configuration services. Literally, the AMI must be discrete and hermetic.

What kind of cake do you want?

Honestly, there is no “best” answer to the question regarding AMI baking but you can check which way works best for you:

  1. Would you like to bake the software, configuration and your code into the AMI (Netflix-style)?
  2. Would you like to bake only the software and configuration and then download the app code during the instance launch time?
  3. Would you like to use a clean OS AMI and then do everything on boot using Ansible, Chef, Salt or the simple ‘user-data’ feature in AWS?

We’ve chosen method number one. The reason was trivial: do it once and do it right. We were obligated to reduce the time of each release to the minimum (we haven’t reached the final goal yet; containers are needed to get under 5 minutes). Our AMI goes through a candidate test and release process every two weeks. We’ve divided the whole mechanism into several steps:

  1. Foundation AMI: the AMI prepared by AWS.
  2. Base AMI: the one we’re talking about in this article, with all necessary OS packages and tools.
  3. Base App AMI: an image with all packages required for a proper app launch. It can be Tomcat, Apache, Python or anything else.
  4. Last step: mount the volume with the app files and update them for the new release.

Packer, Aminator or maybe… something else?

Having decided how to ease our pain, we came across the next obstacle: which tool was the simplest to maintain (remember, I wrote that we don’t necessarily want to participate in each process; it’s called the ‘rule of autonomy’) and still gave us the opportunity to automate AMI baking.

We decided to investigate three “bakers”:

Aminator

Written in Python and designed by the Netflix team after a long period of using the Bakery (the predecessor of the current Aminator). The Bakery had some drawbacks: it had been customized for a CentOS base AMI and did not allow experimenting with other Linux distributions out of the box. This is why the Netflix team rewrote the Bakery into Aminator. It supports Debian, Red Hat and many other Linux distributions and, more importantly, because it is structured around a plugin architecture, it can be extended for other cloud providers, operating systems or packaging formats.

So, it works like this:

  1. Create a volume from the snapshot of the base AMI.
  2. Attach and mount the volume.
  3. Chroot into the mounted volume.
  4. Provision application onto mounted volume using rpm or deb package.
  5. Unmount the volume and create a snapshot.
  6. Register the snapshot as an AMI.

Packer

Allows you to use a framework such as Chef, Ansible or Puppet to install and configure the software within your Packer-made images. An important aspect of Packer is that it creates identical images for multiple platforms, so you can run production in AWS and staging/QA elsewhere, even in a private cloud. After a machine image is built, it can be quickly launched and smoke tested to verify that things are working.

AWS Systems Manager

AWS Systems Manager is quite a different tool. It combines several features, such as built-in safety controls that let you roll out changes incrementally and automatically halt the roll-out when errors occur. It also presents a detailed view of system configurations, operating system patch levels, software installations and application configurations. In terms of security, AWS SSM helps maintain security and compliance by scanning your instances against your patch, configuration and custom policies. Last but not least, it supports hybrid environments containing Windows and Linux instances. Of course, we’re talking about AMI baking, but having extra features in one place, especially for patching and maintenance, sounds encouraging to me, to say the least. We have experience with Packer and Aminator as well, but the winner was AWS Systems Manager.

The key benefits, which helped us select the most suitable solution

So, how does this AWS SSM bake this cake?

In general, SSM uses an agent, installed on each instance you want to maintain or monitor, and an IAM role that allows the EC2 instances to be managed. The managed policy for this role is called ‘AmazonEC2RoleforSSM’. Apart from that, we also leveraged the Maintenance Windows feature, which is really cool because it lets you specify a recurring time window during which Run Command and other tasks are executed. SSM wasn’t the only AWS service we used to complete the baking; we also added two Lambda functions:

First one: looks for the newest available Foundation AMI (an AMI provided by AWS) based on the given input parameters.

Second one: deletes old images in accordance with the retention policy (CleanUp-Ami-Images).
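
The core logic of these two functions can be sketched in a few lines of Python. This is only an illustration, not the code we run: it works on plain dictionaries shaped like the entries returned by the EC2 DescribeImages call (ImageId, Name, OwnerId, CreationDate), and the function names, sample image IDs, account numbers and the 30-day retention value are all made up for the example. The actual AWS calls (listing and deregistering images) are left out on purpose.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch


def parse_creation_date(image):
    # EC2 reports CreationDate as a UTC string such as "2018-08-11T02:30:11.000Z".
    parsed = datetime.strptime(image["CreationDate"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def newest_foundation_ami(images, name_pattern, owner_id):
    """Return the most recently created image matching the name pattern and owner, or None."""
    candidates = [
        image for image in images
        if image["OwnerId"] == owner_id and fnmatch(image["Name"], name_pattern)
    ]
    return max(candidates, key=parse_creation_date, default=None)


def images_to_delete(images, retention_days, now):
    """Return the images older than the retention period, oldest first."""
    cutoff = now - timedelta(days=retention_days)
    expired = [image for image in images if parse_creation_date(image) < cutoff]
    return sorted(expired, key=parse_creation_date)


# Hypothetical sample data; IDs, names and owners are placeholders.
SAMPLE_IMAGES = [
    {"ImageId": "ami-0aaa", "Name": "amzn-ami-hvm-2018.03.0.20180622-x86_64-gp2",
     "OwnerId": "111111111111", "CreationDate": "2018-06-22T10:00:00.000Z"},
    {"ImageId": "ami-0bbb", "Name": "amzn-ami-hvm-2018.03.0.20180811-x86_64-gp2",
     "OwnerId": "111111111111", "CreationDate": "2018-08-11T10:00:00.000Z"},
    {"ImageId": "ami-0ccc", "Name": "base-ami-2018-07-01",
     "OwnerId": "222222222222", "CreationDate": "2018-07-01T10:00:00.000Z"},
    {"ImageId": "ami-0ddd", "Name": "base-ami-2018-09-15",
     "OwnerId": "222222222222", "CreationDate": "2018-09-15T10:00:00.000Z"},
]

if __name__ == "__main__":
    foundation = newest_foundation_ami(SAMPLE_IMAGES, "amzn-ami-hvm-*", "111111111111")
    print("Foundation AMI:", foundation["ImageId"])

    # Only our own baked images should ever be considered for clean-up.
    own_images = [image for image in SAMPLE_IMAGES if image["OwnerId"] == "222222222222"]
    now = datetime(2018, 9, 19, tzinfo=timezone.utc)
    for image in images_to_delete(own_images, retention_days=30, now=now):
        print("Would delete:", image["ImageId"])
```

Keeping the selection and retention logic as pure functions makes it easy to test without touching an AWS account; the Lambda handler then only has to fetch the image list and act on the result.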

Then, to make the magic real, we decided to run the whole process continuously, and this is where SSM lent a helping hand with its Maintenance Windows feature, which according to the documentation “lets you define a schedule for when to perform potentially disruptive actions on your instances such as patching an operating system, updating drivers, installing software or patches”. Below you can see screenshots of our maintenance window configuration, along with the linked sample document that we used for Amazon Linux AMI baking. We’re using a quite similar one for Ubuntu.

Maintenance Window — configuration part 1
Maintenance Window — configuration part 2

11 steps of baking — AWS SSM maintenance window

As you’ve probably noticed in the picture above, we’ve set 11 steps to prepare a single base AMI image. Let me briefly describe what each step does:

  1. Invoke the Lambda function that looks for the newest Foundation AMI in the selected region, based on parameters such as AMI_Name and Owner.
  2. Launch a temporary EC2 instance from the AMI selected in step 1.
  3. Verify that the SSM agent on the EC2 instance is installed correctly.
  4. Update all mandatory system packages.
  5. Install additional packages such as aws-cli or ansible.
  6. Download the previously prepared Ansible playbooks from an S3 bucket and run them.
  7. Create an AMI from the EC2 instance launched in step 2.
  8. Add tags to make the image easily identifiable.
  9. Terminate the instance from step 2.
  10. Delete the instance from AWS.
  11. Invoke another Lambda function that deletes all old images (older than the value set in the parameter).
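
To give a feeling for how such a pipeline might look as an SSM Automation document, here is a rough skeleton built in Python. It is a sketch under assumptions, not our actual document: the step names are invented, the mapping of each step to an Automation action is a guess (in particular, steps 3 and 9 to 10 are assumed to use `aws:runCommand` and `aws:changeInstanceState`), and the per-step `inputs` that a real document requires are left out entirely.

```python
import json

# (hypothetical step name, SSM Automation action) for each of the 11 steps.
BAKING_STEPS = [
    ("findFoundationAmi", "aws:invokeLambdaFunction"),
    ("launchInstance", "aws:runInstances"),
    ("verifySsmAgent", "aws:runCommand"),
    ("updateSystemPackages", "aws:runCommand"),
    ("installExtraPackages", "aws:runCommand"),
    ("runAnsiblePlaybooks", "aws:runCommand"),
    ("createImage", "aws:createImage"),
    ("tagImage", "aws:createTags"),
    ("terminateInstance", "aws:changeInstanceState"),
    ("deleteInstance", "aws:changeInstanceState"),
    ("cleanUpOldImages", "aws:invokeLambdaFunction"),
]


def build_document_skeleton(steps):
    """Build the outline of an Automation document; real steps also need 'inputs'."""
    return {
        "schemaVersion": "0.3",
        "description": "Skeleton of an AMI baking pipeline (inputs omitted)",
        "mainSteps": [{"name": name, "action": action} for name, action in steps],
    }


if __name__ == "__main__":
    print(json.dumps(build_document_skeleton(BAKING_STEPS), indent=2))
```

The skeleton will not run in SSM as it stands; it only shows how the steps line up one after another, with the two Lambda invocations framing the work done on the temporary instance.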

With these 11 easy steps we get an up-to-date image with all the required packages, updates and fixes for security vulnerabilities.

The finale

You can provision all packages manually, via Ansible/Chef, or with anything else, even AWS SSM. It generally doesn’t matter. We’ve chosen to put repeatable tasks into one place and invoke the pipeline on a pre-set schedule.

At the end of the day you’ll find yourself at a point where you should ask yourself: “Do I want to be a cog in the machine, or automate my work to have more time for other cool things?” It’s up to you, but remember: no matter which way you choose, “do not reinvent the wheel, just adjust it to your requirements”. Our team has gained confidence and reduced the risk of mistakes during manual package provisioning (previously done via Ansible). Such mistakes can happen many times, especially when you’re in a hurry and doing many things at once. And last but not least, we are all aware of the saying that “time is money”, and we managed to save both. We believe that reducing the time wasted on repeatable tasks lets us accelerate our activities in other areas, and therefore earn money. Keep in mind that in the contemporary world time is the most valuable currency.

It’s high time to Tame your Chaos!

Technologies

SSM Agent
AWS Lambda
Amazon S3
AWS IAM
AWS CloudFormation
