Terraform in Anger Part 1: AWS S3 Access

David Layton

David / 5 May 2020

Terraform allows you to set up infrastructure across a wide variety of vendors and platforms. I love it because it's declarative, meaning you just say what you want and Terraform figures out how to get there. There are many good introductions to Terraform out there, but they lack that real-world-project feel. What's it like to use Terraform in anger? I want to explore that with a real project example built from the ground up.

To that end, one need I frequently come across in my freelance data science work is secure data transfer. Often clients need to send sensitive, possibly medical or top-secret, data. In this series, we'll deliver a really slick experience for three different types of users on AWS. The first is secure upload to S3 via the AWS CLI. Later installments will look at giving the client's Business and IT users access via AWS Console (browser) and SFTP, respectively.

Defining The Problem

Clearly we can't just create a public bucket because this data is sensitive. Equally, we can't give the client access to our entire platform. Though challenging, locking the client down to a secure bucket without access to anything else is the only feasible option for a real project. And we'll need to do this while delivering a pleasant user experience.

Although we could create our own browser portal for data transfer, it's a lot of work. It would be fine for me to invest that time for my own business, but I'd think it negligent to charge a client for something so bespoke unless it were core to their business. I am about delivering value for money, not building cathedrals.


The Plan


  1. First, we'll create a bucket that doesn't expose any client information
  2. Then we'll create a user for the client that has access to AWS via an access key
  3. After that, we need to define a policy that limits a user to only accessing S3 and only that bucket
  4. And finally, we'll associate that policy to the user

Now, this is still a toy example because, on a real project, you would probably need to support several clients---each with many users. We'll address that in part II of this tutorial, but keep it simple for now. Later we can refactor using some more advanced features of Terraform to enable multiple users. This may seem forced, but I always recommend programming iteratively: starting with something simple that works and building out the functionality through refactoring.

Installing Terraform

Terraform is written in Go and ships as a single binary, so it installs easily. Simply fetch it from downloads (for your system), unzip it, and move it to a directory included in your system's PATH. Finally, check the version you're using. Here's mine for reference.

dataunbound$ terraform --version
Terraform v0.12.24

From here on, I'm assuming you have an AWS account, an access key, and awscli working on your machine. So if you haven't done that, you should do it now.

Terraform HELLO WORLD

To kick things off, we'll define AWS as a provider and nothing more.
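
A minimal sketch of such a provider block, assuming the configuration lives in a single .tf file in the working directory and that credentials come from your existing awscli setup:

```hcl
# Declare AWS as the provider and set the default region.
# Credentials are assumed to be picked up from the awscli configuration.
provider "aws" {
  region = "eu-west-2"
}
```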


This block states that we'll be using AWS and want eu-west-2 to be our default region; provider is a keyword. Nothing could be simpler.

Now let's "apply". What could go wrong?!

dataunbound$ terraform apply

Error: Could not satisfy plugin requirements

Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate resources. The configuration provided requires plugins which can't be located, don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your configuration, including providers used in child modules. To see the requirements and constraints from each module, run "terraform providers".

Error: provider.aws: no suitable version installed
  version requirements: "(any version)"
  versions installed: none

Fantastic! An asinine message that contains its own solution---i.e. initialise the project with terraform init first.

Luckily, I use thefuck and so should you. It corrects simple mistakes from the previous console command---unnecessary, but fun. Have a look:

dataunbound$ fuck
terraform init && terraform apply [enter/↑/↓/ctrl+c]

Initializing the backend...

Initializing provider plugins...

  • Checking for available provider plugins...
  • Downloading plugin for provider "aws" (hashicorp/aws) 2.60.0...

The following providers do not have any version constraints in configuration, so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking changes, it is recommended to add version = "..." constraints to the corresponding provider blocks in configuration, with the constraint strings suggested below.

* provider.aws: version = "~> 2.60"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure. All Terraform commands should now work.

If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

So what happened here? Well, first thefuck suggested and ran terraform init, which discovered we're using AWS as a provider and downloaded the matching provider plugin into our working directory. Then terraform apply looked for what needs to be done and found nothing, hence:

Resources: 0 added, 0 changed, 0 destroyed

That's because we haven't declared any resources. Let's change that by declaring a bucket for our client's data.

Creating a Secure Bucket in S3
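
A minimal sketch of a private bucket declaration, using the resource label and bucket name that appear in the plan output in this section. S3 bucket names are globally unique, so you would need to pick your own:

```hcl
# A private bucket for the client's data.
# The bucket name is copied from the plan output; choose a unique one yourself.
resource "aws_s3_bucket" "test_client_bucket" {
  bucket = "test-client-bucket-x130099"
  acl    = "private"
}
```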

This is our first resource, [aws_s3_bucket](https://www.terraform.io/docs/providers/aws/d/s3_bucket.html), and we named it test_client_bucket. When we apply, Terraform creates a private S3 bucket named "test-client-bucket-x130099".

dataunbound$ terraform apply

An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols:

  + create

Terraform will perform the following actions:

  # aws_s3_bucket.test_client_bucket will be created
  + resource "aws_s3_bucket" "test_client_bucket" {
      + acceleration_status         = (known after apply)
      + acl                         = "private"
      + arn                         = (known after apply)
      + bucket                      = "test-client-bucket-x130099"
      + bucket_domain_name          = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = false
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + region                      = "eu-west-2"
      + request_payer               = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)

      + versioning {
          + enabled    = (known after apply)
          + mfa_delete = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

This first bit above is the plan. Notice that there are several attributes marked as (known after apply). We'll discuss this in a minute.

But first, we should ensure everything in this bucket is encrypted server-side. We'll use AES256 like so:
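
A sketch of how the bucket block might look with default encryption switched on. It assumes the 2.x AWS provider installed earlier, where encryption is configured inline on the bucket; newer provider versions may expect a separate resource instead.

```hcl
# The same bucket, now with server-side encryption applied by default.
# Assumes AWS provider 2.x, which accepts this block inline on the bucket.
resource "aws_s3_bucket" "test_client_bucket" {
  bucket = "test-client-bucket-x130099"
  acl    = "private"

  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
    }
  }
}
```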

If you apply now, you'll see that Terraform only changes the existing bucket rather than destroying and recreating it.

Plan: 0 to add, 1 to change, 0 to destroy.

But this won't always be the case. When unsure, use terraform plan to see a dry run.

As an admin, you can explore this new bucket. When you're done, we'll create a user whose sole ability is to view this bucket and administer its contents.

Creating a Restricted S3 bucket User

We need to create a user and, moreover, restrict their knowledge and control to this bucket. There are many ways to do this, but for now we'll use an [aws_iam_user](https://www.terraform.io/docs/providers/aws/d/iam_user.html) and an [aws_iam_user_policy](https://www.terraform.io/docs/providers/aws/d/iam_policy.html) with a separate [aws_iam_policy_document](https://www.terraform.io/docs/providers/aws/d/iam_policy_document.html).

The User

Nothing complex here---just a name, Alice.
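
A minimal sketch of the user, with the resource label and the lowercase name taken from the plan output further down:

```hcl
# The client's IAM user. On its own, this user has no permissions at all.
resource "aws_iam_user" "test_client" {
  name = "alice"
}
```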

But to access AWS via the cli, the user will need an access key.
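
A sketch of the access key, assuming the user resource is labelled test_client as in the plan output below:

```hcl
# An access key for the user. The user argument refers to the name
# attribute of the aws_iam_user resource rather than repeating "alice".
resource "aws_iam_access_key" "test_client" {
  user = aws_iam_user.test_client.name
}
```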

Now we finally encounter something interesting. By writing aws_iam_user.test_client.name, we are asking for the value of the name attribute on whatever gets created by the resource "aws_iam_user" "test_client" block. It's worth noting that the Terraform documentation (which I've been linking to as we go along) lists the attributes for each resource.

dataunbound$ terraform apply
aws_s3_bucket.test_client_bucket: Refreshing state... [id=test-client-bucket-x130099]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_iam_access_key.test_client will be created
  + resource "aws_iam_access_key" "test_client" {
      + encrypted_secret     = (known after apply)
      + id                   = (known after apply)
      + key_fingerprint      = (known after apply)
      + secret               = (sensitive value)
      + ses_smtp_password    = (sensitive value)
      + ses_smtp_password_v4 = (sensitive value)
      + status               = (known after apply)
      + user                 = "alice"
    }

  # aws_iam_user.test_client will be created
  + resource "aws_iam_user" "test_client" {
      + arn           = (known after apply)
      + force_destroy = false
      + id            = (known after apply)
      + name          = "alice"
      + path          = "/"
      + unique_id     = (known after apply)
    }

Plan: 2 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_iam_user.test_client: Creating...
aws_iam_user.test_client: Creation complete after 1s [id=alice]
aws_iam_access_key.test_client: Creating...
aws_iam_access_key.test_client: Creation complete after 1s [id=AKIA6MFJDG3VVFMKP4EF]

Even cooler, when Terraform sees this reference, it will flag it as a dependency. When we run apply, resources will be created in the required order, and in parallel where possible. Here Terraform figured out it was just a literal which is why "alice" was included in the plan for the access key.

So we could have used the literal "alice" ourselves, but then we would be repeating ourselves---which is never good. But professional standards aside, you don't always know the value of the property before it is created. Remember all the attributes in the output labeled (known after apply)? Imagine, for instance, we needed the ARN rather than the name.
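
As a hypothetical illustration, an output block can reference attributes that only exist once the resource has been created. The output names here are made up for this sketch and are not part of the example project:

```hcl
# Hypothetical outputs: each value is only known after apply,
# so a literal could never be written here in advance.
output "test_client_user_arn" {
  value = aws_iam_user.test_client.arn
}

output "test_client_bucket_arn" {
  value = aws_s3_bucket.test_client_bucket.arn
}
```

Terraform prints output values at the end of a successful apply.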

What are the ARNs for the resources we've made anyway? Well, after apply, you can see all current values with terraform show.

dataunbound$ terraform show

# aws_iam_access_key.test_client:
resource "aws_iam_access_key" "test_client" {
    id                   = "AKIA6MFJDG3VVFMKP4EF"
    secret               = (sensitive value)
    ses_smtp_password    = (sensitive value)
    ses_smtp_password_v4 = (sensitive value)
    status               = "Active"
    user                 = "alice"
}

# aws_iam_user.test_client:
resource "aws_iam_user" "test_client" {
    arn           = "arn:aws:iam::988197107435:user/alice"
    force_destroy = false
    id            = "alice"
    name          = "alice"
    path          = "/"
    unique_id     = "AIDA6MFJDG3VRPQBSAB4R"
}

# aws_s3_bucket.test_client_bucket:
resource "aws_s3_bucket" "test_client_bucket" {
    acl                         = "private"
    arn                         = "arn:aws:s3:::test-client-bucket-x130099"
    bucket                      = "test-client-bucket-x130099"
    bucket_domain_name          = "test-client-bucket-x130099.s3.amazonaws.com"
    bucket_regional_domain_name = "test-client-bucket-x130099.s3.eu-west-2.amazonaws.com"
    force_destroy               = false
    hosted_zone_id              = "Z3GKZC51ZF0DB4"
    id                          = "test-client-bucket-x130099"
    region                      = "eu-west-2"
    request_payer               = "BucketOwner"
    tags                        = {}

    server_side_encryption_configuration {
        rule {
            apply_server_side_encryption_by_default {
                sse_algorithm = "AES256"
            }
        }
    }

    versioning {
        enabled    = false
        mfa_delete = false
    }
}

Neat! It's worth noting here that Terraform does not print sensitive values to the screen; the (sensitive value) sections are not edits on my part.

But where did the access key go? And when you find it, how are you going to get it to the user? We'll cover this in more detail in a later segment, but suffice it to say, look for a file in your working directory called terraform.tfstate. You can find it there.

Now let's give Alice some access.

The Policy

Policies in AWS are defined as JSON. Most tutorials inline the JSON to define such policies. However, this leads to hard-coding identifiers. These can change when a resource is recreated and cause problems. I don't know how the other authors sleep at night, but I'll break from the norm and provide a civilized example. I'll use a separate [aws_iam_policy_document](https://www.terraform.io/docs/providers/aws/d/iam_policy_document.html).
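
A sketch of what such a policy document might look like. The data source label test_client_bucket_access and the exact list of S3 actions are assumptions made for this sketch; adjust the actions to what your client actually needs.

```hcl
# Hypothetical policy document; label and action list are assumptions.
data "aws_iam_policy_document" "test_client_bucket_access" {
  # Allow listing the bucket itself.
  statement {
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.test_client_bucket.arn]
  }

  # Allow reading, writing and deleting objects inside the bucket.
  statement {
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["${aws_s3_bucket.test_client_bucket.arn}/*"]
  }
}
```

Because the ARN is referenced rather than typed in, the policy follows the bucket if it is ever recreated under a different name.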

This policy allows access to the contents of aws_s3_bucket.test_client_bucket.arn. This policy document is not a resource like our other blocks. Instead, it is a data source and will provide reusability and some protection against fat-finger mistakes.

With it, we can create a user policy like so.
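
A sketch of the user policy, assuming a policy document data source labelled test_client_bucket_access is declared elsewhere in the configuration. That label and the policy name are hypothetical:

```hcl
# Attach the rendered policy document to the user as an inline policy.
# The policy name and data source label are assumptions for this sketch.
resource "aws_iam_user_policy" "test_client" {
  name   = "test-client-bucket-access"
  user   = aws_iam_user.test_client.name
  policy = data.aws_iam_policy_document.test_client_bucket_access.json
}
```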

The .json attribute renders our data source as literal JSON. Now let's take it for a spin.

Testing

After apply, you can test your new user. I did this by adding the access key information from terraform.tfstate to my ~/.aws/credentials file in a new section [alice]. You can then call aws --profile alice followed by any of your usual commands.

We may look at automated testing in a future post, but not here.

Next Steps with Terraform

Stay tuned by subscribing and you'll get notified immediately when Part II is available. In Part II we will refactor aws-hello-world.tf into separate files and modules and add optional AWS Console access.
