Managing Terraform State for AWS workloads with v1.10.0-beta1


Overview

When working with Terraform in a team or production environment, it's crucial to store the state file remotely and implement state locking. The current documentation and most examples on the internet recommend something similar to the following.

Current implementation

AWS S3 provides reliable storage for the state file, while DynamoDB enables state locking to prevent concurrent modifications.

Backend Configuration

To configure Terraform to use S3 and DynamoDB as a backend, add the following configuration to your Terraform files:

```hcl
terraform {
  backend "s3" {
    bucket         = "tfstatebucket"
    key            = "locks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
```

What is changing?

As of Aug 20, 2024, Amazon S3 supports conditional writes that can check for the existence of an object before creating it. This capability helps prevent applications from overwriting existing objects when uploading data.

Let's break this down a bit. Imagine you and a colleague are both trying to save a document with the same name to a shared directory or S3 bucket:

Without conditional writes:

  • Both of you can save the file
  • Whoever saves last overwrites the other person’s work
  • There’s no built-in way to prevent this conflict

With conditional writes:

  • Before saving, S3 checks if a file with that name already exists
  • If it exists, S3 will reject the save operation
  • If it doesn’t exist, the save operation proceeds
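This create-if-absent behavior is exactly what a lock file needs. Below is a minimal sketch of the semantics, using a plain-Python stand-in for an S3 bucket rather than the real AWS API; the class and method names are hypothetical:

```python
class PreconditionFailed(Exception):
    """Stands in for S3's HTTP 412 response to a failed conditional write."""

class Store:
    """A dict-backed stand-in for an S3 bucket."""
    def __init__(self):
        self.objects = {}

    def put_if_absent(self, key, body):
        # Mimics PutObject with "If-None-Match: *": succeed only
        # when no object with this key exists yet.
        if key in self.objects:
            raise PreconditionFailed(f"{key} already exists")
        self.objects[key] = body

bucket = Store()
bucket.put_if_absent("locks/terraform.tfstate.tflock", b"lock info from run A")
try:
    bucket.put_if_absent("locks/terraform.tfstate.tflock", b"lock info from run B")
except PreconditionFailed:
    print("second writer rejected")  # run B must wait or fail
```

Whoever creates the lock object first wins; every later writer gets the precondition failure and backs off.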

Now, why is this important for Terraform state? The state file, as you know, is the single most crucial element that helps Terraform understand the current infrastructure state, so that subsequent operations know what to do when the configuration changes. With the backend configuration shown above, we use DynamoDB to lock that file whenever multiple entities (human or machine) try to access or update it. With the conditional writes feature, we can leverage S3's native capabilities and eliminate the need for an additional component like DynamoDB in the backend infrastructure.

More details on the change are in the AWS announcement of conditional writes.

Updated Backend configuration

Prerequisites

  • You need to be on the beta version of Terraform v1.10 to leverage the changes in the S3 backend configuration. Since this is still a beta, I wouldn't recommend making these changes on your production infrastructure.

Release changelog: v1.10.0-beta1

Configuration

Based on the documentation, you can remove the dynamodb_table attribute and add the experimental attribute use_lockfile and set it to true.

```hcl
terraform {
  backend "s3" {
    bucket       = "tfstatebucket"
    key          = "locks/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true
  }
}
```

Existing infrastructure

Let's take the case of existing infrastructure you have provisioned with the S3/DynamoDB combination for state management, assuming the configuration below.

```hcl
terraform {
  backend "s3" {
    bucket         = "tfstatebucket"
    key            = "locks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
```

Let's walk through the migration step by step.

Update your terraform backend configuration to add the input use_lockfile = true.

When used with DynamoDB-based locking, locks will be acquired from both sources. In a future minor release of Terraform the DynamoDB locking mechanism and associated arguments will be deprecated.

```hcl
terraform {
  backend "s3" {
    bucket         = "tfstatebucket"
    key            = "locks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    use_lockfile   = true
    dynamodb_table = "terraform-state-lock"
  }
}
```
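Conceptually, dual locking means an operation must win both locks before it may touch the state, and a failure on either side must leave nothing half-held. The sketch below is hypothetical Python illustrating that invariant, not Terraform's actual Go implementation, and the acquisition order is illustrative:

```python
class LockError(Exception):
    pass

def acquire_both(primary, secondary):
    """Acquire the S3 lock first, then the DynamoDB lock.

    Each lock is a pair of callables: acquire() returns True on success,
    False when the lock is already held; release() gives it back. If the
    second acquisition fails, the first is released so no half-held lock
    is left behind.
    """
    if not primary["acquire"]():
        raise LockError("state is locked in S3")
    if not secondary["acquire"]():
        primary["release"]()
        raise LockError("state is locked in DynamoDB")

def make_lock(free=True):
    state = {"free": free}
    def acquire():
        if not state["free"]:
            return False
        state["free"] = False
        return True
    def release():
        state["free"] = True
    return {"acquire": acquire, "release": release, "state": state}

s3 = make_lock()
ddb = make_lock(free=False)  # simulate someone holding the DynamoDB lock
try:
    acquire_both(s3, ddb)
except LockError:
    pass
print("S3 lock free again:", s3["state"]["free"])  # → True: rolled back
```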

On your next plan or apply, you will be prompted to reconfigure the backend as Terraform notices a change in the configuration.

│ Error: Backend initialization required: please run "terraform init"
│
│ Reason: Backend configuration block has changed
│
│ The "backend" is the interface that Terraform uses to store state,
│ perform operations, etc. If this message is showing up, it means that the
│ Terraform configuration you're using is using a custom configuration for
│ the Terraform backend.
│
│ Changes to backend configurations require reinitialization. This allows
│ Terraform to set up the new configuration, copy existing state, etc. Please run
│ "terraform init" with either the "-reconfigure" or "-migrate-state" flags to
│ use the current configuration.
│
│ If the change reason above is incorrect, please verify your configuration
│ hasn't changed and try again. At this point, no changes to your existing
│ configuration or state have been made.
  • Run the command terraform init -reconfigure:

Initializing the backend...

Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.

  • Run terraform apply. If you check your S3 bucket and DynamoDB table, you will see the additional lock metadata listed below.

    • A lock entry in DynamoDB that looks like the following:

```json
{
  "ID": "d8590779-bcf6-728e-9897-4ba20e60b5e1",
  "Operation": "OperationTypeApply",
  "Info": "",
  "Who": "manu@xyz.com",
  "Version": "1.10.0",
  "Created": "2024-11-09T19:03:41.666663Z",
  "Path": "tfstatebucket/locks/terraform.tfstate"
}
```

    • A terraform.tfstate.tflock file adjacent to your terraform.tfstate file in the S3 bucket.

If you have multiple entities trying to acquire the lock, it continues to behave as you expect it to:

│ Error: Error acquiring the state lock
│
│ Error message: operation error S3: PutObject, https response error StatusCode: 412,
│ RequestID: SEJ4FAEDKAZ36XCP, HostID:
│ w4VEpne8lEszQmPW98EI13suUuK7iJqMKN1mE3mQKvO0i94SyzI8AjThWNN1r0HkOkfNO7Y6S2Y=, api
│ error PreconditionFailed: At least one of the pre-conditions you specified did not
│ hold
│ Lock Info:
│ ID: 6568a719-6919-ffb7-f5ec-a2791ad9805a
│ Path: tfstatebucket/locks/terraform.tfstate
│ Operation: OperationTypeApply
│ Who: manu@xyz.com
│ Version: 1.10.0
│ Created: 2024-11-09 19:07:46.1006 +0000 UTC
│ Info:
│
│
│ Terraform acquires a state lock to protect the state from being written
│ by multiple users at the same time. Please resolve the issue above and try
│ again. For most commands, you can disable locking with the "-lock=false"
│ flag, but this is not recommended.

Removing the DynamoDB input

With the use_lockfile flag and the DynamoDB table both specified, DynamoDB and S3 each hold their own references to the digest and lock for the time being. Per the changelog, this dual behavior is supported in Terraform v1.10.0.

At this point you are free to remove the DynamoDB table reference from the backend configuration. From the changelog:

In a future minor release of Terraform the DynamoDB locking mechanism and associated arguments will be deprecated

The DynamoDB requirement is expected to be removed entirely in a later version of the Terraform binary. At this point the setup should behave just like the new backend configuration below. Remember to perform a terraform init -reconfigure after removing the DynamoDB table input.

```hcl
terraform {
  backend "s3" {
    bucket       = "tfstatebucket"
    key          = "locks/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true
  }
}
```
  • Run terraform apply. You should see that the S3 bucket continues to hold the tflock file while the apply is being processed.

If you make a small change to your configuration and perform a terraform apply, you can verify that the digest stored in DynamoDB from the earlier run is no longer updated. If you then add the DynamoDB table configuration back, you can expect initialization to fail, as DynamoDB no longer holds the latest information about the state after it was removed from the configuration.

Initializing the backend...
Successfully configured the backend "s3"! Terraform will automatically use this
backend unless the backend configuration changes.

Error: Error refreshing state: state data in S3 does not have the expected
content.
The checksum calculated for the state stored in S3 does not match the checksum
stored in DynamoDB.
Bucket: tfstatebucket
Key: locks/terraform.tfstate
Calculated checksum: 221adc89fcf1c98ab0eedf577cb12adb
Stored checksum: 8c45801e7413d235bde77f6d6a18beec
This may be caused by unusually long delays in S3 processing a previous state
update. Please wait for a minute or two and try again. If this problem persists,
and neither S3 nor DynamoDB are experiencing an outage, you may need to
manually verify the remote state and update the Digest value stored in the
DynamoDB table to the following value: 221adc89fcf1c98ab0eedf577cb12adb
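If you ever need to verify the Digest value manually, you can recompute it from a state file you trust. A small sketch, assuming the digest is the MD5 of the raw state object, which the 32-hex-character checksums in the error output suggest; the function name and local file path are hypothetical:

```python
import hashlib

def state_digest(path):
    # MD5 of the raw state file bytes; this corresponds to the
    # 32-hex-character value the error calls the "calculated checksum".
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

# Usage (assuming a local copy of the state was downloaded first):
# print(state_digest("terraform.tfstate"))
```

Comparing this value against the Digest item in the DynamoDB table tells you which side is stale.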

New Infrastructure

I do not recommend doing this yet for production infrastructure. But if you are testing the beta version, all you need is a backend configuration like the one below; then run your Terraform workflow as usual. You have one less infrastructure component to manage for your state.

```hcl
terraform {
  backend "s3" {
    bucket       = "tfstatebucket"
    key          = "path/to/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true
  }
}
```

Conclusion

For teams using the Amazon S3/DynamoDB combination to manage their Terraform state, this is a big step toward reducing the backend to just one component (S3) going forward. Give it a try and let the HashiCorp team know if you run into any issues.
