Hosting with AWS S3 and CloudFront

There are already many posts about hosting static sites with S3 and CloudFront. However, I would like to add a small contribution to the discussion: how to do this by creating the resources in CloudFormation and, specifically, how to ensure that all content is served via CloudFront and is not available directly from S3.

Introduction

There are many, many posts on hosting static sites via S3 and CloudFront. There are even existing CloudFormation templates available that can be used to make this process repeatable and consistent.

This post simply adds to that knowledge. I will attempt to capture any notes or changes that have come up since I migrated this blog's content from a virtual machine I managed myself to S3 served via CloudFront.

Overview

Hosting static sites on AWS has evolved as the company and its offerings have changed. For example, a few years ago, the only solution was to use S3 directly, which meant the S3 bucket needed to be public and the traffic could not be encrypted. Then, with the introduction of CloudFront, it became possible to leverage Amazon's infrastructure to distribute content more readily without making huge investments in private content delivery networks (CDNs). Soon after, it became possible to encrypt the traffic as well. Finally, one of the more recent developments is the ability to have Lambda functions distributed globally with CloudFront, making it possible to customize or more appropriately serve the desired content. This blog uses these Lambda@Edge functions to help serve content via CloudFront with the files residing in S3.

The need for the Lambda@Edge functionality comes from the fact that S3, when accessed privately by CloudFront rather than through its public website endpoint, does not resolve a default document for each directory. For example, if we want to serve a post like https://example.com/blog/year/month/post-slug/, we are unable to, since there is no object in S3 with that "key". Therefore, we can use a Lambda function that sits between CloudFront and S3 and rewrites the URI, appending index.html or /index.html as needed, so that the folder's index.html object is retrieved correctly.

Another difference from other posts is that this configuration forces the content to be served from CloudFront. It will not be, or at least should not be, possible to access the content from the S3 bucket directly. This serves a few purposes. The bucket does not have to be public in any way, so there is far less risk of granting write access through accidental misconfiguration. Furthermore, all content can be served quickly and securely because of the configuration of CloudFront.

CloudFormation

We'll briefly go through the various resources needed for hosting a static site via S3 and CloudFront.

S3 Bucket

Obviously, we will need a bucket to house the content.

"BlogContentBucket": {
    "Type": "AWS::S3::Bucket",
    "Properties": {
        "AccessControl": "Private",
        "BucketName": {"Ref": "BlogBucketName"},
        "LifecycleConfiguration": {
            "Rules": [
                {
                    "NoncurrentVersionExpirationInDays": 90,
                    "Status": "Enabled"
                }
            ]
        },
        "VersioningConfiguration": {
            "Status": "Enabled"
        },
        "WebsiteConfiguration": {
            "IndexDocument": "index.html",
            "ErrorDocument": "404.html"
        }
    }
}

I have added a lifecycle policy to automatically remove older versions after 90 days. Feel free to remove or change this as desired.

We also want to make sure that the CloudFront distribution will be the only resource (other than ourselves) that can access objects from the bucket. Therefore, we need to set up a bucket policy and an Origin Access Identity.

"OriginAccessId": {
    "Type": "AWS::CloudFront::CloudFrontOriginAccessIdentity",
    "Properties": {
        "CloudFrontOriginAccessIdentityConfig": {
            "Comment": "S3 Bucket Access"
        }
    }
},
"BlogContentBucketPolicy": {
    "Type": "AWS::S3::BucketPolicy",
    "Properties": {
        "Bucket": {"Ref": "BlogContentBucket"},
        "PolicyDocument": {
            "Statement": [
                {
                    "Action": ["s3:GetObject"],
                    "Effect": "Allow",
                    "Resource": [
                        {"Fn::Join": ["/", [
                            {"Fn::GetAtt": [
                                "BlogContentBucket", "Arn"]},
                            "*"
                        ]]}
                    ],
                    "Principal": {
                        "CanonicalUser": {"Fn::GetAtt": [
                            "OriginAccessId",
                            "S3CanonicalUserId"]}
                    }
                }
            ]
        }
    }
}
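
Once the stack is deployed, one way to check that the bucket really is closed to the public is to request an object directly from S3 and confirm the request is rejected. The following is only a minimal sketch, not part of the template: the function name `direct_access_denied` and the injectable `opener` argument are my own, and it assumes the bucket is reachable at the same `<bucket>.s3.amazonaws.com` host name used for the CloudFront origin later in this post.

```python
import urllib.error
import urllib.request


def direct_access_denied(url, opener=urllib.request.urlopen):
    """Return True when an anonymous GET of `url` is rejected with HTTP 403."""
    try:
        with opener(url):
            # Any successful response means the object is publicly readable.
            return False
    except urllib.error.HTTPError as error:
        # A private bucket answers anonymous requests with 403 (AccessDenied).
        return error.code == 403
```

Calling `direct_access_denied("https://<bucket>.s3.amazonaws.com/index.html")` with the real bucket name substituted should return True when the policy above is the only grant on the bucket. The `opener` argument exists only so the check can be exercised without network access.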

ACM

AWS Certificate Manager offers free certificates and these can be used with CloudFront pretty trivially, so we will set up this resource as well.

"SSLCertificate": {
    "Type": "AWS::CertificateManager::Certificate",
    "Properties": {
        "DomainName": {"Ref": "DomainName"}
    }
}

Ideally, the validation would be done via DNS; however, this can be tricky when done via CloudFormation.
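
Part of what makes DNS validation awkward is that the record to create is only known after the certificate has been requested. As a rough sketch of that step, the helper below pulls the validation records out of a response shaped like the one returned by ACM's DescribeCertificate API. It assumes the certificate was requested with DNS validation (the template above does not set a validation method), the name `dns_validation_records` is my own, and the sample values are made up for illustration.

```python
def dns_validation_records(describe_certificate_response):
    """Collect (name, type, value) tuples for the DNS validation records."""
    certificate = describe_certificate_response["Certificate"]
    records = []
    for option in certificate.get("DomainValidationOptions", []):
        record = option.get("ResourceRecord")
        # ResourceRecord can be missing, for example with email validation
        # or shortly after the certificate has been requested.
        if record is not None:
            records.append((record["Name"], record["Type"], record["Value"]))
    return records


# Hypothetical response fragment; the names and values are placeholders.
sample_response = {
    "Certificate": {
        "DomainName": "example.com",
        "DomainValidationOptions": [
            {
                "DomainName": "example.com",
                "ValidationMethod": "DNS",
                "ResourceRecord": {
                    "Name": "_abc123.example.com.",
                    "Type": "CNAME",
                    "Value": "_def456.acm-validations.aws.",
                },
            }
        ],
    }
}

for name, record_type, value in dns_validation_records(sample_response):
    print(name, record_type, value)
```

Each tuple would then need to be created as a record in the hosted zone before the certificate can be issued, which is the timing problem when everything lives in a single stack.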

Route53

Since this blog is hosted under the "naked" domain, where a CNAME record is not allowed, it's best to use Route53 and point the domain's A record at the CloudFront distribution with an alias. Therefore, we will create the hosted zone and then an alias record set in the freshly created hosted zone.

"HostedZone": {
    "Type": "AWS::Route53::HostedZone",
    "Properties": {
        "Name": {"Ref": "DomainName"}
    }
},
"BlogAliasRecord": {
    "Type": "AWS::Route53::RecordSet",
    "Properties": {
        "AliasTarget": {
            "DNSName": {"Fn::GetAtt": ["CFDistribution", "DomainName"]},
            "HostedZoneId": {"Ref": "CloudFrontHostedZone"}
        },
        "HostedZoneId": {"Ref": "HostedZone"},
        "Name": {"Ref": "DomainName"},
        "Type": "A"
    }
}

If using a non-naked domain, such as www, this could instead be defined as a CNAME record pointing to the CloudFront distribution.

Lambda@Edge

Of all the resources, this will actually be the most complicated.

First, we need to create a role and policy for the function's permissions.

"URIRewriteLambdaRole": {
    "Type": "AWS::IAM::Role",
    "Properties": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": "sts:AssumeRole",
                    "Principal": {
                        "Service": [
                            "edgelambda.amazonaws.com",
                            "lambda.amazonaws.com"
                        ]
                    }
                }
            ]
        },
        "Policies": [
            {
                "PolicyName": "GrantCloudwatchLogAccess",
                "PolicyDocument": {
                    "Version": "2012-10-17",
                    "Statement": [
                        {
                            "Effect": "Allow",
                            "Action": [
                                "logs:CreateLogGroup",
                                "logs:CreateLogStream",
                                "logs:PutLogEvents"
                            ],
                            "Resource": [
                                "*"
                            ]
                        }
                    ]
                }
            }
        ]
    }
}

Restricting the permissions is possible, but requires that the role is first created with more open permissions since it is not possible to directly tell a Lambda function to use a specific LogGroup. See this commit for more information.

Next, we can create the Lambda function resource.

"URIRewriteLambdaFunction": {
    "Type": "AWS::Lambda::Function",
    "Properties": {
        "Description": "Lambda Function performing URI rewriting",
        "Code": {
            "ZipFile": {"Fn::Join": ["\n", [
                "def handler(event, _context):",
                "    whitelist = [",
                "        'asc',",
                "        'css',",
                "        'gif',",
                "        'html',",
                "        'ico',",
                "        'jpeg',",
                "        'jpg',",
                "        'js',",
                "        'json',",
                "        'map',",
                "        'md',",
                "        'ogg',",
                "        'pdf',",
                "        'png',",
                "        'pug',",
                "        'sass',",
                "        'scss',",
                "        'svg',",
                "        'txt',",
                "        'xml',",
                "    ]",
                "    request = event['Records'][0]['cf']['request']",
                "    extension = request['uri'].split('.')[-1]",
                "    if extension is None or extension not in whitelist:",
                "        if request['uri'][-1] == '/':",
                "            request['uri'] += 'index.html'",
                "        else:",
                "            request['uri'] += '/index.html'",
                "    return request"
            ]]}

        },
        "Handler": "index.handler",
        "MemorySize": 128,
        "Role": {"Fn::GetAtt": ["URIRewriteLambdaRole", "Arn"]},
        "Runtime": "python3.7",
        "Tags": [
            {"Key": "Domain", "Value": {"Ref": "DomainName"}}
        ]
    }
}
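
Before deploying, the rewrite logic can be exercised locally by handing the handler a hand-built event. The sketch below repeats the handler with a shortened extension list so that it is self-contained. The `make_event` helper is my own and builds only the small part of a CloudFront origin-request event that the handler actually reads; real events carry many more fields.

```python
# Shortened for the example; the deployed function uses the full list above.
ALLOWED_EXTENSIONS = {"css", "html", "js", "png", "xml"}


def handler(event, _context):
    request = event["Records"][0]["cf"]["request"]
    extension = request["uri"].split(".")[-1]
    if extension not in ALLOWED_EXTENSIONS:
        # Treat the URI as a folder and point it at the folder's index.html.
        if request["uri"].endswith("/"):
            request["uri"] += "index.html"
        else:
            request["uri"] += "/index.html"
    return request


def make_event(uri):
    """Build the minimal slice of an origin-request event used by handler."""
    return {"Records": [{"cf": {"request": {"uri": uri}}}]}


for uri in [
    "/",
    "/blog/year/month/post-slug/",
    "/blog/year/month/post-slug",
    "/css/site.css",
]:
    print(uri, "->", handler(make_event(uri), None)["uri"])
# / -> /index.html
# /blog/year/month/post-slug/ -> /blog/year/month/post-slug/index.html
# /blog/year/month/post-slug -> /blog/year/month/post-slug/index.html
# /css/site.css -> /css/site.css
```

Note that the extension check looks at everything after the last dot in the whole URI, so a folder name containing a dot followed by a listed extension would be passed through unchanged.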

Fairly recently, Python 3.7 became available for Lambda@Edge.

A benefit of using the Python Lambda runtime is that it still supports directly uploading code to the function via the "ZipFile" key.

Note that this function is simple enough that wrapping it directly into JSON isn't too bad. However, a better approach, currently under development, is a simple utility that can perform the encoding at build time. A future post, perhaps.
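
To give a rough idea of what such a build-time encoding step might look like, here is a minimal sketch. It is not the utility mentioned above: the function name `to_zipfile_join` is hypothetical, and it simply splits Python source text into lines and wraps them in the same Fn::Join form used for the "ZipFile" key in the template.

```python
import json


def to_zipfile_join(source):
    """Wrap Python source text in the Fn::Join form used for "ZipFile"."""
    return {"Fn::Join": ["\n", source.splitlines()]}


# A tiny stand-in for the real function source, which would normally be
# read from a file at build time.
source = (
    "def handler(event, _context):\n"
    "    return event['Records'][0]['cf']['request']\n"
)

print(json.dumps({"ZipFile": to_zipfile_join(source)}, indent=4))
```

Letting `json.dumps` produce the output means quoting and escaping of the source lines is handled by the library rather than by hand.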

Lastly, to associate the function with CloudFront, we need to publish a "version" of the function, since CloudFront requires the ARN of a specific, numbered version.

"URIRewriteLambdaVersion": {
    "Type": "AWS::Lambda::Version",
    "Properties": {
        "FunctionName": {"Fn::GetAtt": [
            "URIRewriteLambdaFunction", "Arn"]},
        "Description": "Lambda Function performing URI rewriting"
    }
}

CloudFront

Finally, we can put everything together into the CloudFront Distribution.

"CFDistribution": {
    "Type": "AWS::CloudFront::Distribution",
    "Properties": {
        "DistributionConfig": {
            "Aliases": [
                {"Ref": "DomainName"}
            ],
            "DefaultRootObject": "index.html",
            "Enabled": true,
            "IPV6Enabled": true,
            "HttpVersion": "http2",
            "DefaultCacheBehavior": {
                "TargetOriginId": {"Fn::Join": [".", [
                    "s3",
                    {"Ref": "BlogBucketName"}]]},
                "ViewerProtocolPolicy": "redirect-to-https",
                "MinTTL": 0,
                "DefaultTTL": 3600,
                "AllowedMethods": ["HEAD", "GET"],
                "CachedMethods": ["HEAD", "GET"],
                "ForwardedValues": {
                    "QueryString": true,
                    "Cookies": {
                        "Forward": "none"
                    }
                },
                "LambdaFunctionAssociations": [
                    {
                        "EventType": "origin-request",
                        "LambdaFunctionARN": {
                            "Ref": "URIRewriteLambdaVersion"
                        }
                    }
                ]
            },
            "Origins": [
                {
                    "S3OriginConfig": {
                        "OriginAccessIdentity": {"Fn::Join": ["/", [
                            "origin-access-identity/cloudfront",
                            {"Ref": "OriginAccessId"}
                        ]]}
                    },
                    "DomainName": {"Fn::Join": [".", [
                        {"Ref": "BlogBucketName"},
                        "s3.amazonaws.com"]]},
                    "Id": {"Fn::Join": [".", [
                        "s3",
                        {"Ref": "BlogBucketName"}]]}
                }
            ],
            "PriceClass": "PriceClass_100",
            "Restrictions": {
                "GeoRestriction": {
                    "RestrictionType": "none",
                    "Locations": []
                }
            },
            "ViewerCertificate": {
                "SslSupportMethod": "sni-only",
                "MinimumProtocolVersion": "TLSv1.2_2018",
                "AcmCertificateArn": {"Ref": "SSLCertificate"}
            }
        }
    }
}

Future Work

The CloudFormation template is not perfect. For example, I would personally like to be able to create certificates with DNS validation via CloudFormation. However, last I checked, this does not appear to be possible because of timing issues.

Another future feature could be to set up automatic builds and deployments of the content to the bucket using more AWS services.

Cost Considerations

AWS is not known to be inexpensive. Arguably, their entire business is built around the very fact that just about every service within AWS has a level of accounting unheard of elsewhere. That said, this blog has relatively low traffic. Therefore, the most expensive aspect of hosting it right now is the hosted zone charge. Lambda and CloudFront account for a measly 9% of the charges.

However, if the content is very exciting or gathers a larger following, this can and will go up. For example, for a different AWS account hosting a few hundred sites via CloudFront (not as described here), the cost is measured in hundreds of dollars.

Overall, this setup made the most sense cost-wise for this blog. It may not for others.

Summary

The goal of this post is to further describe how to host a static site in AWS using a few services that can make for really inexpensive hosting.

The code/template discussed for this blog is available online. I hope it can be useful to others and I encourage its usage or replication. Of course, if there are any issues with it, please let me know.