Hosting a Static Blog on AWS with Terraform: S3, CloudFront, and the Gotchas In Between
This blog runs on Jekyll, but Jekyll is just the build tool. The actual hosting stack is AWS: an S3 bucket behind a CloudFront distribution, with ACM for TLS and Route53 for DNS. The whole thing is managed with Terraform. This post walks through the architecture, the Terraform config, and the non-obvious problems I ran into along the way.
The Architecture
The stack is five components wired together:
Browser
  └─ Route53 (A/AAAA alias → CloudFront)
       └─ CloudFront (HTTPS, custom domain, TLS via ACM)
            ├─ CloudFront Function (URI rewrite, viewer-request)
            └─ S3 Origin (private bucket, OAC)
- S3 stores the static files built by jekyll build. The bucket is fully private – no public access, no static website hosting enabled.
- CloudFront sits in front of S3, handles HTTPS termination, caches responses at edge, and enforces redirect-to-https.
- Origin Access Control (OAC) allows CloudFront to fetch from S3 without making the bucket public. It signs requests with SigV4.
- ACM provides the TLS certificate. It must be provisioned in us-east-1 regardless of where the rest of your infrastructure lives – CloudFront is a global service and only reads certificates from that region.
- Route53 resolves the domain to CloudFront via alias records.
Provider Setup
Because ACM must be in us-east-1, I declare two providers:
provider "aws" {}

provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}
The default provider inherits the region from the environment or AWS config (ap-southeast-2 in my case). The aliased provider is used only for the ACM certificate and its validation.
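The snippets in this post reference var.domain_name and rely on the AWS provider without showing where either is declared. A minimal sketch of that scaffolding follows; the version constraints and the variable description are assumptions, not values taken from the original config.

```hcl
terraform {
  # Assumed version constraints; pin to whatever you have actually tested against.
  required_version = ">= 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Fully qualified domain the blog is served from, e.g. "blog.example.com".
# The same value is used as the S3 bucket name in the snippets that follow.
variable "domain_name" {
  type        = string
  description = "Domain name for the blog; also used as the S3 bucket name"
}
```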
S3 Bucket
The bucket is private with all public access blocked:
resource "aws_s3_bucket" "blog" {
  bucket = var.domain_name
}

resource "aws_s3_bucket_public_access_block" "blog" {
  bucket = aws_s3_bucket.blog.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
The bucket policy grants read access to CloudFront only, scoped to the specific distribution ARN via a condition:
resource "aws_s3_bucket_policy" "blog" {
  bucket = aws_s3_bucket.blog.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowCloudFrontOAC"
        Effect    = "Allow"
        Principal = { Service = "cloudfront.amazonaws.com" }
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.blog.arn}/*"
        Condition = {
          StringEquals = {
            "AWS:SourceArn" = aws_cloudfront_distribution.blog.arn
          }
        }
      }
    ]
  })
}
The AWS:SourceArn condition is important. Without it, any CloudFront distribution in any account could use your bucket as an origin if they guessed the name.
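The same policy can also be written with the aws_iam_policy_document data source instead of jsonencode, which some people find easier to review in diffs. This is an alternative sketch, not the configuration this blog uses; the data source name blog_oac is hypothetical.

```hcl
# Alternative to the jsonencode() policy: same statement, built from HCL blocks.
data "aws_iam_policy_document" "blog_oac" {
  statement {
    sid       = "AllowCloudFrontOAC"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.blog.arn}/*"]

    # Only the CloudFront service principal may read objects...
    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    # ...and only when the request comes from this specific distribution.
    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [aws_cloudfront_distribution.blog.arn]
    }
  }
}

# In aws_s3_bucket_policy.blog, the policy argument would then become:
#   policy = data.aws_iam_policy_document.blog_oac.json
```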
Origin Access Control
OAC replaced the older Origin Access Identity (OAI) mechanism. It uses SigV4 signing rather than a synthetic IAM principal:
resource "aws_cloudfront_origin_access_control" "blog" {
  name                              = var.domain_name
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}
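signing_behavior = "always" means CloudFront signs every request it sends to S3, replacing any Authorization header supplied by the viewer. The other accepted values are never and no-override (sign only when the viewer request carries no Authorization header of its own). For a private bucket serving a static site, always is the right choice – every request needs authorisation.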
ACM Certificate
The certificate resource lives in the us_east_1 provider:
resource "aws_acm_certificate" "blog" {
  provider          = aws.us_east_1
  domain_name       = var.domain_name
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}
create_before_destroy prevents Terraform from deleting the old certificate before the replacement is ready. Without it, Terraform’s default destroy-then-create order would try to remove a certificate the distribution still depends on.
DNS validation creates CNAME records in Route53. Terraform handles this with a for_each over the certificate’s domain_validation_options:
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.blog.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  zone_id = data.aws_route53_zone.main.zone_id
  name    = each.value.name
  type    = each.value.type
  ttl     = 300
  records = [each.value.record]
}

resource "aws_acm_certificate_validation" "blog" {
  provider                = aws.us_east_1
  certificate_arn         = aws_acm_certificate.blog.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}
The aws_acm_certificate_validation resource doesn’t create anything in AWS – it’s a Terraform mechanism that blocks until ACM confirms the certificate is issued. The CloudFront distribution references aws_acm_certificate_validation.blog.certificate_arn rather than aws_acm_certificate.blog.arn to ensure it only gets the ARN once validation is complete.
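The validation records reference data.aws_route53_zone.main, which is not shown above. A plausible definition is sketched here, assuming the hosted zone already exists in the same account. var.zone_name is a hypothetical variable holding the zone's apex name, which may differ from var.domain_name when the blog lives on a subdomain.

```hcl
# Hypothetical variable: name of the existing public hosted zone, e.g. "example.com".
variable "zone_name" {
  type        = string
  description = "Name of the existing Route53 public hosted zone"
}

# Look up the existing zone rather than creating or managing it here.
data "aws_route53_zone" "main" {
  name         = var.zone_name
  private_zone = false
}
```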
CloudFront Distribution
The distribution wires everything together:
resource "aws_cloudfront_distribution" "blog" {
  enabled             = true
  is_ipv6_enabled     = true # serve over IPv6 too, to match the AAAA alias record
  default_root_object = "index.html"
  aliases             = [var.domain_name]

  origin {
    domain_name              = aws_s3_bucket.blog.bucket_regional_domain_name
    origin_id                = "s3-${var.domain_name}"
    origin_access_control_id = aws_cloudfront_origin_access_control.blog.id
  }

  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "s3-${var.domain_name}"
    viewer_protocol_policy = "redirect-to-https"
    cache_policy_id        = data.aws_cloudfront_cache_policy.caching_optimized.id

    function_association {
      event_type   = "viewer-request"
      function_arn = aws_cloudfront_function.rewrite_uri.arn
    }
  }

  custom_error_response {
    error_code         = 403
    response_code      = 404
    response_page_path = "/404.html"
  }

  custom_error_response {
    error_code         = 404
    response_code      = 404
    response_page_path = "/404.html"
  }

  # The provider requires a restrictions block even when nothing is restricted.
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate_validation.blog.certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }
}
A few things worth noting here.
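bucket_regional_domain_name is used for the origin, not bucket_domain_name. The regional endpoint avoids the temporary redirects (HTTP 307) that the global endpoint can return for a bucket outside us-east-1, particularly a recently created one. CloudFront passes such a redirect straight to the viewer rather than following it, so the site appears broken.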
The cache policy references AWS’s managed Managed-CachingOptimized policy, which is appropriate for static sites. It caches aggressively: query strings, cookies, and viewer headers are kept out of the cache key, and the default TTL is 24 hours.
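The distribution refers to data.aws_cloudfront_cache_policy.caching_optimized without showing it. Assuming it is a plain lookup by name, it would be something like:

```hcl
# Look up the AWS-managed cache policy by name instead of hardcoding its ID.
data "aws_cloudfront_cache_policy" "caching_optimized" {
  name = "Managed-CachingOptimized"
}
```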
sni-only for ssl_support_method is the correct choice unless you need to support very old clients that don’t send SNI (effectively nothing built after 2010). The alternative (vip) costs around $600/month.
Two Gotchas I Hit
Subdirectory requests return 403
default_root_object = "index.html" only applies to the root path (/). A request to /about/ does not get rewritten to /about/index.html automatically – CloudFront forwards it to S3 literally, S3 has no object at that key, and it returns a 403.
The fix is a CloudFront Function on the viewer-request event that rewrites the URI before it reaches the origin:
resource "aws_cloudfront_function" "rewrite_uri" {
  name    = "rewrite-uri-index-html"
  runtime = "cloudfront-js-2.0"
  publish = true
  code    = <<-EOF
    function handler(event) {
      var request = event.request;
      var uri = request.uri;

      if (uri.endsWith('/')) {
        // Directory-style request: /about/ -> /about/index.html
        request.uri += 'index.html';
      } else if (!uri.includes('.')) {
        // No dot anywhere in the path, so treat it as a directory:
        // /about -> /about/index.html
        request.uri += '/index.html';
      }

      return request;
    }
  EOF
}
This runs at the edge before the cache lookup, so rewritten URIs are cached correctly too.
S3 returns 403, not 404, for missing objects
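When accessed via OAC, a private S3 bucket returns 403 Forbidden for keys that don’t exist – not 404 Not Found. S3 only returns 404 for a missing key when the caller has s3:ListBucket permission on the bucket. The bucket policy here grants s3:GetObject alone, so S3 refuses to confirm or deny whether the key exists.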
This means a custom 404 page configured only for HTTP 404 errors won’t fire. You need both:
custom_error_response {
  error_code         = 403
  response_code      = 404
  response_page_path = "/404.html"
}

custom_error_response {
  error_code         = 404
  response_code      = 404
  response_page_path = "/404.html"
}
The first block catches the 403 from S3 and converts it to a 404 response for the browser. Without it, visitors hitting a broken link see a generic 403, which is confusing and leaks implementation detail.
Route53 DNS
Two alias records point the domain at CloudFront – one for IPv4 (A) and one for IPv6 (AAAA). The A record is shown here; the AAAA record differs only in its type:
resource "aws_route53_record" "blog_a" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = var.domain_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.blog.domain_name
    zone_id                = aws_cloudfront_distribution.blog.hosted_zone_id
    evaluate_target_health = false
  }
}
Alias records resolve at Route53 without a CNAME hop, which means they work at the zone apex (e.g. example.com rather than just www.example.com). Regular CNAME records cannot be set at the zone apex – this is a DNS spec constraint, not an AWS limitation.
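For completeness, here is a sketch of the IPv6 counterpart, assuming it mirrors the A record exactly. The resource name blog_aaaa is a guess at the naming convention. An AAAA alias is only useful when IPv6 is enabled on the distribution (the is_ipv6_enabled argument).

```hcl
# IPv6 alias record; identical to the A record apart from the type.
resource "aws_route53_record" "blog_aaaa" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = var.domain_name
  type    = "AAAA"

  alias {
    name                   = aws_cloudfront_distribution.blog.domain_name
    zone_id                = aws_cloudfront_distribution.blog.hosted_zone_id
    evaluate_target_health = false
  }
}
```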
Deployment
Once the Terraform is applied, deployment is three commands:
(cd blog.jeakyl.com && bundle exec jekyll build)
aws s3 sync blog.jeakyl.com/_site/ s3://blog.jeakyl.com --delete
aws cloudfront create-invalidation --distribution-id E1Z05CSWVMASDV --paths "/*"
The --delete flag on s3 sync removes objects from the bucket that no longer exist in the build output. Without it, deleted posts or renamed files persist in S3 indefinitely and remain accessible.
The CloudFront invalidation flushes the cache at all edge locations. Without it, CloudFront serves stale content until the TTL expires – which, with Managed-CachingOptimized and no Cache-Control headers on the objects, defaults to 24 hours.
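The invalidation command above hardcodes the distribution ID. One way to avoid that, sketched here with a hypothetical output name, is to expose the ID from Terraform and read it at deploy time:

```hcl
# Expose the distribution ID so deploy scripts do not need to hardcode it.
output "cloudfront_distribution_id" {
  description = "ID of the CloudFront distribution serving the blog"
  value       = aws_cloudfront_distribution.blog.id
}
```

With this in place, terraform output -raw cloudfront_distribution_id prints the ID, which can be passed to --distribution-id in the invalidation command.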
Summary
The stack is deliberately simple: S3 for storage, CloudFront for edge delivery and HTTPS, ACM for certificates, Route53 for DNS. There are no servers to patch and nothing to scale. The two non-obvious parts are the URI rewrite function (needed because default_root_object only covers /) and the 403-to-404 custom error mapping (needed because S3 returns 403 for missing keys when accessed privately). Both are easy to miss and neither is well documented in the Terraform provider docs.