AWS S3 Website Pretty URLs (Solved with CloudFront Functions)
Update—Pretty URLs with CloudFront Functions
I revisited this post because users[1] who shared my posts via the Apple Share Button would unknowingly share a link that, when clicked by the recipient, resulted in an HTTP 403 Access Denied error.
I discovered that the jekyll-seo-tag gem was to blame. It adds a canonical link to the <head> section using the pretty URL format regardless of Jekyll settings;[2] as you can see, there is no index.html appended[3] in the href attribute.
<link rel="canonical" href="https://www.jsrowe.com/aws-s3-website-pretty-urls/">
I returned to ChatGPT to brainstorm solutions, at first hoping to configure the jekyll-seo-tag gem and then to get help setting up a Lambda@Edge function as recommended by the AWS blog.[4]
I am not an infrastructure expert,[5] and navigating the thousands of pages of Amazon docs is frankly overwhelming. Enter the LLM: within this new chat session, ChatGPT suggested creating an AWS CloudFront Function. I had not found this solution before with traditional search, and now my site supports pretty URLs via CloudFront!
Steps to Support Pretty URLs with a CloudFront Function
- Create a CloudFront Function.
- Add the JavaScript to dynamically manage uri to the Function (JavaScript courtesy of ChatGPT):

  ```javascript
  function handler(event) {
      var request = event.request;
      var uri = request.uri;

      // If the request is for a directory without a slash, add it
      if (!uri.includes('.') && !uri.endsWith('/')) {
          uri += '/';
      }

      // Ensure all directory requests explicitly get index.html
      if (uri.endsWith('/')) {
          uri += 'index.html';
      }

      // Update the request URI
      request.uri = uri;
      return request;
  }
  ```
- Publish the function and associate it with your CloudFront distribution.
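Before publishing, it can help to exercise the rewrite rules locally. The following is a minimal sketch, assuming Node.js is available. The fakeEvent helper and the sample paths are hypothetical, and the fake event carries only the request.uri field that the handler reads, not the full CloudFront event.

```javascript
// Same rewrite logic as the handler in the steps above,
// repeated here so the sketch runs on its own.
function handler(event) {
    var request = event.request;
    var uri = request.uri;
    if (!uri.includes('.') && !uri.endsWith('/')) {
        uri += '/';
    }
    if (uri.endsWith('/')) {
        uri += 'index.html';
    }
    request.uri = uri;
    return request;
}

// Hypothetical helper: builds the smallest event the handler can process.
function fakeEvent(uri) {
    return { request: { uri: uri } };
}

// Sample paths (assumptions, not taken from a real access log).
var samples = ['/', '/about', '/about/', '/assets/site.css'];
samples.forEach(function (uri) {
    console.log(uri + ' -> ' + handler(fakeEvent(uri)).uri);
});
// / -> /index.html
// /about -> /about/index.html
// /about/ -> /about/index.html
// /assets/site.css -> /assets/site.css
```

Because the handler returns the modified request, it needs to be attached to the viewer request event of the distribution's cache behavior. Note also that any path containing a dot is treated as a file and passed through unchanged.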
Original Post—Jekyll Site on AWS CloudFront+S3
Note: When I launched this site on AWS CloudFront and S3, I abandoned pretty URLs by appending index.html to the permalink attribute in my Jekyll _config.yml file. See the update above for a real fix.
Jekyll pretty URLs work “out of the box” when using GitHub Pages for hosting, e.g., https://jsr6720.github.io/about/ properly loads the _site/about/index.html deployed resource.
But when I switched to using an AWS CloudFront distribution to serve a static website, Jekyll pretty URLs no longer worked, and I got an HTTP 403 Access Denied error[6] when trying to load any URL other than the root domain.
CloudFront Setting Default Root Object
I was so focused on treating CloudFront + S3 as a webserver that I totally misread the documentation on the Default root object setting. Unlike Apache, .htaccess, and single-page application routing, the AWS CloudFront Default root object applies only to requests for the root of the domain.
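To make the misreading concrete, here is a toy model of which S3 object key a request ends up asking for. It is a sketch under stated assumptions, not AWS code: it assumes an S3 origin with no rewrite function attached, and the originKey function name is hypothetical.

```javascript
// Toy model (not AWS code): which S3 key does a viewer path map to?
function originKey(uri, defaultRootObject) {
    // The Default root object is substituted only for the root URL.
    if (uri === '/') {
        return defaultRootObject;
    }
    // Every other path is forwarded as-is, minus the leading slash.
    return uri.slice(1);
}

console.log(originKey('/', 'index.html'));                 // index.html
console.log(originKey('/about/', 'index.html'));           // about/
console.log(originKey('/about/index.html', 'index.html')); // about/index.html
```

In this model a request for /about/ looks up the key about/, which does not exist in a bucket that only holds about/index.html.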
Because the Default root object setting applies only to the root URL (e.g., https://www.example.com/), not to any of the paths under it, it takes “workarounds” with this hosting strategy to get pretty URLs to work instead of leading to an HTTP 403 response. In my opinion, the published AWS workaround is pretty crummy; it suggests creating a Lambda@Edge function to intercept requests and redirect them to a specific resource in the S3 bucket.
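For comparison, the Lambda@Edge approach boils down to something like the sketch below. It assumes a Node.js function attached to the origin request event. The edgeHandler name and the sample event are hypothetical, and this is a paraphrase of the idea rather than the exact code from the AWS post. Strictly speaking it rewrites the URI in place rather than issuing an HTTP redirect.

```javascript
// Sketch of a Lambda@Edge origin-request handler.
// Lambda@Edge delivers the CloudFront request under Records[0].cf.request.
function edgeHandler(event, context, callback) {
    var request = event.Records[0].cf.request;
    // Rewrite a trailing slash to that directory's index document.
    request.uri = request.uri.replace(/\/$/, '/index.html');
    // Hand the (possibly modified) request back to CloudFront.
    callback(null, request);
}
// When deployed, this function would be exported as the Lambda handler.

// Hypothetical sample event carrying only the field used above.
var sampleEvent = { Records: [{ cf: { request: { uri: '/about/' } } }] };
edgeHandler(sampleEvent, {}, function (err, request) {
    console.log(request.uri); // /about/index.html
});
```

The logic is nearly identical to the CloudFront Function; the difference is the extra moving parts around it (a Lambda deployment, a published version, and an execution role).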
After I tried ChatGPT and was unhappy with its answers (below), I found another workaround in a Stack Overflow comment that suggests reusing the 403 resource page and adding JavaScript that redirects to the requested resource. Again, in my opinion, not a great solution.
My goal has always been to keep my Jekyll configuration and deployment backward-compatible with GitHub Pages so that my site continues to work at https://jsr6720.github.io. So I simply ditched pretty URLs and changed the permalink attribute in the Jekyll _config.yml to include index.html.
Now I don’t have pretty URLs, but isn’t beauty in the eye of the beholder? See the update above for the solution!
What Does ChatGPT-4o Say?
Even though I told ChatGPT that I was using CloudFront with S3, it immediately suggested changing S3 bucket permissions or redirecting HTTP 403 requests to index.html. Changing the permissions would not have the desired effect of actually loading the index.html resource when the /about/ path was requested.
It’s not that either of these solutions is “wrong,” but neither of them really solves the problem of pretty URLs. The HTTP 403 error had nothing to do with the default root object (my misinterpretation), i.e., /about/index.html; it was that CloudFront couldn’t access /about/ because my S3 permission settings restricted listing directory assets, as they should.
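The status codes involved can be sketched with another toy model. It follows S3's documented behavior that a request for a missing key returns 403 rather than 404 when the caller lacks the s3:ListBucket permission. The s3GetStatus function and the sample keys are hypothetical, and the model assumes the caller is allowed to read every object that does exist.

```javascript
// Toy model (not AWS code) of the status S3 returns for a GET.
function s3GetStatus(bucketKeys, key, callerCanList) {
    // The object exists and is readable.
    if (bucketKeys.indexOf(key) !== -1) {
        return 200;
    }
    // Missing object: S3 only admits it is missing to callers that may list.
    return callerCanList ? 404 : 403;
}

// Sample bucket contents (assumed for illustration).
var keys = ['index.html', 'about/index.html'];
console.log(s3GetStatus(keys, 'about/index.html', false)); // 200
console.log(s3GetStatus(keys, 'about/', false));           // 403
console.log(s3GetStatus(keys, 'about/', true));            // 404
```

Seen this way, loosening bucket permissions would at best turn the 403 into a 404; it would not make /about/ serve about/index.html.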
I speculate that ChatGPT-4o was recommending configuring the S3 bucket as a static website, which is in direct conflict with Amazon’s developer documentation for CloudFront-served websites.[7] The other solution, reusing the 403 page, is not one I would pursue as an engineer.
The moral is: The more I work with generative AI, the more it amazes me how quickly LLMs confidently assert wrong solutions without the slightest clue as to whether or not they “make sense.” Only with experience was I able to identify that hijacking the 403 response didn’t make sense, and neither does creating a CloudFront Lambda@Edge function to redirect requests. On that second point, I concede, I may just be eschewing complexity.
So as I explore and adopt generative AI into my work, as models get more advanced and provide even more confidently wrong answers, I must be sure to apply critical thought to the solutions provided. Or just tell it when it’s wrong.
Significant Revisions
- Feb 21st, 2025 Added in CloudFront Functions solution. Original title “Jekyll Pretty URLs and AWS S3 HTTP 403 Errors”
- Jul 20th, 2024 Originally published on https://www.jsrowe.com with uid 2978A937-9A88-4ABF-B792-85CEE8027C60
- Jul 13th, 2024 Initial draft
Footnotes
1. Thanks Aaron and Tony. You’re the real MVPs.
2. You can add a canonical URL to your Jekyll Front Matter, but I didn’t want to do that for every page. I also added this quick fix while I worked through this CloudFront function solution.
3. I also have tremendous nostalgia for when websites used to give clues to their creation by loading with a file extension. Except backslash.
4. What probably started as one official post on the AWS Compute Blog was referenced on StackOverflow, then many blog posts, and ultimately found its way into LLM training datasets.
5. I tried deploying the Lambda@Edge function once and it resulted in HTTP 503 for all of my site, so I instructed ChatGPT that I had abandoned this path and asked for a “real solution”, which led me down the CloudFront rewrite behaviors path.
6. I draw heavy inspiration from https://martinfowler.com and was surprised to find that even on his site, loading a URL with no file name results in a Forbidden error.
7. I think it’s worth acknowledging that well-written documentation quickly supersedes the statistical “right” response from generative AI. After all, where did ChatGPT get its “answer” from? That’s right, very likely the same site I was reading.