Finding Hidden Cloud Savings

What do duplicate IAM policies and over-provisioned throughput have in common? They're both costing you time and money.

Hey folks,

When you work across many clients, you see a lot of infrastructure configurations and learn to spot where they can be improved.

Here are two stories about how we recently helped clients out, highlighting problems you might be facing too.

Duplicate Resources Kill Developer Experience

We just completed our Infrastructure as Code audit for a new, high-profile client. They have been suffering from a handful of nasty Terraliths (oversized, monolithic Terraform root modules).

Those Terraliths are hurting their developer experience and slowing the organization's ability to ship infrastructure.

Want to know what we found?

They had thousands of IAM Policy + Role + Policy Attachment resources. These were taking up over 50% of their largest Terralith's state file.
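If you want to check this on your own state, here is a minimal sketch in Python. It assumes you have exported the state as JSON (for example with `terraform state pull > state.json`) and that it uses the version 4 layout with a top-level `resources` list. It measures share by instance count, which is my assumption here; share by bytes would need a different approach. The function names and the `IAM_TYPES` set are illustrative, not something from the audit itself.

```python
import json
from collections import Counter

# Assumption: these three AWS provider types cover "Policy + Role + Policy Attachment".
IAM_TYPES = {"aws_iam_policy", "aws_iam_role", "aws_iam_role_policy_attachment"}


def instance_counts_by_type(state: dict) -> Counter:
    """Count managed resource instances per resource type in a state document."""
    counts = Counter()
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # skip data sources, they are not created resources
        counts[resource["type"]] += len(resource.get("instances", []))
    return counts


def iam_share(state: dict) -> float:
    """Fraction of managed instances that are IAM policy/role/attachment resources."""
    counts = instance_counts_by_type(state)
    total = sum(counts.values())
    iam = sum(n for rtype, n in counts.items() if rtype in IAM_TYPES)
    return iam / total if total else 0.0


def load_state(path: str) -> dict:
    """Load a state file previously exported as JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `iam_share(load_state("state.json"))` gives a number between 0 and 1, and `instance_counts_by_type(...).most_common(10)` shows which types dominate.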

How does that happen?

Well, when you're building a strong least privilege architecture (which everyone should be), you have a lot of custom policies to limit what a role can do. This is good!

But if you're not careful, it's easy to define those resources inside heavily used child modules. Every use of the module then creates its own copy, so you end up with many, many duplicates even though the copies aren't actually different. Policies are a common case, but the same thing happens with other resources that could be shared.

The moral of the story: check out your child modules and audit them for shared resources that you might be creating hundreds or even thousands of times. Such shared resources can be worth refactoring and extracting out.
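One way to start that audit is to look for policies whose documents are identical. The sketch below is a rough starting point, not the tooling we used: it assumes a state exported as JSON in the version 4 layout, and it only looks at `aws_iam_policy` resources and their `policy` attribute. The helper names are hypothetical. It normalizes key order before hashing, but it does not reorder statements, so two policies with the same statements in a different order will not match.

```python
import hashlib
import json
from collections import defaultdict


def policy_fingerprint(policy_json: str) -> str:
    """Hash a policy document after normalizing key order and whitespace."""
    canonical = json.dumps(json.loads(policy_json), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicate_policies(state: dict) -> list:
    """Return groups of resource addresses whose policy documents are identical."""
    groups = defaultdict(list)
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed" or resource.get("type") != "aws_iam_policy":
            continue
        # Root-module resources have no "module" key in the state.
        prefix = resource.get("module", "")
        base = f"{prefix}.{resource['type']}.{resource['name']}".lstrip(".")
        for instance in resource.get("instances", []):
            document = instance.get("attributes", {}).get("policy")
            if not document:
                continue
            key = instance.get("index_key")  # set for count / for_each instances
            address = base if key is None else f"{base}[{json.dumps(key)}]"
            groups[policy_fingerprint(document)].append(address)
    # Keep only fingerprints shared by more than one address.
    return [sorted(addresses) for addresses in groups.values() if len(addresses) > 1]
```

Each group in the result is a candidate for a single shared policy that gets created once and passed into the modules that need it.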

When you create them once in a central location and pass them downstream to where they're used, you avoid the duplication, the slowdowns and the pain.

Saving $100k A Year

Of course, infrastructure issues can cost money too.

One of my team members, Yang, recently identified a simple optimization that will save one of our clients over $100,000 annually.

While building a full cloud-native transformation and IaC'ification project, Yang noticed something in the client's AWS setup. Their Elastic File System (EFS), an AWS-managed, scalable file storage service, was configured with Provisioned Throughput at 500 MiB/s.

This cost roughly $3,300 per month regardless of their usage.

The thing is, the client's systems were only using 1-3% of that provisioned capacity, so they were paying for throughput they never used.

The fix was straightforward.

They simply needed to switch from Provisioned Throughput to Elastic Throughput. This change meant they'd pay only for data actually transferred instead of a fixed rate, with zero performance impact.
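If you want to look for the same pattern in your own account, here is a small sketch. It works on plain dictionaries shaped like the `FileSystems` entries returned by the EFS `DescribeFileSystems` API (`FileSystemId`, `ThroughputMode`, `ProvisionedThroughputInMibps`). The peak throughput per file system is something you would have to supply yourself, for example from your monitoring; the `peak_mibps_by_id` argument, the function name and the 10% threshold are my assumptions, not figures from this engagement.

```python
def flag_overprovisioned(file_systems, peak_mibps_by_id, threshold=0.10):
    """Return (id, provisioned MiB/s, utilization) for underused provisioned file systems.

    file_systems: dicts shaped like DescribeFileSystems "FileSystems" entries.
    peak_mibps_by_id: observed peak throughput per file system id (caller-supplied).
    threshold: flag file systems whose peak utilization is below this fraction.
    """
    flagged = []
    for fs in file_systems:
        if fs.get("ThroughputMode") != "provisioned":
            continue  # bursting and elastic modes have no fixed throughput charge
        provisioned = fs.get("ProvisionedThroughputInMibps")
        peak = peak_mibps_by_id.get(fs["FileSystemId"])
        if not provisioned or peak is None:
            continue  # not enough data to judge
        utilization = peak / provisioned
        if utilization < threshold:
            flagged.append((fs["FileSystemId"], provisioned, round(utilization, 3)))
    return flagged
```

With boto3, the input list would come from `boto3.client("efs").describe_file_systems()["FileSystems"]`. Anything the function flags is worth a closer look before changing modes, since the right choice depends on the workload.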

The amazing result: when applied across their development, staging, and production environments, this single cost optimization netted around $108,000 in annual savings.
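For anyone who likes to see the arithmetic, here is a quick back-of-the-envelope check. It assumes all three environments carried the same roughly $3,300 per month Provisioned Throughput charge, which is my simplification for illustration. Under that assumption, the gap between the provisioned spend and the reported savings is what would still be paid under Elastic Throughput. Treat these as rough numbers, not an invoice.

```python
MONTHLY_PROVISIONED_USD = 3_300        # per environment, from the figure above
ENVIRONMENTS = 3                       # development, staging, production
REPORTED_ANNUAL_SAVINGS_USD = 108_000  # from the figure above
PROVISIONED_MIBPS = 500


def annual_provisioned_cost(monthly_usd: int, environments: int) -> int:
    """Yearly spend if every environment pays the same fixed monthly charge."""
    return monthly_usd * 12 * environments


def implied_remaining_cost(annual_provisioned_usd: int, savings_usd: int) -> int:
    """What would still be spent per year after the reported savings."""
    return annual_provisioned_usd - savings_usd


def used_throughput_range(provisioned_mibps: int, low_pct: int, high_pct: int) -> tuple:
    """Throughput actually used, given utilization as whole percentages."""
    return (provisioned_mibps * low_pct / 100, provisioned_mibps * high_pct / 100)


annual = annual_provisioned_cost(MONTHLY_PROVISIONED_USD, ENVIRONMENTS)
remaining = implied_remaining_cost(annual, REPORTED_ANNUAL_SAVINGS_USD)
low, high = used_throughput_range(PROVISIONED_MIBPS, 1, 3)

print(f"Provisioned spend per year: ${annual:,}")
print(f"Implied remaining spend per year: ${remaining:,}")
print(f"Throughput actually used: {low:g}-{high:g} MiB/s of {PROVISIONED_MIBPS} MiB/s")
```

In other words, 1-3% of 500 MiB/s is only 5-15 MiB/s of real usage, which is why a pay-per-use mode comes out so much cheaper here.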

Yang wasn't even tasked with cost optimization. He was building a new development environment when he spotted this issue.

These stories are what excite me about infrastructure consulting.

I love that we can help teams save time, money and toil through attention to detail and proactive problem solving.

May your state files stay lean and your cloud bills leaner,

Matt @ Masterpoint

PS: If you want to chat about resource duplication or IaC'ification, grab some time on my calendar here. Also, I was recently on the Inside Platform Engineering podcast, talking about open source and IaC at scale. Hope you enjoy!