CuralabelTechnologies
Cloud · March 18, 2026 · 7 min read

How We Cut Infrastructure Costs 40% with Kubernetes Auto-scaling

A real teardown of how a SaaS client cut their cloud bill by 40% without touching application code.

Sara Berg
Staff DevOps Engineer

Cloud bills creep. A team adds a service here, a buffer there, and twelve months later the finance team is asking pointed questions. Our client — a B2B SaaS with about 15 services on EKS — was paying for roughly 3x the capacity they actually used at any given moment.

We started where the money was: right-sizing. Most pods had requests set to comfortable round numbers from a year ago. We deployed Vertical Pod Autoscaler in recommendation mode and used a week of real data to set tighter requests on every workload. That alone reclaimed about 22% of the cluster.
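To make the right-sizing step concrete, here is a minimal sketch of turning a week of observed CPU usage into a tighter request. It is not VPA's own algorithm: the function name `recommend_request`, the 95th-percentile choice, and the 15% headroom are all illustrative assumptions rather than values from the engagement.

```python
def recommend_request(samples_millicores, percentile=95, headroom_pct=15):
    """Suggest a CPU request (millicores) from observed usage samples.

    Hypothetical helper: percentile and headroom are assumed defaults,
    not the recommender settings VPA uses internally.
    """
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile using integer arithmetic (ceiling division).
    rank = max(1, -(-percentile * len(ordered) // 100))
    observed = ordered[rank - 1]
    # Add headroom, then round up to the next multiple of 10m.
    padded_times_100 = observed * (100 + headroom_pct)
    return -(-padded_times_100 // 1000) * 10
```

Comparing the result against the currently configured request for each workload gives a quick list of where the round-number requests are furthest from real usage.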

Next, we replaced static node groups with Karpenter, which provisions instance types sized to the pending pods and consolidates underused nodes aggressively. Running stateless workloads on spot instances, protected by sensible PodDisruptionBudgets (PDBs), took another 15% off the bill.
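A PDB is what keeps spot interruptions and node consolidation from draining too many replicas at once. The sketch below builds a `policy/v1` PodDisruptionBudget manifest as a plain dictionary; the helper name `build_pdb`, the example names and labels, and the `maxUnavailable: 1` default are assumptions for illustration, not the client's actual settings.

```python
import json


def build_pdb(name, namespace, match_labels, max_unavailable=1):
    """Return a PodDisruptionBudget manifest as a dict (hypothetical helper)."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Cap how many matching pods a voluntary disruption may evict at once.
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


if __name__ == "__main__":
    # Example values are made up; kubectl accepts JSON as well as YAML.
    print(json.dumps(build_pdb("api-pdb", "prod", {"app": "api"}), indent=2))
```

Piping the printed JSON to `kubectl apply -f -` would create the budget. A workload with a single replica needs more thought, since `maxUnavailable: 1` would permit evicting its only pod.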

The final piece was scheduled scaling for environments that don't need to exist at 3am. Staging and review environments scale to zero overnight and on weekends. None of this required changes to application code — just disciplined platform engineering.
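The scheduling rule itself is simple enough to sketch. The function below decides how many replicas a non-production environment should have at a given time; the 07:00 to 20:00 weekday window, the replica count, and the name `desired_replicas` are assumptions for illustration, and the post does not say which tool applied the schedule.

```python
from datetime import datetime


def desired_replicas(now, daytime_replicas=1, start_hour=7, end_hour=20):
    """Return the replica count for a staging-style environment at `now`.

    Hypothetical helper: the working-hours window is an assumed default.
    """
    # weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    if now.weekday() >= 5:
        return 0
    # Scale up only inside the weekday working window.
    if start_hour <= now.hour < end_hour:
        return daytime_replicas
    return 0
```

A cron-driven job could call this and patch each Deployment's replica count accordingly. With the workloads at zero, the node autoscaler has nothing left to schedule and can remove the idle nodes.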
