Use read replicas deliberately | RDS/Aurora cost optimization

Why it matters

Read replicas let you offload read traffic from a primary RDS or Aurora instance to one or more replicas. They are powerful for scaling read-heavy workloads and isolating reporting/query patterns, but each replica is another billed DB instance (and sometimes more storage and network cost), so add them where they clearly solve a problem rather than by default.¹

Good use cases

Read-heavy workloads
- When the primary instance is constrained by read traffic (for example, APIs doing many reads with relatively few writes), replicas can take most read queries so the writer focuses on writes and critical reads.
- In Aurora, you can use replica auto scaling to increase or decrease the number of reader instances based on metrics such as CPU or connections, so read capacity adjusts with demand instead of being fixed.
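
As an illustration of the Aurora auto scaling point above, here is a minimal sketch that builds the request parameters for a target-tracking policy on reader CPU. It assumes boto3 and the Application Auto Scaling API (`register_scalable_target`, `put_scaling_policy`). The cluster identifier, the policy name, the capacity bounds, the 60% CPU target, and the 300-second cooldowns are illustrative assumptions, not recommendations.

```python
from typing import Any


def aurora_reader_autoscaling_requests(
    cluster_id: str,
    min_readers: int = 1,
    max_readers: int = 4,
    target_cpu_percent: float = 60.0,
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Build kwargs for register_scalable_target and put_scaling_policy.

    All default values are hypothetical; tune them for your workload.
    """
    if not 0 <= min_readers <= max_readers:
        raise ValueError("need 0 <= min_readers <= max_readers")

    # Identifies the Aurora cluster's reader count as the thing being scaled.
    resource = {
        "ServiceNamespace": "rds",
        "ResourceId": f"cluster:{cluster_id}",
        "ScalableDimension": "rds:cluster:ReadReplicaCount",
    }
    target = {**resource, "MinCapacity": min_readers, "MaxCapacity": max_readers}
    policy = {
        **resource,
        "PolicyName": f"{cluster_id}-reader-cpu-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "RDSReaderAverageCPUUtilization",
            },
            "ScaleInCooldown": 300,   # seconds, assumed
            "ScaleOutCooldown": 300,  # seconds, assumed
        },
    }
    return target, policy


def apply_reader_autoscaling(cluster_id: str) -> None:
    """Send both requests; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("application-autoscaling")
    target, policy = aurora_reader_autoscaling_requests(cluster_id)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
```

Keeping the parameter builder separate from the API call makes the policy easy to review or unit-test before anything is applied to a real cluster.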

Isolated reporting and analytics
- For heavy reporting, BI, or batch jobs, you can create a replica with its own instance class—potentially larger than the writer—and point reporting tools at it instead of scaling up the entire cluster or disturbing OLTP latency.
- This is often cheaper than moving the writer to a much larger instance just to satisfy periodic reporting spikes.¹

Failover options (with caveats)
- Some engines and configurations support promoting a read replica to become the new primary during an incident, but you should treat this as a deliberate failover design, not a free substitute for Multi-AZ.

Cost considerations

- Each read replica is billed like another DB instance (and associated storage, where applicable), plus any cross-AZ or cross-region data transfer used for replication.¹
- For Aurora, replicas share storage but still incur instance-hour charges and some replication overhead.
- If a replica is lightly used, the extra cost may outweigh any performance benefit; rightsizing or query/index tuning might be a better first step.
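
One way to make "lightly used" concrete is to compare a replica's average CPU and connection count against thresholds you choose. The sketch below is a hypothetical heuristic: the function name, the 10% CPU threshold, and the 5-connection threshold are assumptions for illustration, and the samples would come from whatever monitoring you already have (for example, CloudWatch datapoints exported as plain numbers).

```python
from statistics import mean


def is_lightly_used(
    cpu_percent: list[float],
    connections: list[float],
    cpu_threshold: float = 10.0,        # assumed threshold, percent
    connection_threshold: float = 5.0,  # assumed threshold, connections
) -> bool:
    """Return True when average CPU and average connections are both below their thresholds."""
    if not cpu_percent or not connections:
        raise ValueError("need at least one sample for each metric")
    return (
        mean(cpu_percent) < cpu_threshold
        and mean(connections) < connection_threshold
    )


# Example: a replica averaging about 3% CPU and 1 connection is flagged.
print(is_lightly_used([3.0, 4.5, 2.0], [1, 0, 2]))  # True
```

A flagged replica is a candidate for review, not automatic removal: check whether it exists for failover or periodic reporting before deleting it.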

In many cases, you’ll weigh “add replicas” vs. “scale up the primary”:

- Scaling up the primary is simpler operationally but increases write and read capacity together, which can be wasteful if the bottleneck is reads only.
- Adding replicas can be more cost-effective for read-heavy workloads, because you keep the writer at a modest size and add targeted read capacity. The trade-off is that replica reads can be slightly stale because of replication lag, and some queries may see slightly higher latency from cross-AZ hops.
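
The comparison can be roughed out with simple arithmetic. The sketch below compares instance-hour cost only; the hourly rates are made-up placeholders (not AWS prices), the 730-hour month is a common approximation, and storage, I/O, and data transfer charges are deliberately left out.

```python
HOURS_PER_MONTH = 730  # common approximation for monthly estimates


def monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Instance-hour cost per month for a number of identical instances."""
    return hourly_rate * HOURS_PER_MONTH * instances


def compare_options(
    current_hourly: float,
    larger_hourly: float,
    replica_hourly: float,
    replicas: int,
) -> dict[str, object]:
    """Compare 'scale up the primary' with 'keep the primary and add replicas'."""
    scale_up = monthly_cost(larger_hourly)
    add_replicas = monthly_cost(current_hourly) + monthly_cost(replica_hourly, replicas)
    cheaper = "scale_up" if scale_up < add_replicas else "add_replicas"
    return {"scale_up": scale_up, "add_replicas": add_replicas, "cheaper": cheaper}


# Placeholder rates: current writer $0.50/h, larger writer $2.00/h,
# two replicas at $0.50/h each.
print(compare_options(0.50, 2.00, 0.50, 2))
# {'scale_up': 1460.0, 'add_replicas': 1095.0, 'cheaper': 'add_replicas'}
```

With these placeholder numbers the replica option is cheaper, but the result flips easily once replicas need a larger instance class or replication crosses regions, so rerun it with your own rates.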

When not to add replicas

Avoid adding replicas when:

- The primary’s bottlenecks are clearly write throughput, storage IOPS, or poorly optimized queries, not read load.
- You do not have a clear plan for routing traffic (for example, your application always uses the writer endpoint), so the replica would sit idle.
- The primary is oversized and already has spare capacity; rightsize the instance before adding replicas.

How to route traffic to replicas

- Aurora – Use the built-in reader endpoint to distribute read connections across reader instances; keep writes and strongly consistent reads on the writer endpoint.
- RDS (non-Aurora) – Each read replica has its own endpoint, so distribute reads yourself (for example, via a DNS name you manage or app configuration) and update your application or ORM settings so read-only workloads connect there instead of the primary.
- Application-level routing – Many ORMs and frameworks support read/write splitting; configure them to send writes to the primary and read-mostly workloads (reporting, dashboards, background jobs) to replicas, and document these patterns so teams don’t accidentally route everything to the writer.
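
The routing rules above can be sketched as a small endpoint chooser. This is an illustrative stand-in for what an ORM's read/write splitting does, not any particular library's API: the `EndpointRouter` class, its method, and the hostnames are hypothetical. Writes and reads that must see the latest data go to the writer, and everything else rotates across the replica endpoints.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class EndpointRouter:
    """Pick a database endpoint for each query (hypothetical helper)."""

    writer: str
    readers: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Round-robin over replicas; None means there are no replicas configured.
        self._cycle = itertools.cycle(self.readers) if self.readers else None

    def endpoint_for(self, *, write: bool = False, needs_fresh_data: bool = False) -> str:
        """Return the writer for writes and strongly consistent reads, else a replica."""
        if write or needs_fresh_data or self._cycle is None:
            return self.writer
        return next(self._cycle)


# Placeholder hostnames. For Aurora, `readers` could hold just the reader endpoint.
router = EndpointRouter(
    writer="writer.db.example.internal",
    readers=["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
print(router.endpoint_for(write=True))             # writer.db.example.internal
print(router.endpoint_for())                       # replica-1.db.example.internal
print(router.endpoint_for())                       # replica-2.db.example.internal
print(router.endpoint_for(needs_fresh_data=True))  # writer.db.example.internal
```

Falling back to the writer when no replicas are configured keeps the application working if replicas are removed later, at the cost of silently sending reads back to the primary.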

Resources