Capacity Modes | DynamoDB cost optimization

What are capacity modes?

DynamoDB offers two capacity modes that control how you pay for read and write throughput: on-demand and provisioned.¹ The capacity mode you choose fundamentally impacts your costs, operational overhead, and how your application scales.

Capacity mode comparison

Feature	On-Demand	Provisioned
Pricing model	Pay per request	Pay for provisioned capacity (hourly)
Capacity planning	None required	Required (or use auto scaling)
Scaling	Automatic, instant	Manual or auto scaling
Best for	Variable, unpredictable workloads	Steady, predictable workloads
Idle cost	$0 for throughput (storage is still billed)	Full hourly rate regardless of usage
Unit cost	Higher per request	Lower per request
Reserved capacity	Not available	Available (up to 77% discount)

Key principle: On-demand trades higher per-request costs for zero capacity management. Provisioned trades lower per-request costs for capacity planning responsibility.²
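The trade-off can be made concrete with a break-even calculation. The sketch below is a simplified model, not an AWS pricing tool: the function name and both price parameters are hypothetical, the prices in the usage line are placeholders rather than real AWS rates, and it assumes every request fits in a single capacity unit.

```python
def breakeven_utilization(on_demand_price_per_million: float,
                          provisioned_price_per_unit_hour: float) -> float:
    """Fraction of a provisioned unit that must be used for both modes to cost the same.

    One provisioned unit can serve up to 3,600 single-unit requests per hour
    (one per second), so this compares the hourly price of that unit with the
    on-demand price of 3,600 requests.
    """
    on_demand_cost_full_hour = 3600 * on_demand_price_per_million / 1_000_000
    return provisioned_price_per_unit_hour / on_demand_cost_full_hour


# Placeholder prices for illustration only; substitute current prices for your Region.
ratio = breakeven_utilization(on_demand_price_per_million=1.00,
                              provisioned_price_per_unit_hour=0.0009)
print(f"Provisioned is cheaper above roughly {ratio:.0%} average utilization")
```

Below the break-even utilization, paying per request costs less; above it, provisioned capacity costs less under this model.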

On-demand capacity mode

What is on-demand mode?

On-demand mode is a truly serverless option that automatically scales to accommodate your workload without any capacity planning.³ You pay only for the read and write requests your application actually performs, with no idle capacity charges.

Key characteristics:

No capacity planning: Create tables and start using them immediately
Automatic scaling: Instantly accommodates traffic spikes up to double your previous peak
Pay-per-request: Only pay for what you use
Zero idle cost: If your table receives no traffic, you pay nothing for throughput
Default mode: Recommended for most modern applications³

Request units

On-demand mode charges based on Read Request Units (RRUs) and Write Request Units (WRUs):³

Unit	Definition	Cost Basis
Read Request Unit (RRU)	1 strongly consistent read request OR 2 eventually consistent read requests	Item up to 4 KB
Write Request Unit (WRU)	1 write request	Item up to 1 KB

Example: Reading a 10 KB item (strongly consistent read)

Item size: 10 KB
Capacity consumed: 10 KB ÷ 4 KB = 2.5 → 3 RRUs (round up)

Example: Writing a 3.5 KB item

Item size: 3.5 KB
Capacity consumed: 3.5 KB ÷ 1 KB = 3.5 → 4 WRUs (round up)
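The rounding in the two examples above can be sketched in a few lines of Python. The function names are illustrative, not part of any AWS SDK, and the sketch covers only standard reads and writes.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read request unit covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write request unit covers up to 1 KB


def read_request_units(item_bytes: int, strongly_consistent: bool = True) -> float:
    """Units consumed by a single read of an item of the given size."""
    units = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    # An eventually consistent read costs half as much as a strongly consistent one.
    return units if strongly_consistent else units / 2


def write_request_units(item_bytes: int) -> int:
    """Units consumed by a single write of an item of the given size."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))


print(read_request_units(10 * 1024))          # 3
print(write_request_units(int(3.5 * 1024)))   # 4
```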

Initial capacity:

New tables can sustain up to 4,000 writes/sec and 12,000 reads/sec

Automatic scaling:

Instantly accommodates up to 2× your previous peak traffic
If you previously hit 50,000 reads/sec, you can instantly sustain 100,000 reads/sec
That 100,000 becomes your new peak, enabling 200,000 reads/sec next time

Throttling rules:

If you exceed 2× your previous peak within 30 minutes, you may experience throttling
DynamoDB automatically allocates more capacity as traffic increases over time
To avoid throttling, either space traffic growth over 30 minutes or use warm throughput
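The doubling rule can be modelled roughly as follows. This is a deliberately simplified sketch: the function names are hypothetical, and it assumes each step ramps traffic to the full currently available ceiling and then waits about 30 minutes before the next step.

```python
def instantly_available(previous_peak: int) -> int:
    """Throughput an on-demand table can absorb right away: 2x the previous peak."""
    return 2 * previous_peak


def doublings_needed(previous_peak: int, target_rate: int) -> int:
    """Number of peak-doubling steps required before target_rate fits within 2x the peak."""
    if previous_peak <= 0:
        raise ValueError("previous_peak must be positive")
    peak, steps = previous_peak, 0
    while target_rate > 2 * peak:
        peak *= 2      # traffic reached the ceiling, which becomes the new peak
        steps += 1
    return steps


print(instantly_available(50_000))         # 100000
print(doublings_needed(50_000, 400_000))   # 2
```

A result of 0 means the target rate already fits within twice the previous peak. Any higher result suggests spacing the ramp-up across that many intervals, or using warm throughput.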

Maximum throughput limits

By default, on-demand tables are capped by throughput quotas, and you can set your own lower limits to protect against runaway costs:³

Default quota: 40,000 read request units and 40,000 write request units per table; the quota applies per account per Region and can be raised on request
Configurable table limits: You can set per-table or per-index maximum throughput to prevent cost overruns

Setting table-level limits:

aws dynamodb update-table \
  --table-name MyTable \
  --on-demand-throughput '{
    "MaxReadRequestUnits": 10000,
    "MaxWriteRequestUnits": 5000
  }'

This feature helps:

Prevent unexpected cost spikes from bugs or attacks
Enforce budget controls on specific tables
Protect multi-tenant systems from individual table overconsumption

Choose on-demand when:

You have unpredictable or variable traffic patterns
You’re launching a new application without historical traffic data
You have spiky workloads (e.g., daily batch jobs, event-driven processing)
You want zero operational overhead for capacity management
Your workload has low or intermittent traffic (paying for idle capacity would be wasteful)
You need rapid development iteration without capacity tuning

Provisioned capacity mode

What is provisioned mode?

Provisioned mode requires you to specify read and write capacity in advance.⁴ You’re charged hourly for the provisioned capacity, regardless of whether you actually use it. This offers lower per-request costs but requires capacity planning.

Key characteristics:

Lower per-request cost: ~50-70% cheaper than on-demand at high utilization rates
Hourly billing: Pay for provisioned capacity every hour, even if unused
Capacity planning required: Must estimate throughput needs
Auto scaling available: Can automatically adjust capacity based on utilization
Reserved capacity available: Purchase 1 or 3-year commitments for up to 77% discount (see Reserved Capacity)

Capacity units

Provisioned mode uses Read Capacity Units (RCUs) and Write Capacity Units (WCUs):⁴

Unit	Definition	Cost Basis
Read Capacity Unit (RCU)	1 strongly consistent read/sec OR 2 eventually consistent reads/sec	Item up to 4 KB
Write Capacity Unit (WCU)	1 write/sec	Item up to 1 KB

Example: Calculate required RCUs

Requirement: 80 strongly consistent reads/sec, item size 3 KB

Step 1: Capacity per item
  Item size: 3 KB
  Capacity per read: 3 KB ÷ 4 KB = 0.75 → 1 RCU (round up)

Step 2: Total capacity needed
  Reads per second: 80
  Total RCUs: 80 reads/sec × 1 RCU = 80 RCUs

Provisioned capacity: 80 RCUs

Example: Calculate required WCUs

Requirement: 100 writes/sec, item size 512 bytes

Step 1: Capacity per item
  Item size: 512 bytes (0.5 KB)
  Capacity per write: 0.5 KB ÷ 1 KB = 0.5 → 1 WCU (round up)

Step 2: Total capacity needed
  Writes per second: 100
  Total WCUs: 100 writes/sec × 1 WCU = 100 WCUs

Provisioned capacity: 100 WCUs
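The two worked examples follow the same two-step pattern, which can be sketched as a pair of helper functions. The names are illustrative and not part of any AWS SDK. The eventually consistent branch applies the 2-reads-per-RCU rule from the table above.

```python
import math


def required_rcus(reads_per_sec: int, item_bytes: int,
                  strongly_consistent: bool = True) -> int:
    """RCUs to provision for a steady read rate at a given item size."""
    per_read = max(1, math.ceil(item_bytes / 4096))   # Step 1: round up to 4 KB units
    total = reads_per_sec * per_read                  # Step 2: multiply by the read rate
    if not strongly_consistent:
        total = math.ceil(total / 2)                  # eventually consistent reads cost half
    return total


def required_wcus(writes_per_sec: int, item_bytes: int) -> int:
    """WCUs to provision for a steady write rate at a given item size."""
    per_write = max(1, math.ceil(item_bytes / 1024))  # Step 1: round up to 1 KB units
    return writes_per_sec * per_write                 # Step 2: multiply by the write rate


print(required_rcus(80, 3 * 1024))   # 80
print(required_wcus(100, 512))       # 100
```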

Burst capacity

Provisioned mode includes burst capacity, a built-in buffer that helps handle short traffic spikes without throttling.⁴ This is a key advantage not available in on-demand mode.

How burst capacity works:

DynamoDB reserves up to 5 minutes of unused read and write capacity
This reserved capacity accumulates when your table isn’t fully utilizing its provisioned throughput
During traffic spikes, burst capacity is consumed first before throttling occurs
Burst capacity is automatic and requires no configuration

Key benefits:

Graceful handling of spikes: Absorb brief traffic surges without over-provisioning
Cost efficiency: Don’t need to provision for short-term peaks
Automatic protection: Works transparently without configuration
Complements auto scaling: Handles immediate spikes while auto scaling responds (takes 2-5 minutes)

Limitations:

Only 5 minutes maximum: Cannot handle sustained increases beyond provisioned capacity
Not guaranteed: Burst capacity may not always be available if recently consumed
Not a replacement for proper capacity planning: Should be viewed as a buffer, not primary capacity

Tip: Burst capacity allows you to provision for average traffic rather than peak traffic, reducing costs while still handling brief spikes. For longer spikes, use auto scaling or switch to on-demand mode.
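As a rough illustration of how far burst capacity stretches, the sketch below treats it as a simple token bucket. This is an assumption made for illustration: real burst behavior is not guaranteed, and the function and parameter names are hypothetical.

```python
import math

BURST_WINDOW_SECONDS = 300  # up to 5 minutes of unused capacity is retained


def spike_duration_seconds(provisioned: int, baseline: int, spike: int) -> float:
    """Seconds a spike can be absorbed before throttling, under a token-bucket model.

    provisioned: provisioned units per second
    baseline:    units per second consumed during the 5 minutes before the spike
    spike:       units per second consumed during the spike
    """
    if spike <= provisioned:
        return math.inf  # within provisioned capacity, no burst needed
    saved = max(0, provisioned - baseline) * BURST_WINDOW_SECONDS
    drain_per_second = spike - provisioned
    return saved / drain_per_second


print(spike_duration_seconds(provisioned=100, baseline=60, spike=200))  # 120.0
```

In this model, a table provisioned at 100 units that has been consuming 60 banks 12,000 unused units, enough to cover a spike to 200 units for about two minutes.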

When to use provisioned mode

Choose provisioned mode when:

You have steady, predictable traffic patterns
You can accurately forecast capacity requirements
You’re willing to manage capacity settings (or configure auto scaling)

Note: For long-term, steady-state workloads, consider purchasing reserved capacity to save up to 77% on provisioned capacity costs. Reserved capacity requires a 1 or 3-year commitment but can dramatically reduce costs for predictable workloads.

DynamoDB auto scaling

Auto scaling automatically adjusts provisioned capacity in response to traffic changes.⁵ This combines the low cost of provisioned mode with reduced operational overhead.

How auto scaling works:

Set minimum and maximum capacity bounds
- Example: 10-500 RCUs, 5-250 WCUs
Set target utilization percentage
- Recommended: 70%
DynamoDB publishes consumption metrics to CloudWatch
When utilization stays above the target for 2 consecutive minutes (scale-in typically waits for a longer sustained period below the target, about 15 minutes):
- CloudWatch alarm triggers
- Application Auto Scaling adjusts table capacity
- Table scales up or down to approach target utilization
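The steps above amount to choosing a capacity at which current consumption equals the target utilization, then clamping it to the configured bounds. The sketch below shows that arithmetic only. It is a simplification, not the actual Application Auto Scaling algorithm, and the function name is hypothetical.

```python
def target_capacity(consumed_units: int, target_percent: int,
                    min_capacity: int, max_capacity: int) -> int:
    """Capacity at which consumed_units equals target_percent utilization, within bounds."""
    # Integer ceiling division avoids floating-point rounding surprises.
    desired = -(-consumed_units * 100 // target_percent)
    return max(min_capacity, min(max_capacity, desired))


# Bounds and target taken from the example values above: 10-500 RCUs at 70%.
print(target_capacity(140, 70, 10, 500))   # 200
print(target_capacity(700, 70, 10, 500))   # 500 (clamped to the maximum)
print(target_capacity(3, 70, 10, 500))     # 10  (clamped to the minimum)
```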

Configuration example:

aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --resource-id "table/MyTable" \
  --scalable-dimension "dynamodb:table:ReadCapacityUnits" \
  --min-capacity 10 \
  --max-capacity 500

aws application-autoscaling put-scaling-policy \
  --service-namespace dynamodb \
  --resource-id "table/MyTable" \
  --scalable-dimension "dynamodb:table:ReadCapacityUnits" \
  --policy-name "MyReadScalingPolicy" \
  --policy-type "TargetTrackingScaling" \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
    }
  }'

Auto scaling best practices:

Target utilization: Use 70% as recommended by AWS⁴
Minimum capacity: Set high enough to handle baseline traffic (avoid constant scaling)
Maximum capacity: Set to account limits or budget constraints
Apply uniformly: If using GSIs, apply auto scaling to both table and indexes
Monitor CloudWatch alarms: Auto scaling may take a few minutes to respond to spikes

Auto scaling limitations:

Scaling delay: CloudWatch alarms may take 2-5 minutes to trigger
Not instantaneous: Cannot handle sudden 10× spikes as well as on-demand
Manual cleanup required: Deleting a table doesn’t auto-delete scaling policies
Scaling limits: Cannot scale faster than 4× in a single scaling event

Switching between capacity modes

DynamoDB allows switching between capacity modes, but with limits:⁶

Switching rules

Switch Direction	Frequency Limit
Provisioned → On-Demand	Up to 4 times per 24-hour rolling window
On-Demand → Provisioned	Unlimited (any time)
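A deployment script could check the rolling-window rule before it attempts a switch. The sketch below is a client-side check under the limits in the table above. The function name is hypothetical, and the switch history is assumed to be tracked by the caller.

```python
from datetime import datetime, timedelta
from typing import Iterable

MAX_SWITCHES = 4               # provisioned -> on-demand switches allowed per window
WINDOW = timedelta(hours=24)   # rolling window length


def can_switch_to_on_demand(previous_switches: Iterable[datetime], now: datetime) -> bool:
    """True if fewer than MAX_SWITCHES switches happened in the last 24 hours."""
    recent = [t for t in previous_switches if now - t < WINDOW]
    return len(recent) < MAX_SWITCHES


now = datetime(2024, 1, 2, 12, 0)
history = [now - timedelta(hours=h) for h in (1, 5, 9, 30)]
print(can_switch_to_on_demand(history, now))  # True: only 3 switches fall inside the window
```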

Switching from provisioned to on-demand

What happens:⁶

Auto scaling settings deleted (console) or preserved (CLI/SDK)
Table continues delivering throughput during transition (takes several minutes)
New on-demand table guaranteed to sustain at least previous provisioned capacity or 4,000 WRUs/12,000 RRUs, whichever is higher

Example transitions:

Case 1: Table provisioned at 100 RCU, 50 WCU

→ After switch: Can sustain at least 12,000 reads/sec, 4,000 writes/sec

Case 2: Table provisioned at 10,000 RCU, 5,000 WCU

→ After switch: Can sustain at least 12,000 reads/sec (the new-table baseline is higher than 10,000 RCU), 5,000 writes/sec

Case 3: Table previously had 50,000 RCU but currently at 100 RCU

→ After switch: Can sustain at least 50,000 reads/sec (historical peak preserved)
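The "whichever is higher" rule can be written as a small function applied separately to reads and writes. This is a sketch of the rule as stated in this section. The function name is hypothetical, and the inputs are the highest capacities ever provisioned on the table.

```python
NEW_TABLE_READS = 12_000   # reads/sec a newly created on-demand table can sustain
NEW_TABLE_WRITES = 4_000   # writes/sec a newly created on-demand table can sustain


def sustained_after_switch(peak_provisioned_rcu: int, peak_provisioned_wcu: int) -> tuple:
    """Minimum (reads/sec, writes/sec) the table can sustain after switching to on-demand."""
    return (max(peak_provisioned_rcu, NEW_TABLE_READS),
            max(peak_provisioned_wcu, NEW_TABLE_WRITES))


print(sustained_after_switch(100, 50))      # Case 1: (12000, 4000)
print(sustained_after_switch(50_000, 50))   # Case 3 reads, with an assumed 50 WCU: (50000, 4000)
```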

When to switch:

Traffic becomes unpredictable or spiky
You want to eliminate capacity management overhead
You’re over-provisioned and paying for unused capacity
Application is in development/testing phase

Switching from on-demand to provisioned