Global Secondary Indexes (GSI) | DynamoDB cost optimization

What is a Global Secondary Index?

A global secondary index (GSI) reshapes the same table data with a completely different partition key and optional sort key so you can run Query operations that the base table cannot answer efficiently.¹ Unlike LSIs, GSIs may be added or removed after table creation, can span every partition in the table, and only support eventually consistent reads because they are updated asynchronously.

Own key schema: Each GSI key attribute can be any top-level String, Number, or Binary attribute, even if it is not a key on the base table.¹
Own throughput + storage: Each GSI consumes read/write throughput separately from the table (its own provisioned capacity, or separately metered requests in on-demand mode) and has its own partitions, so it can scale or bottleneck separately from the table.
Optional projections: You decide which non-key attributes get copied into the index to keep reads fast without always touching the base table.

Attribute type restriction: GSI keys must be scalar types (String, Number, Binary). You cannot use Sets, Lists, or Maps as GSI keys, even if they exist in your base table. If you need to index a list item, extract it into a separate scalar attribute first.
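As a minimal sketch of that extraction step, the helper below copies the first element of a list attribute into a scalar attribute before the item is written. The attribute names `Tags` and `PrimaryTag` are hypothetical and do not appear in the Orders example; substitute whatever list attribute you need to index.

```python
def with_primary_tag(item: dict) -> dict:
    """Return a copy of the item with the first list element copied to a scalar attribute.

    "Tags" and "PrimaryTag" are hypothetical attribute names used for illustration.
    """
    enriched = dict(item)
    tags = item.get("Tags") or []
    if tags:
        # A GSI can use PrimaryTag as a key because it is a top-level string.
        enriched["PrimaryTag"] = str(tags[0])
    return enriched


order = {"CustomerID": "CUST-001", "OrderID": "ORD-101", "Tags": ["gift", "expedited"]}
print(with_primary_tag(order)["PrimaryTag"])
```

Items with an empty or missing list simply get no `PrimaryTag`, which also keeps them out of any index keyed on it.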

The problem GSIs solve

A table’s primary key limits you to queries that include the base partition key. If a new access pattern needs a different partition key (for example, “recent orders across all customers” or “highest-balance accounts per region”), you are stuck scanning or duplicating data.

How the GSI fixes it

Imagine an Orders table keyed by CustomerID (partition) and OrderID (sort). Your finance team needs to find all orders in a specific region sorted by date to reconcile revenue daily. A GSI with Region as the partition key and OrderDate as the sort key solves this—they can query by region and get results sorted chronologically.

Base table structure

CustomerID (PK)	OrderID (SK)	Region	Status	OrderDate	OrderTotal
CUST-001	ORD-101	us-west-2	PENDING	2025-11-21	$149.00
CUST-002	ORD-103	us-east-1	DELIVERED	2025-11-19	$89.99
CUST-003	ORD-104	us-west-2	SHIPPED	2025-11-21	$450.00

Base table queries are limited to:

“Get all orders for CUST-001” - efficient (uses partition key)
“Get all orders in us-east-1 sorted by date” - requires Scan (no efficient query path)

GSI: RegionDateIndex

GSI Key Schema:

Partition key: Region
Sort key: OrderDate

Region (PK)	OrderDate (SK)	CustomerID	OrderID	Status	OrderTotal
us-east-1	2025-11-19	CUST-002	ORD-103	DELIVERED	$89.99
us-east-1	2025-11-20	CUST-001	ORD-100	SHIPPED	$299.99
us-east-1	2025-11-20	CUST-002	ORD-102	SHIPPED	$599.50
us-west-2	2025-11-21	CUST-001	ORD-101	PENDING	$149.00
us-west-2	2025-11-21	CUST-003	ORD-104	SHIPPED	$450.00
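One way to add this index to an existing table is an UpdateTable call. The sketch below only builds the request parameters in the shape the low-level boto3 client's `update_table` method accepts; it assumes the table is named `Orders`, that both key attributes are strings, and that the table uses on-demand mode (a provisioned-mode table would also need a `ProvisionedThroughput` entry inside `Create`). The helper name is illustrative.

```python
def build_create_gsi_request(table_name, index_name, partition_key, sort_key):
    """Build UpdateTable parameters that add one GSI (string keys assumed)."""
    return {
        "TableName": table_name,
        # Every attribute used in the new key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": partition_key, "KeyType": "HASH"},
                        {"AttributeName": sort_key, "KeyType": "RANGE"},
                    ],
                    # ALL matches the index table shown above; choose a leaner
                    # projection if the queries need fewer attributes.
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


request = build_create_gsi_request("Orders", "RegionDateIndex", "Region", "OrderDate")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").update_table(**request)
```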

GSI enables efficient queries:

“All orders in us-east-1 sorted by date” - Query with Region = 'us-east-1'
“Orders in us-west-2 from Nov 20-22” - Query with Region = 'us-west-2' AND OrderDate BETWEEN '2025-11-20' AND '2025-11-22'
“Recent orders by region” - Can query any region with date ranges
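The date-range query above might be expressed as follows. This sketch builds parameters for the low-level boto3 client's `query` method and assumes the table is named `Orders`; the placeholders `#r` and `#d` are used so the expression does not depend on whether an attribute name collides with a DynamoDB reserved word.

```python
def build_region_date_query(region, start_date, end_date):
    """Build Query parameters for RegionDateIndex: one region, inclusive date range."""
    return {
        "TableName": "Orders",  # assumed table name
        "IndexName": "RegionDateIndex",
        "KeyConditionExpression": "#r = :region AND #d BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#r": "Region", "#d": "OrderDate"},
        "ExpressionAttributeValues": {
            ":region": {"S": region},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


params = build_region_date_query("us-west-2", "2025-11-20", "2025-11-22")
# With boto3 installed and credentials configured:
#   boto3.client("dynamodb").query(**params)
```

Results come back ordered by the index sort key (OrderDate) within the requested region.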

Without GSI vs With GSI

Need	Without GSI	With GSI (`RegionDateIndex`)
“All orders in `us-east-1` today”	Scan the table, filter every item, pay for irrelevant reads	Query the GSI with `Region = 'us-east-1'` and date range
“Top revenue orders per region”	Aggregate client-side after a Scan or export to analytics	Query the GSI by region, then rank by OrderTotal client-side (the index sorts by OrderDate, not OrderTotal)
“Multi-tenant throttling control”	Hot customers overload a single partition	Re-partition by region in a GSI, which spreads load as long as traffic is reasonably even across regions

Composite keys

Note: As of November 2025, DynamoDB supports Multi-Attribute Keys, which is now the recommended approach for querying by multiple attributes. Multi-attribute keys eliminate the need for string concatenation and preserve type safety. The composite key pattern documented below is provided for historical context and for teams maintaining existing implementations.

What are composite keys?

A composite key is a pattern where you concatenate multiple attribute values into a single string to work around DynamoDB’s single-attribute sort key limitation. For example, instead of having just OrderDate as your sort key, you create a single attribute containing Status#OrderDate (like SHIPPED#2025-11-21). Because DynamoDB orders string sort keys by their UTF-8 bytes, comparing left to right, this preserves the ordering of both parts as long as each part sorts correctly as text (for example, ISO 8601 dates or zero-padded numbers).

The delimiter (typically # or |) separates the components so your application can split them back out when reading. The key is that DynamoDB sorts left-to-right, so Status acts as the primary sort dimension and OrderDate as the secondary.
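The left-to-right ordering can be checked locally. The sketch below uses illustrative helper names and relies on Python's string comparison, which for these ASCII values gives the same order as a byte-wise comparison.

```python
DELIMITER = "#"


def build_status_date(status: str, order_date: str) -> str:
    """Join the two components; reject values that already contain the delimiter."""
    if DELIMITER in status or DELIMITER in order_date:
        raise ValueError("components must not contain the delimiter")
    return f"{status}{DELIMITER}{order_date}"


def parse_status_date(value: str) -> tuple:
    """Split a composite value back into (status, order_date)."""
    status, order_date = value.split(DELIMITER, 1)
    return status, order_date


keys = [
    build_status_date("SHIPPED", "2025-11-21"),
    build_status_date("PENDING", "2025-11-21"),
    build_status_date("SHIPPED", "2025-11-20"),
    build_status_date("DELIVERED", "2025-11-19"),
]
sorted_keys = sorted(keys)  # status first, then date within each status
for key in sorted_keys:
    print(key, parse_status_date(key))
```

Rejecting delimiter characters inside a component is a design choice in this sketch; without it, a value could no longer be split back unambiguously.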

When to use composite keys

Composite keys solve specific query patterns that a single attribute cannot:

Hierarchical filtering: Query all orders with a specific status, then sort by date within that status
Category navigation: Model product hierarchies like Electronics#Laptops#Gaming and use begins_with to drill down
Multi-level sorting: Sort first by priority, then by timestamp: URGENT#2025-11-21T10:00:00
Status transitions: Track workflows like PENDING#2025-11-20, SHIPPED#2025-11-21, DELIVERED#2025-11-22 within a partition

Example: Status-based order queries

Let’s extend the Orders table to support queries like “find all SHIPPED orders in us-east-1 sorted by date.”

Base table

CustomerID (PK)	OrderID (SK)	Region	Status	OrderDate	OrderTotal
CUST-001	ORD-101	us-west-2	PENDING	2025-11-21	$149.00
CUST-002	ORD-103	us-east-1	DELIVERED	2025-11-19	$89.99
CUST-003	ORD-104	us-west-2	SHIPPED	2025-11-21	$450.00

GSI: RegionStatusDateIndex

Key schema:

Partition key: Region
Sort key: StatusDate (composite attribute containing Status#OrderDate)

Application creates StatusDate on write:

item = {"Status": "SHIPPED", "OrderDate": "2025-11-21"}  # example order attributes
item['StatusDate'] = f"{item['Status']}#{item['OrderDate']}"
# Result: "SHIPPED#2025-11-21"

GSI data:

Region (PK)	StatusDate (SK)	CustomerID	OrderID	OrderTotal
us-east-1	DELIVERED#2025-11-19	CUST-002	ORD-103	$89.99
us-east-1	SHIPPED#2025-11-20	CUST-001	ORD-100	$299.99
us-east-1	SHIPPED#2025-11-21	CUST-002	ORD-102	$599.50
us-west-2	PENDING#2025-11-21	CUST-001	ORD-101	$149.00
us-west-2	SHIPPED#2025-11-21	CUST-003	ORD-104	$450.00

Query patterns enabled:

Query	DynamoDB Operation
All SHIPPED orders in us-east-1	`Region = 'us-east-1'` AND `begins_with(StatusDate, 'SHIPPED#')`
SHIPPED orders in us-east-1 on Nov 21	`Region = 'us-east-1'` AND `StatusDate = 'SHIPPED#2025-11-21'`
SHIPPED orders in us-west-2 between Nov 20-22	`Region = 'us-west-2'` AND `StatusDate BETWEEN 'SHIPPED#2025-11-20' AND 'SHIPPED#2025-11-22'`
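To see why these key conditions select the right rows, here is a small in-memory simulation over the GSI data above. It is not a DynamoDB client: the `query` function is a stand-in that applies a partition-key equality check plus a sort-key predicate and returns order IDs in sort-key order. The prefix includes the delimiter (`SHIPPED#`) so that a hypothetical status that merely starts with "SHIPPED" would not match.

```python
# (Region, StatusDate, OrderID) rows copied from the GSI data above.
GSI_ROWS = [
    ("us-east-1", "DELIVERED#2025-11-19", "ORD-103"),
    ("us-east-1", "SHIPPED#2025-11-20", "ORD-100"),
    ("us-east-1", "SHIPPED#2025-11-21", "ORD-102"),
    ("us-west-2", "PENDING#2025-11-21", "ORD-101"),
    ("us-west-2", "SHIPPED#2025-11-21", "ORD-104"),
]


def query(region, sort_key_condition):
    """Simulate a Query: exact partition key match plus a sort key predicate."""
    matches = [
        row for row in GSI_ROWS if row[0] == region and sort_key_condition(row[1])
    ]
    return [order_id for _, _, order_id in sorted(matches, key=lambda row: row[1])]


# begins_with(StatusDate, 'SHIPPED#')
shipped_east = query("us-east-1", lambda sk: sk.startswith("SHIPPED#"))
# StatusDate = 'SHIPPED#2025-11-21'
shipped_on_21 = query("us-east-1", lambda sk: sk == "SHIPPED#2025-11-21")
# StatusDate BETWEEN 'SHIPPED#2025-11-20' AND 'SHIPPED#2025-11-22' (inclusive)
shipped_west_range = query(
    "us-west-2", lambda sk: "SHIPPED#2025-11-20" <= sk <= "SHIPPED#2025-11-22"
)
print(shipped_east, shipped_on_21, shipped_west_range)
```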

Trade-offs

Benefits:

Enables complex queries without scanning the entire table
Works with standard DynamoDB query operators (begins_with, BETWEEN, >, <)
No special DynamoDB features required

Drawbacks:

Application code must build the composite string on write and parse it on read
Changing either component of a composite key changes the GSI sort key value, so DynamoDB must delete the old index entry and insert a new one. For example, updating an order’s status from PENDING to SHIPPED changes the GSI sort key from PENDING#2025-11-21 to SHIPPED#2025-11-21, which costs 2 WCUs on the index instead of 1 (in addition to the base table write)
Harder to debug in the console—you see SHIPPED#2025-11-21 instead of separate fields
Order matters: you cannot efficiently query by date across all statuses with Status#OrderDate

Attribute projections and write cost

DynamoDB always projects the base table keys plus the GSI keys, then lets you choose KEYS_ONLY, INCLUDE, or ALL to tune read versus write/storage cost.¹

Projection types comparison

Using the Orders table example, here’s what each projection type stores in the GSI:

Projection Type	What’s Stored in GSI	Storage Impact	Use When
`KEYS_ONLY`	Region, OrderDate, CustomerID, OrderID	Smallest (~50 bytes/item)	You only need to identify items, then fetch full details from base table
`INCLUDE` (OrderTotal)	Keys + OrderTotal	Medium (~60 bytes/item)	You frequently need OrderTotal but rarely need other attributes
`ALL`	All attributes from base table	Largest (~120 bytes/item)	Most queries need complete items and you want fastest reads

Projection comparison

Using the RegionDateIndex GSI example:

Attribute	KEYS_ONLY projection	INCLUDE (`OrderTotal`)	ALL projection
Region (GSI PK)	✅	✅	✅
OrderDate (GSI SK)	✅	✅	✅
CustomerID (table PK)	✅	✅	✅
OrderID (table SK)	✅	✅	✅
OrderTotal	❌	✅	✅
Status	❌	❌	✅
ShippingAddress	❌	❌	✅
PaymentMethod	❌	❌	✅

KEYS_ONLY: To get OrderTotal or other attributes, you need an extra GetItem call to the base table
INCLUDE: No extra fetch needed for OrderTotal, but other attributes still require a base table lookup
ALL: No fetches needed for any attribute, but the index stores roughly a full copy of every indexed item, so combined table + index storage is about 2x the base table alone
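The comparison above can be modeled as a function that returns the attributes an index entry would carry for a given projection. This is an illustrative model only, using the RegionDateIndex key names; the sample item's ShippingAddress and PaymentMethod values are made up.

```python
TABLE_KEYS = ("CustomerID", "OrderID")
INDEX_KEYS = ("Region", "OrderDate")


def index_attributes(item, projection_type, include=()):
    """Return the subset of item attributes stored in the index entry."""
    always_projected = set(TABLE_KEYS) | set(INDEX_KEYS)
    if projection_type == "KEYS_ONLY":
        wanted = always_projected
    elif projection_type == "INCLUDE":
        wanted = always_projected | set(include)
    elif projection_type == "ALL":
        wanted = set(item)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: value for name, value in item.items() if name in wanted}


order = {
    "CustomerID": "CUST-001", "OrderID": "ORD-100", "Region": "us-east-1",
    "OrderDate": "2025-11-20", "OrderTotal": "299.99", "Status": "SHIPPED",
    "ShippingAddress": "example address", "PaymentMethod": "example method",
}
keys_only = index_attributes(order, "KEYS_ONLY")
with_total = index_attributes(order, "INCLUDE", include=["OrderTotal"])
everything = index_attributes(order, "ALL")
print(sorted(keys_only), sorted(with_total), len(everything))
```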

Write cost implications

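Key principle: Each write pays the base table WCUs + Σ(WCUs of the GSIs the write actually changes). The examples below assume items of 1 KB or less, so each write to the table or to an index costs 1 WCU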

Example: Updating OrderTotal

Scenario: Update OrderTotal from $299.99 to $350.00 for ORD-100

GSI Configuration	WCUs Consumed	Reason
No GSI	1	Base table only
GSI with KEYS_ONLY	1	OrderTotal not projected, GSI unchanged
GSI with INCLUDE(OrderTotal)	2	Base table (1) + GSI update (1)
GSI with ALL	2	Base table (1) + GSI update (1)

Example: Updating a GSI key attribute

Scenario: Update Region from “us-east-1” to “us-west-2” for ORD-100

GSI Configuration	WCUs Consumed	Reason
GSI with Region as partition key	3	Base table (1) + Delete old GSI entry (1) + Insert new GSI entry (1)

Why 3 WCUs? Because the GSI partition key value changed, DynamoDB must:

Delete the old index entry (Region = “us-east-1”)
Create a new index entry (Region = “us-west-2”)
Update the base table
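The two update scenarios above follow one rule, sketched here as a small estimator. It is a simplified model under stated assumptions: items are 1 KB or smaller, the item exists in each index before and after the update, and the `Gsi` class and function name are illustrative rather than part of any AWS SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gsi:
    name: str
    key_attributes: frozenset        # GSI partition/sort key attribute names
    projected_attributes: frozenset  # non-key attributes copied into the index


def update_write_units(changed_attributes, gsis):
    """Estimate WCUs for updating one item of at most 1 KB."""
    changed = set(changed_attributes)
    units = 1  # the base table write
    for gsi in gsis:
        if changed & gsi.key_attributes:
            units += 2  # delete the old index entry, insert the new one
        elif changed & gsi.projected_attributes:
            units += 1  # rewrite the existing index entry in place
        # otherwise the index entry is untouched and costs nothing
    return units


index_keys = frozenset({"Region", "OrderDate"})
keys_only = Gsi("RegionDateIndex", index_keys, frozenset())
include_total = Gsi("RegionDateIndex", index_keys, frozenset({"OrderTotal"}))

print(update_write_units({"OrderTotal"}, []))               # no GSI
print(update_write_units({"OrderTotal"}, [keys_only]))      # not projected
print(update_write_units({"OrderTotal"}, [include_total]))  # projected attribute
print(update_write_units({"Region"}, [keys_only]))          # GSI key change
```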

Optimization tips:

Keep indexes lean (KEYS_ONLY or a short INCLUDE) for infrequent queries—the write cost stays close to the base table.
Project the full item (ALL) only when the workload would otherwise re-fetch every item, accepting higher write + storage charges.
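Remember the quota: across all of a table’s LSIs and GSIs you can project at most 100 attributes in total. The quota counts only attributes you name in INCLUDE projections (KEYS_ONLY and ALL are exempt), and an attribute projected into two indexes counts twice.²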

Consistency, replication, and back pressure

GSIs are updated asynchronously through an internal replication mechanism. If an index cannot keep up, DynamoDB throttles the base table writes even if the table itself has plenty of capacity—this is GSI back pressure.³

How GSI back pressure works

Base Table Write → Internal Async Replication → GSI Update
     ↓                                               ↓
  Success                                    Too slow / throttled
     ↓                                               ↓
     └───────────────────────────────────────────────┘
                    GSI Back Pressure
               (Base table gets throttled)

Design implications:

Size GSI write capacity separately (or monitor on-demand throughput) so index partitions keep pace.
Watch CloudWatch throttling reasons like IndexWriteProvisionedThroughputExceeded or IndexWriteKeyRangeThroughputExceeded to catch hot GSI partitions early.³
Typical replication lag: GSI updates are asynchronous and typically complete very quickly (often subsecond), but spikes in write traffic or throttling can cause delays. Design UIs to handle eventual consistency where a just-written item may not immediately appear in GSI queries (for example, show optimistic UI updates or “processing” indicators).
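When a workflow genuinely has to wait for a just-written item to show up in a GSI, one option is to poll with exponential backoff. The helper below is a sketch with injected callables rather than a real DynamoDB client: `run_query` stands for whatever function issues the index Query and returns its items, and `ORD-105` is a made-up order ID for the demonstration.

```python
import time


def wait_until_indexed(run_query, is_visible, attempts=5, base_delay=0.05,
                       sleep=time.sleep):
    """Poll run_query() until is_visible(result) is true or attempts run out."""
    for attempt in range(attempts):
        if is_visible(run_query()):
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.05s, 0.1s, 0.2s, ...
    return False


# Simulated index that only returns the new order on the third query.
responses = iter([[], [], ["ORD-105"]])
delays = []
found = wait_until_indexed(
    run_query=lambda: next(responses),
    is_visible=lambda items: "ORD-105" in items,
    sleep=delays.append,  # record delays instead of actually sleeping
)
print(found, delays)
```

Bounding the number of attempts keeps a throttled or lagging index from stalling the caller indefinitely; the caller can then fall back to a "processing" state.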

Capacity and cost modeling

Partition quotas: Individual partitions (whether in the base table or GSI) have throughput limits that can cause throttling even when aggregate table capacity appears sufficient. A poorly distributed GSI key can become the bottleneck.⁴
Per-index provisioning: Each individual GSI can provision up to 40k RCUs / 40k WCUs (default quota), the same limit that applies to the base table. This is a per-table and per-GSI limit, not an aggregate across all indexes.²
Index counts: Tables can host up to 20 GSIs and 5 LSIs; exceeding those limits requires redesign.²

Writes that affect a GSI (new items, updates to GSI keys or projected attributes, or deletes) pay table WCU + Σ(affected index WCU). Writes that don’t touch GSI keys or projected attributes only consume base table WCUs. Before projecting extra attributes or adding a new GSI, multiply the write frequency by the extra WCUs to see if the win in read efficiency outweighs the new bill.

Cost example: 1 million writes/month

Setup:

Base table: 1M writes/month
On-demand pricing: ~$1.25 per million WRUs (approximate, region-specific, subject to change)
Item size: < 1KB

Configuration	WCUs per Write	Monthly WCUs	Monthly Cost
No GSI	1	1M	~$1.25
1 GSI (write doesn’t affect GSI)	1	1M	~$1.25
1 GSI (INCLUDE, projected attr updated)	1 + 1 = 2	2M	~$2.50
1 GSI (GSI key updated)	1 + 2 = 3	3M	~$3.75
2 GSIs (both affected by write)	1 + 1 + 1 = 3	3M	~$3.75
3 GSIs (all affected by write)	1 + 1 + 1 + 1 = 4	4M	~$5.00

Key insights:

GSI write costs only apply when the write affects the GSI (new item, updated key/projected attributes, or deletes)
Writes that don’t touch GSI key or projected attributes don’t consume GSI WCUs
GSI key changes require delete+insert (2 WCUs) to the GSI, plus 1 WCU for the base table
Costs scale linearly with the number of indexes affected by each write operation
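The table's arithmetic is simple enough to keep in a helper when comparing designs. The price constant below is the approximate figure from the setup above, not a current quote, and the function name is illustrative.

```python
PRICE_PER_MILLION_WRITE_UNITS = 1.25  # approximate USD figure from the setup above


def monthly_write_cost(writes_per_month, units_per_write,
                       price_per_million=PRICE_PER_MILLION_WRITE_UNITS):
    """Monthly cost = writes x write units per write x price per million units."""
    return writes_per_month * units_per_write / 1_000_000 * price_per_million


scenarios = {
    "No GSI": 1,
    "1 GSI, projected attribute updated": 1 + 1,
    "1 GSI, GSI key updated": 1 + 2,
    "3 GSIs, all affected": 1 + 1 + 1 + 1,
}
for label, units in scenarios.items():
    print(f"{label}: ${monthly_write_cost(1_000_000, units):.2f}")
```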

When NOT to use a GSI

Low query frequency: If you only query this access pattern once a month, the ongoing write and storage costs likely exceed the occasional Scan cost
Already near 20 GSI limit: Consider if multiple access patterns can share a single overloaded GSI instead
Write-heavy, read-light workloads: A GSI can double (or more) the cost of every write that touches it, but provides no benefit if you rarely query it
Data you can precompute elsewhere: If you can aggregate data in a scheduled job and store results in S3 or another table, that may be cheaper than maintaining a real-time GSI

Design tips and common gotchas

Model the new key first: Sketch the GSI’s partition/sort key combinations and ensure they align with uniform traffic. Avoid low-cardinality partitions such as status alone; prepend tenant/account to spread writes.
Leverage sparseness: GSIs automatically exclude items without the indexed attributes, reducing storage costs. Index only items that need the access pattern. For example, if only 10% of orders have a RefundDate, a GSI on RefundDate stores only those items rather than the entire table.
Query and Scan only: GSIs support only Query and Scan operations. You cannot use GetItem or BatchGetItem on indexes; use the base table for direct item lookups by key.
Backfill impact: Creating a new GSI triggers a full-table scan and backfill that can take hours for large tables. During backfill, the GSI is in CREATING state and cannot be queried. Plan for low-traffic windows and temporarily raise write capacity to avoid throttling your production workload.
Index changes can block other table updates: While a GSI is being created or deleted, other UpdateTable operations on the same table may be rejected. Plan GSI changes for maintenance windows, not for periods when the table is being changed frequently.
Deleting GSIs is permanent: Unlike creating a GSI (which backfills existing data), deleting one immediately removes all index data with no recovery option. Ensure you’ve verified the GSI is truly unused before deletion.
Monitor index storage: Because GSIs are sparse (items missing the indexed attributes aren’t copied), you can often keep the index significantly smaller than the table by indexing only the subset you care about. Check the index size that DescribeTable reports (IndexSizeBytes) to confirm it stays that way.
Test failure modes: Simulate hot partitions and verify alarms fire before ProvisionedThroughputExceededException hits production.
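The sparseness tip above can be illustrated with a small in-memory model: only items that carry every index key attribute get an index entry. The function is a stand-in for DynamoDB's behavior, and the sample orders and RefundDate values are made up.

```python
def sparse_index(items, partition_key, sort_key=None):
    """Return the items that would appear in a GSI with the given key schema."""
    required = [partition_key] + ([sort_key] if sort_key else [])
    return [item for item in items if all(attr in item for attr in required)]


orders = [
    {"CustomerID": "CUST-001", "OrderID": "ORD-101"},
    {"CustomerID": "CUST-002", "OrderID": "ORD-103", "RefundDate": "2025-11-22"},
    {"CustomerID": "CUST-003", "OrderID": "ORD-104"},
]
refund_index = sparse_index(orders, partition_key="RefundDate")
print(len(orders), len(refund_index))
```

Writing the indexed attribute only on the items you want to query, and removing it when they no longer qualify, is what keeps such an index small.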