Import from S3

Use DynamoDB's native S3 import to load massive datasets without consuming write capacity units (WCUs).

What is DynamoDB S3 Import?

DynamoDB’s native S3 import feature allows you to directly import data from Amazon S3 into a DynamoDB table without consuming any write capacity units (WCUs).[1] This is a purpose-built feature for initial data loads, migrations, and large-scale data ingestion scenarios.

Key benefits:

  • Zero WCU consumption - Does not use your provisioned or on-demand write capacity
  • Cost-effective at scale - Dramatically cheaper than writing items individually
  • No throttling concerns - Runs independently of your table’s write capacity
  • High throughput - AWS manages parallel loading across partitions

The cost problem with traditional bulk loading

Many teams unknowingly waste significant money when importing large datasets into DynamoDB by using traditional methods that consume expensive write capacity.

Traditional approaches (costly)

Using AWS Glue, EMR, or custom ETL pipelines:

When you use AWS Glue, EMR, Spark jobs, Lambda functions, or custom ETL processes to bulk-load data into DynamoDB, every single record write consumes WCUs from your table. This approach has significant downsides:

Cost implications:

  • Every item write consumes WCUs - Whether provisioned or on-demand, you pay for every write operation
  • Compute costs - You also pay for the Glue jobs, EMR clusters, or Lambda executions orchestrating the writes
  • Throttling risks - Large writes can hit capacity limits, requiring retries and even more compute time
  • Scaling requirements - Need to provision sufficient write capacity temporarily, then scale down after import

When importing millions or billions of records, the WCU costs alone can be substantial. Traditional ETL-based bulk loading treats each record as an individual write operation, which is the most expensive way to import data at scale.
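The gap is easy to see with back-of-the-envelope arithmetic. The prices below are illustrative assumptions only (roughly us-east-1 list prices at one point in time; check current AWS pricing before relying on them):

```python
import math

# Illustrative prices only (assumptions, not quoted from this article):
ON_DEMAND_WRITE_PER_MILLION = 1.25  # USD per million write request units
IMPORT_PER_GB = 0.15                # USD per GB of uncompressed source data

def traditional_write_cost(item_count, item_size_kb=1.0):
    """Each item write consumes ceil(size / 1 KB) write request units."""
    wrus = item_count * math.ceil(item_size_kb)
    return wrus / 1_000_000 * ON_DEMAND_WRITE_PER_MILLION

def s3_import_cost(item_count, item_size_kb=1.0):
    """S3 import is billed per GB of source data, not per item written."""
    gb = item_count * item_size_kb / 1_000_000
    return gb * IMPORT_PER_GB

# 100 million 1 KB items:
print(traditional_write_cost(100_000_000))        # 125.0
print(round(s3_import_cost(100_000_000), 2))      # 15.0
```

Under these assumed prices, per-item writes cost roughly an order of magnitude more than the import, before counting the Glue/EMR/Lambda compute orchestrating those writes.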

How it works

The S3 import process is straightforward:

  1. Prepare your data in S3 - Export or store data in supported formats
  2. Configure the import - Specify source bucket, format, and target table settings
  3. AWS handles the import - DynamoDB reads from S3 and loads data in parallel
  4. Monitor progress - Track import status via CloudWatch or the console
  5. Import completes - Data is available in your table, no WCUs consumed
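Step 2 above can be sketched with boto3's `import_table` call. The bucket, key prefix, key attribute, and table name below are placeholder assumptions, not values from this article:

```python
# Sketch of building an ImportTable request. Names are placeholders.
def build_import_request(bucket, prefix, table_name):
    """Assemble parameters for DynamoDB's ImportTable API.

    The import creates a new table, so the request carries the full
    table definition alongside the S3 source location.
    """
    return {
        "S3BucketSource": {"S3Bucket": bucket, "S3KeyPrefix": prefix},
        "InputFormat": "DYNAMODB_JSON",
        "InputCompressionType": "GZIP",
        "TableCreationParameters": {
            "TableName": table_name,
            "AttributeDefinitions": [
                {"AttributeName": "pk", "AttributeType": "S"},
            ],
            "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
            "BillingMode": "PAY_PER_REQUEST",
        },
    }

# To actually start the import (requires AWS credentials):
# import boto3
# dynamodb = boto3.client("dynamodb")
# resp = dynamodb.import_table(**build_import_request("my-bucket", "exports/", "my-table"))
# print(resp["ImportTableDescription"]["ImportArn"])
```

The returned `ImportArn` identifies the job for later status checks.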

Supported formats:[1]

  • DynamoDB JSON - Native format from DynamoDB exports
  • Amazon Ion - Richly typed, self-describing superset of JSON with text and binary encodings
  • CSV - Common delimited text format
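DynamoDB JSON stores one item per line, with each attribute wrapped in a type descriptor. A minimal sketch of that shape (covering only string and number attributes; a real exporter handles all DynamoDB types):

```python
import json

def to_dynamodb_json_line(item):
    """Wrap a plain dict as one DynamoDB JSON line: {"Item": {name: {type: value}}}."""
    def encode(value):
        if isinstance(value, str):
            return {"S": value}       # string attribute
        if isinstance(value, (int, float)):
            return {"N": str(value)}  # numbers are carried as strings
        raise TypeError(f"unsupported type: {type(value)!r}")
    return json.dumps({"Item": {k: encode(v) for k, v in item.items()}})

print(to_dynamodb_json_line({"pk": "user#1", "score": 42}))
# {"Item": {"pk": {"S": "user#1"}, "score": {"N": "42"}}}
```

This is the same line format DynamoDB's own export-to-S3 feature produces, which is why exports can be re-imported without transformation.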

Best practices

To maximize savings and efficiency with S3 import:

  1. Compress your source data - GZIP- or ZSTD-compressed objects reduce S3 storage costs and transfer time
  2. Pre-process in S3 - Do transformations before import rather than during/after
  3. Leverage S3 Lifecycle policies - Automatically delete source data after successful import
  4. Monitor import metrics - Use CloudWatch to track progress and detect issues
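Practice 1 above amounts to compressing each object before uploading it to the source bucket. A small sketch using Python's standard library (the column names and rows are made-up examples):

```python
import csv
import gzip
import io

# Build a tiny CSV in memory, then GZIP it for upload to S3.
rows = [("pk", "score"), ("user#1", "42"), ("user#2", "17")]
buf = io.StringIO()
csv.writer(buf).writerows(rows)
payload = gzip.compress(buf.getvalue().encode("utf-8"))

# `payload` is the object body you would upload (e.g. as items.csv.gz).
# Round-trip check that the compressed object decodes cleanly:
print(gzip.decompress(payload).decode("utf-8").splitlines()[0])  # pk,score
```

When starting the import, the compression must also be declared (the API's `InputCompressionType` parameter) so DynamoDB knows to decompress the objects.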

Important limitations

Be aware of these constraints:

  • Import duration - Very large imports can take hours or days
  • Table inaccessible during import - The new table cannot be read from or written to until the import completes
  • Format requirements - Data must be in supported formats (DynamoDB JSON, Ion, CSV)
  • S3 permissions - Requires appropriate IAM roles for DynamoDB to access your S3 bucket
  • Imports create new tables only - You cannot import into an existing table
  • Concurrency limit - Up to 50 concurrent imports allowed per account
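Because large imports can run for hours, it is worth automating the status checks. A hedged sketch that polls the DescribeImport API; pass it a boto3 DynamoDB client and the `ImportArn` returned when the import was started (both are assumptions about your setup):

```python
import time

def wait_for_import(client, import_arn, poll_seconds=30):
    """Poll DescribeImport until the job leaves IN_PROGRESS, then return its status."""
    while True:
        desc = client.describe_import(ImportArn=import_arn)
        status = desc["ImportTableDescription"]["ImportStatus"]
        if status != "IN_PROGRESS":
            return status  # e.g. COMPLETED, FAILED, or CANCELLED
        time.sleep(poll_seconds)
```

Taking the client as a parameter keeps the helper testable without AWS credentials; a production version might also log `ProcessedItemCount` from the same response between polls.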

Footnotes

  1. Importing data from Amazon S3 to DynamoDB