Import from S3

Use DynamoDB's native S3 import to load massive datasets without consuming write capacity units (WCUs).

What is DynamoDB S3 Import?

DynamoDB’s native S3 import feature allows you to directly import data from Amazon S3 into a DynamoDB table without consuming any write capacity units (WCUs).[1] This is a purpose-built feature for initial data loads, migrations, and large-scale data ingestion scenarios.

Key benefits:

  • Zero WCU consumption - Does not use your provisioned or on-demand write capacity
  • Cost-effective at scale - Dramatically cheaper than writing items individually
  • No throttling concerns - Runs independently of your table’s write capacity
  • High throughput - AWS manages parallel loading across partitions

The cost problem with traditional bulk loading

Many teams unknowingly waste significant money when importing large datasets into DynamoDB by using traditional methods that consume expensive write capacity.

Traditional approaches (costly)

Using AWS Glue, EMR, or custom ETL pipelines:

When you use AWS Glue, EMR, Spark jobs, Lambda functions, or custom ETL processes to bulk-load data into DynamoDB, every single record write consumes WCUs from your table. This approach has significant downsides:

Cost implications:

  • Every item write consumes WCUs - Whether provisioned or on-demand, you pay for every write operation
  • Compute costs - You also pay for the Glue jobs, EMR clusters, or Lambda executions orchestrating the writes
  • Throttling risks - Large writes can hit capacity limits, requiring retries and even more compute time
  • Scaling requirements - Need to provision sufficient write capacity temporarily, then scale down after import

When importing millions or billions of records, the WCU costs alone can be substantial. Traditional ETL-based bulk loading treats each record as an individual write operation, which is the most expensive way to import data at scale.
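The gap is easy to see with back-of-the-envelope arithmetic. The prices below are illustrative assumptions only (roughly us-east-1 list prices at one point in time; check current AWS pricing before relying on them):

```python
import math

# Illustrative prices only (assumptions, not quoted from this article):
ON_DEMAND_WRITE_PER_MILLION = 1.25  # USD per million write request units
IMPORT_PER_GB = 0.15                # USD per GB of uncompressed source data

def traditional_write_cost(item_count, item_size_kb=1.0):
    """Each item write consumes ceil(size / 1 KB) write request units."""
    wrus = item_count * math.ceil(item_size_kb)
    return wrus / 1_000_000 * ON_DEMAND_WRITE_PER_MILLION

def s3_import_cost(item_count, item_size_kb=1.0):
    """S3 import is billed per GB of source data, not per item written."""
    gb = item_count * item_size_kb / 1_000_000
    return gb * IMPORT_PER_GB

# 100 million 1 KB items:
print(traditional_write_cost(100_000_000))        # 125.0
print(round(s3_import_cost(100_000_000), 2))      # 15.0
```

Under these assumed prices, per-item writes cost roughly an order of magnitude more than the import, before counting the Glue/EMR/Lambda compute orchestrating those writes.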

How it works

The S3 import process is straightforward:

  1. Prepare your data in S3 - Export or store data in supported formats
  2. Configure the import - Specify source bucket, format, and target table settings
  3. AWS handles the import - DynamoDB reads from S3 and loads data in parallel
  4. Monitor progress - Track import status via CloudWatch or the console
  5. Import completes - Data is available in your table, no WCUs consumed
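Step 2 above can be sketched with boto3's `import_table` call. The bucket, key prefix, key attribute, and table name below are placeholder assumptions, not values from this article:

```python
# Sketch of building an ImportTable request. Names are placeholders.
def build_import_request(bucket, prefix, table_name):
    """Assemble parameters for DynamoDB's ImportTable API.

    The import creates a new table, so the request carries the full
    table definition alongside the S3 source location.
    """
    return {
        "S3BucketSource": {"S3Bucket": bucket, "S3KeyPrefix": prefix},
        "InputFormat": "DYNAMODB_JSON",
        "InputCompressionType": "GZIP",
        "TableCreationParameters": {
            "TableName": table_name,
            "AttributeDefinitions": [
                {"AttributeName": "pk", "AttributeType": "S"},
            ],
            "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
            "BillingMode": "PAY_PER_REQUEST",
        },
    }

# To actually start the import (requires AWS credentials):
# import boto3
# dynamodb = boto3.client("dynamodb")
# resp = dynamodb.import_table(**build_import_request("my-bucket", "exports/", "my-table"))
# print(resp["ImportTableDescription"]["ImportArn"])
```

The returned `ImportArn` identifies the job for later status checks.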

Supported formats:[1]

  • DynamoDB JSON - Native format from DynamoDB exports
  • Amazon Ion - Richly typed, self-describing superset of JSON with text and binary encodings
  • CSV - Common delimited text format
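DynamoDB JSON stores one item per line, with each attribute wrapped in a type descriptor. A minimal sketch of that shape (covering only string and number attributes; a real exporter handles all DynamoDB types):

```python
import json

def to_dynamodb_json_line(item):
    """Wrap a plain dict as one DynamoDB JSON line: {"Item": {name: {type: value}}}."""
    def encode(value):
        if isinstance(value, str):
            return {"S": value}       # string attribute
        if isinstance(value, (int, float)):
            return {"N": str(value)}  # numbers are carried as strings
        raise TypeError(f"unsupported type: {type(value)!r}")
    return json.dumps({"Item": {k: encode(v) for k, v in item.items()}})

print(to_dynamodb_json_line({"pk": "user#1", "score": 42}))
# {"Item": {"pk": {"S": "user#1"}, "score": {"N": "42"}}}
```

This is the same line format DynamoDB's own export-to-S3 feature produces, which is why exports can be re-imported without transformation.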

Best practices

To maximize savings and efficiency with S3 import:

  1. Compress your source data - GZIP- or ZSTD-compressed objects reduce S3 storage costs and transfer time
  2. Pre-process in S3 - Do transformations before import rather than during/after
  3. Leverage S3 Lifecycle policies - Automatically delete source data after successful import
  4. Monitor import metrics - Use CloudWatch to track progress and detect issues
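Practice 1 above amounts to compressing each object before uploading it to the source bucket. A small sketch using Python's standard library (the column names and rows are made-up examples):

```python
import csv
import gzip
import io

# Build a tiny CSV in memory, then GZIP it for upload to S3.
rows = [("pk", "score"), ("user#1", "42"), ("user#2", "17")]
buf = io.StringIO()
csv.writer(buf).writerows(rows)
payload = gzip.compress(buf.getvalue().encode("utf-8"))

# `payload` is the object body you would upload (e.g. as items.csv.gz).
# Round-trip check that the compressed object decodes cleanly:
print(gzip.decompress(payload).decode("utf-8").splitlines()[0])  # pk,score
```

When starting the import, the compression must also be declared (the API's `InputCompressionType` parameter) so DynamoDB knows to decompress the objects.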

Important limitations

Be aware of these constraints:

  • Import duration - Very large imports can take hours or days
  • Table inaccessible during import - The new table cannot be read from or written to until the import completes
  • Format requirements - Data must be in supported formats (DynamoDB JSON, Ion, CSV)
  • S3 permissions - Requires appropriate IAM roles for DynamoDB to access your S3 bucket
  • Imports create new tables only - You cannot import into an existing table
  • Concurrency limit - Up to 50 concurrent imports allowed per account
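Because large imports can run for hours, it is worth automating the status checks. A hedged sketch that polls the DescribeImport API; pass it a boto3 DynamoDB client and the `ImportArn` returned when the import was started (both are assumptions about your setup):

```python
import time

def wait_for_import(client, import_arn, poll_seconds=30):
    """Poll DescribeImport until the job leaves IN_PROGRESS, then return its status."""
    while True:
        desc = client.describe_import(ImportArn=import_arn)
        status = desc["ImportTableDescription"]["ImportStatus"]
        if status != "IN_PROGRESS":
            return status  # e.g. COMPLETED, FAILED, or CANCELLED
        time.sleep(poll_seconds)
```

Taking the client as a parameter keeps the helper testable without AWS credentials; a production version might also log `ProcessedItemCount` from the same response between polls.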

Footnotes

  1. Importing data from Amazon S3 to DynamoDB