Why Using DynamoDB for Relational Data Will Haunt Your Architecture (And What to Use Instead)

Co-authored by StackAdvisor.AI / Reviewed by APPGAMBiT Team

This post was created using APPGAMBiT's content engine, ContentForge.AI. It analyzes online discussions, identifies critical developer topics, and creates detailed technical drafts. Our editorial team reviews, fact-checks, and refines each post before publication. Get in touch to know more.

The fastest way to accumulate technical debt is to force a NoSQL database to behave like a relational one. This comprehensive guide reveals when DynamoDB makes sense—and when you're setting yourself up for years of operational pain.

You've heard the success stories: companies achieving millisecond response times at hyper-scale with DynamoDB. The marketing materials promise infinite scalability, automatic sharding, and worry-free operations. Now, you're building a new application, and DynamoDB seems like the modern, cloud-native choice. After all, why would you use "old-school" relational databases when NoSQL is the future?

Here's the uncomfortable truth: if your data has complex relationships, choosing DynamoDB is likely the most expensive architectural mistake you'll make. Not expensive in terms of AWS bills (though that can happen), but expensive in developer productivity, operational complexity, and the inevitable migration project 18 months down the road.

Key Question: When does it make sense to use DynamoDB, and when are you setting yourself up for months of operational pain? This article provides a decision framework based on real-world production experience.

The problem isn't DynamoDB itself. DynamoDB is one of the best databases available when used for its intended purpose. The problem is that we're drawn to DynamoDB's simplicity and scalability story and attempt to adapt it to use cases it was never designed to handle.

What DynamoDB Is Actually Designed For

Before we discuss what DynamoDB should not be used for, it's essential to understand where it shines. DynamoDB was built from the ground up to solve specific problems at Amazon.com scale (that is, Internet scale), and for those use cases, it's unmatched.

Key-Value Access Patterns

Applications that primarily retrieve items by a known key with predictable, well-defined queries. Think user sessions, shopping carts, or user profiles where you always know the user ID.

Consistent Performance at Scale

Applications requiring single-digit millisecond performance regardless of the dataset size. DynamoDB maintains consistent latency whether you have 1GB or 100TB of data.

Known Access Patterns

Use cases where all access patterns can be defined upfront and modeled into your partition and sort keys. No ad-hoc queries or flexible reporting requirements.

Eventual Consistency Tolerance

Use cases where eventual consistency is acceptable, or where you can use strongly consistent reads selectively for critical operations.
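For example, a single read can opt into strong consistency while everything else stays eventually consistent. A minimal sketch using the AWS SDK for JavaScript v2 DocumentClient (the table and key names here are illustrative, not from a real schema):

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// Strongly consistent read for a critical operation only
const account = await dynamodb.get({
  TableName: 'Accounts',            // illustrative table name
  Key: { AccountID: 'acct-123' },
  ConsistentRead: true              // default is false (eventually consistent)
}).promise();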

Low-ops and Pay-as-you-go Model

DynamoDB eliminates the need for users to provision, manage, or maintain servers. AWS handles all infrastructure management, including hardware provisioning, operating system updates, and scaling. With its pay-per-request billing model, users only pay for the resources they consume, making it cost-efficient for applications with fluctuating workloads.

Example DynamoDB Use Cases

Gaming Leaderboards

Store player scores with a composite key (partition key GameID, sort key Score#PlayerID). Query patterns are predictable: get a player's score, get the top N players for a game.

// Partition Key: GameID, Sort Key: Score#PlayerID
// (zero-pad the score so lexicographic sort order matches numeric order)
const params = {
  TableName: 'GameLeaderboard',
  KeyConditionExpression: 'GameID = :gameId',
  ScanIndexForward: false, // Descending order by sort key (highest scores first)
  Limit: 10,
  ExpressionAttributeValues: {
    ':gameId': 'game-12345'
  }
};

const topTen = await dynamodb.query(params).promise();

Session Management

Store user sessions with TTL for automatic expiration. Access pattern: get session by session ID, update session attributes.

// Partition Key: SessionID
const params = {
  TableName: 'UserSessions',
  Item: {
    SessionID: 'sess-abc123',
    UserID: 'user-456',
    LoginTime: Date.now(),
    TTL: Math.floor(Date.now() / 1000) + 3600, // Expire in 1 hour (TTL is epoch seconds)
    SessionData: { /* session attributes */ }
  }
};

await dynamodb.put(params).promise();

IoT Time-Series Data

Store sensor readings with DeviceID as partition key and timestamp as sort key. Query recent readings for a specific device efficiently.
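A typical query for this pattern might look like the following sketch; the table name, attribute names, and device ID are assumptions for illustration:

// Partition Key: DeviceID, Sort Key: Timestamp (ISO-8601 strings sort chronologically)
const recentReadings = await dynamodb.query({
  TableName: 'SensorReadings',
  KeyConditionExpression: 'DeviceID = :d AND #ts >= :since',
  ExpressionAttributeNames: { '#ts': 'Timestamp' }, // TIMESTAMP is a reserved word
  ExpressionAttributeValues: {
    ':d': 'device-42',
    ':since': new Date(Date.now() - 60 * 60 * 1000).toISOString() // last hour
  },
  ScanIndexForward: false // newest readings first
}).promise();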

User Profile Management

Store user profiles where each profile is independent and accessed by UserID. No complex relationships with other entities.

Notice a pattern above? All these use cases involve simple, predictable access patterns where you know the keys you'll query by. If your application doesn't fit this pattern, keep reading.

The Relational Data Anti-Pattern in DynamoDB

Now we arrive at the core of the problem: attempting to model complex relational data in DynamoDB. This is where well-intentioned developers create technical debt for years.

The File Management System Example

Consider a file management system with the following entities and relationships:

Users → own multiple folders
Folders → contain subfolders (nested hierarchy)
Folders → contain multiple files
Files → can be shared with multiple users (many-to-many)
Users → can collaborate on folders (many-to-many)

The Implementation Nightmare

Let's look at how you'd attempt to model this in DynamoDB using single-table design:

// DynamoDB Single-Table Design
// Single table with composite keys
// PK (Partition Key), SK (Sort Key), GSI1PK, GSI1SK for access patterns

const singleTableDesign = {
  TableName: 'FileManagementSystem',
  KeySchema: [
    { AttributeName: 'PK', KeyType: 'HASH' },
    { AttributeName: 'SK', KeyType: 'RANGE' }
  ],
  GlobalSecondaryIndexes: [
    {
      IndexName: 'GSI1',
      KeySchema: [
        { AttributeName: 'GSI1PK', KeyType: 'HASH' },
        { AttributeName: 'GSI1SK', KeyType: 'RANGE' }
      ]
    },
    // More GSIs needed for different access patterns...
  ]
};

// User item
{
  PK: 'USER#user-123',
  SK: 'METADATA',
  EntityType: 'User',
  UserName: 'john.doe',
  Email: 'john@example.com'
}

// Folder item
{
  PK: 'USER#user-123',
  SK: 'FOLDER#folder-456',
  EntityType: 'Folder',
  FolderName: 'Projects',
  ParentFolderID: null,
  GSI1PK: 'FOLDER#folder-456', // For querying folder contents
  GSI1SK: 'METADATA'
}

// File item
{
  PK: 'FOLDER#folder-456',
  SK: 'FILE#file-789',
  EntityType: 'File',
  FileName: 'document.pdf',
  FileSize: 1024000,
  GSI1PK: 'FILE#file-789',
  GSI1SK: 'METADATA'
}

// Sharing relationship (many-to-many nightmare)
{
  PK: 'USER#user-999',
  SK: 'FILE#file-789',
  EntityType: 'FileShare',
  Permission: 'read',
  GSI1PK: 'FILE#file-789',
  GSI1SK: 'USER#user-999' // Reverse lookup
}

Why This Implementation Creates Problems

Query Complexity Explosion: Simple questions require multiple queries and client-side joins:

Question: "Show all files user-123 can access (owned + shared)"

  1. Query the user's folders (1 query)
  2. For each folder, query its files (N queries)
  3. Query the GSI for shared files (1 query)
  4. Merge the results in application code
  5. Handle pagination across multiple result sets

Hierarchical Data Nightmare: Nested folder structures require recursive queries or denormalization:

// To get all files under a parent folder (including subfolders):
async function getAllFilesInFolderHierarchy(folderID) {
  const files = [];
  const foldersToProcess = [folderID];
  
  while (foldersToProcess.length > 0) {
    const currentFolder = foldersToProcess.pop();
    
    // Query 1: Get subfolders
    // (querySubfolders and queryFilesInFolder are helpers that each wrap a Query call)
    const subfolders = await querySubfolders(currentFolder);
    foldersToProcess.push(...subfolders.map(f => f.FolderID));
    
    // Query 2: Get files in current folder
    const folderFiles = await queryFilesInFolder(currentFolder);
    files.push(...folderFiles);
  }
  
  return files;
  // This could result in dozens or hundreds of queries!
}
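For contrast, PostgreSQL answers the same question in one round-trip with a recursive CTE. A sketch assuming folders(folder_id, parent_folder_id) and files(file_id, folder_id) tables and the node-postgres client:

// One query instead of dozens: walk the folder subtree with WITH RECURSIVE
const { rows: allFiles } = await db.query(
  `WITH RECURSIVE subtree AS (
     SELECT folder_id FROM folders WHERE folder_id = $1
     UNION ALL
     SELECT f.folder_id
     FROM folders f
     JOIN subtree s ON f.parent_folder_id = s.folder_id
   )
   SELECT fi.*
   FROM files fi
   JOIN subtree s ON fi.folder_id = s.folder_id`,
  [folderID]
);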

Many-to-Many Relationships: Maintaining bidirectional access patterns requires duplicate data and complex update logic:

// Sharing a file with a user requires TWO writes in one transaction:
async function shareFile(fileID, userID, permission) {
  await dynamodb.transactWrite({
    TransactItems: [
      {
        Put: {
          TableName: 'FileManagementSystem',
          Item: {
            PK: `USER#${userID}`,
            SK: `FILE#${fileID}`,
            EntityType: 'FileShare',
            Permission: permission,
            GSI1PK: `FILE#${fileID}`,
            GSI1SK: `USER#${userID}` // Reverse lookup via GSI1
          }
        }
      },
      // Need to update file metadata to track share count.
      // Note this assumes a FILE#<id> / METADATA item is also maintained --
      // one more copy of file state to keep in sync.
      {
        Update: {
          TableName: 'FileManagementSystem',
          Key: { PK: `FILE#${fileID}`, SK: 'METADATA' },
          UpdateExpression: 'SET SharedCount = if_not_exists(SharedCount, :zero) + :inc',
          ExpressionAttributeValues: { ':inc': 1, ':zero': 0 }
        }
      }
    ]
  }).promise();
}

// Now multiply this complexity across all operations...

Debugging and Troubleshooting: Ad-hoc queries during debugging become impossible:

Customer reports: "I can't find my file"

  • With SQL: SELECT * FROM files WHERE user_id = ? OR file_id IN (SELECT file_id FROM file_shares WHERE user_id = ?)
  • With DynamoDB: write application code to reproduce the issue, or scan the entire table.

Reality Check: The DynamoDB implementation requires 5 Global Secondary Indexes, complex application logic for every query, and still can't efficiently answer many common questions without scanning or multiple round-trips. This is a sign you're using the wrong tool.

PostgreSQL with JSONB: The Best of Both Worlds

If DynamoDB isn't suitable for relational data, what's the alternative? Modern PostgreSQL (especially on AWS via RDS or Aurora) provides the flexibility you might think you need NoSQL for, while maintaining relational integrity where it matters.
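As a concrete illustration, here is a minimal sketch of how the file metadata from the earlier example could combine a relational core with a schemaless JSONB column. The table and column names are assumptions, using the node-postgres client:

// Relational integrity where it matters, JSONB flexibility where it doesn't
await db.query(`
  CREATE TABLE files (
    file_id    UUID PRIMARY KEY,
    folder_id  UUID NOT NULL REFERENCES folders(folder_id),
    file_name  TEXT NOT NULL,
    metadata   JSONB NOT NULL DEFAULT '{}'::jsonb  -- arbitrary per-file attributes
  )
`);

// Ad-hoc query mixing a relational join with a JSONB containment filter:
const { rows } = await db.query(
  `SELECT f.file_name
   FROM files f
   JOIN file_shares s ON s.file_id = f.file_id
   WHERE s.user_id = $1
     AND f.metadata @> '{"status": "active"}'`,
  [userId]
);

A GIN index on the metadata column (CREATE INDEX ... USING GIN (metadata)) keeps containment queries like this fast, so the flexible attributes don't cost you query performance.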

Making the Right Choice: Decision Framework

Database selection is one of the most consequential architectural decisions you'll make. Use this framework to guide your choice between DynamoDB and a relational database like PostgreSQL.

Critical Questions to Ask

Can I define all access patterns upfront?

  • DynamoDB if: Yes, and they won't change significantly. You can model your partition and sort keys to support every query you'll need.

  • PostgreSQL if: No, or you anticipate needing ad-hoc queries for analytics, debugging, or evolving business requirements.

How complex are my data relationships?

  • DynamoDB if: Entities are mostly independent (user profiles, sessions). Relationships are simple parent-child (user → orders).

  • PostgreSQL if: Many-to-many relationships, hierarchical data, or queries requiring joins across 3+ entities.

What's my scale and traffic pattern?

  • DynamoDB if: Extreme scale (billions of items), unpredictable traffic spikes, globally distributed users requiring multi-region active-active.

  • PostgreSQL if: Most applications (Aurora handles millions of rows easily). Predictable or steady traffic growth. Single-region or simple replication needs.

Do I need transactional consistency?

  • DynamoDB if: Eventual consistency is acceptable. Transactions are supported but capped (currently up to 100 items per transaction) and add cost and latency.

  • PostgreSQL if: ACID transactions across multiple entities are critical. Financial data, inventory management, or booking systems.

How will I debug and troubleshoot?

  • DynamoDB if: You have comprehensive logging and tracing. Issues can be reproduced programmatically.

  • PostgreSQL if: You need to investigate issues with ad-hoc SQL queries. Customer support frequently needs to look up data relationships.

Decision Tree

[Figure: DynamoDB vs PostgreSQL decision tree]

Some Red Flags That Indicate PostgreSQL

If you recognize any of these patterns, strongly consider PostgreSQL:

  • You find yourself planning 5+ Global Secondary Indexes
  • Your data model sketch looks like a web of connections
  • You're denormalizing data and worrying about keeping copies in sync
  • You're writing application code to reconstruct relationships from multiple queries
  • Application stakeholders ask for reports/analytics that require scanning or aggregating across relationships
  • You catch yourself saying "we'll just scan for that query"
  • Your queries require pagination across multiple DynamoDB calls

Possible Hybrid Approach: Using Both

You don't have to choose just one database. Many successful architectures use DynamoDB for what it's good at and PostgreSQL for relational data:

  • Example: E-Commerce Platform

    • DynamoDB: User sessions, shopping carts, product view history
    • PostgreSQL: Product catalog, inventory, orders, customer data, analytics
  • Example: SaaS Application

    • DynamoDB: Real-time activity feeds, notifications, feature flags
    • PostgreSQL: Organizations, users, projects, permissions, billing

Common Pitfalls & Troubleshooting

Let's walk through the most common pitfalls and how to troubleshoot them.

Pitfall 1: The "We'll Add GSIs Later" Trap

  • Problem: Starting with a simple table design thinking you can add GSIs as needed.

  • Reality: Each GSI requires projecting attributes. If your initial design didn't include necessary attributes, you need to backfill all existing items. For large tables, this is expensive and time-consuming.

  • Solution:

    • Design ALL access patterns before implementation
    • If you can't define them all, that's a sign DynamoDB isn't right
    • Consider PostgreSQL where schema evolution is straightforward

Pitfall 2: The Pagination Nightmare

  • Problem: Trying to implement offset-based pagination (page 1, 2, 3...) in DynamoDB.

  • Reality: DynamoDB only supports cursor-based pagination (LastEvaluatedKey). To get page 10, you must scan through pages 1-9. For aggregated results from multiple queries, pagination becomes nearly impossible.

// DynamoDB pagination - can't skip pages
let items = [];
let lastKey; // undefined on the first request

do {
  const result = await dynamodb.query({
    TableName: 'Table',
    KeyConditionExpression: 'PK = :pk', // Query always requires a key condition
    ExpressionAttributeValues: { ':pk': partitionKey },
    ExclusiveStartKey: lastKey,
    Limit: 25
  }).promise();

  items.push(...result.Items);
  lastKey = result.LastEvaluatedKey;
} while (lastKey);

// PostgreSQL pagination - can skip to any page (parameterized, not interpolated)
const result = await db.query(
  `SELECT * FROM items
   ORDER BY created_at DESC
   LIMIT 25 OFFSET $1`,
  [(pageNum - 1) * 25]
);

Pitfall 3: Scan Operations in Production

  • Problem: Using Scan operations for anything beyond one-off maintenance tasks.

  • Reality: Scans consume read capacity proportional to table size, not result size. A scan returning 10 items from a 1M item table consumes capacity for all 1M items.

Cost Example: 
Scanning a 100GB DynamoDB table with on-demand billing consumes capacity for every byte read: on the order of a few dollars per full scan (roughly $2-7 depending on read consistency and regional pricing). Run ad-hoc analytics scans ten times a day and that's roughly $600-2,000/month just for scans.
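You can verify the capacity math yourself. A FilterExpression does not reduce consumed capacity, because filtering happens after the items are read. A sketch against the table from the earlier example:

const result = await dynamodb.scan({
  TableName: 'FileManagementSystem',
  FilterExpression: 'EntityType = :t',
  ExpressionAttributeValues: { ':t': 'File' },
  ReturnConsumedCapacity: 'TOTAL'
}).promise();

// Capacity is proportional to the data scanned, not the items returned
console.log(result.ConsumedCapacity);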

Pitfall 4: Hot Partition Keys

  • Problem: Using a low-cardinality attribute as partition key (e.g., status field with values "active"/"inactive").

  • Reality: DynamoDB partitions data by partition key. If 90% of items have the same partition key value, 90% of traffic goes to one partition, causing throttling despite provisioned capacity.
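If you're stuck with a skewed key, a common mitigation is write sharding: spread hot values across N synthetic partitions with a suffix. A minimal sketch, where the shard count, table, and index names are illustrative:

const SHARD_COUNT = 10; // tune to your write throughput

// Writes: append a random shard suffix, e.g. 'active#7'
function shardedStatusKey(status) {
  const shard = Math.floor(Math.random() * SHARD_COUNT);
  return `${status}#${shard}`;
}

// Reads: must now fan out across all shards and merge client-side
async function queryAllShards(status) {
  const queries = Array.from({ length: SHARD_COUNT }, (_, shard) =>
    dynamodb.query({
      TableName: 'Items',
      IndexName: 'StatusIndex', // illustrative GSI on the sharded key
      KeyConditionExpression: 'StatusShard = :s',
      ExpressionAttributeValues: { ':s': `${status}#${shard}` }
    }).promise()
  );
  const results = await Promise.all(queries);
  return results.flatMap(r => r.Items);
}

Note that even the mitigation adds application complexity (every read now fans out), which is one more reason to confirm DynamoDB is the right fit before committing.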

Cost Optimization Considerations

One of the biggest myths is that DynamoDB is always cheaper than a relational database. The reality depends entirely on your usage and access patterns.

Hidden Costs to Consider

DynamoDB Hidden Costs

  • Developer time maintaining complex access patterns
  • Data transfer costs for Lambda processing DynamoDB Streams
  • Separate analytics pipeline (Athena, Redshift) for reporting
  • Backup costs (continuous backup or on-demand snapshots)
  • Global table replication costs if multi-region needed

RDS/Aurora Hidden Costs

  • Backup storage (usually minimal with automated retention)
  • Read replicas if needed (Aurora supports 15, RDS supports 5)
  • Performance Insights (optional monitoring)
  • IOPS costs if using io1/io2 volumes (gp3 is usually sufficient)

Cost Optimization Tips

For DynamoDB (If Appropriate)

  • Use on-demand pricing for unpredictable workloads, provisioned for steady traffic
  • Enable TTL to automatically delete expired items (free deletions)
  • Minimize GSIs—each GSI effectively doubles your write costs
  • Use DynamoDB Accelerator (DAX) for read-heavy workloads instead of over-provisioning
  • Archive historical data to S3 instead of keeping in DynamoDB

For RDS/Aurora

  • Start with Aurora Serverless v2 for variable workloads (scales to zero)
  • Use Reserved Instances for predictable workloads (40-60% savings); note that Compute Savings Plans don't cover RDS
  • Monitor IOPS usage—gp3 volumes offer 3000 IOPS baseline at no extra cost
  • Use Aurora Global Database only if truly multi-region (local read replicas are cheaper)
  • Set appropriate backup retention (the 7-day console default is often sufficient)

Summary

Database selection is one of the most consequential architectural decisions you will make. Ideally, choose based on your data model and access patterns, not database popularity or perceived scalability benefits.

  • DynamoDB shines for key-value access patterns with predictable queries known upfront. If your queries fit this model, DynamoDB offers unmatched performance and scale.
  • PostgreSQL (RDS/Aurora) is optimal for complex relationships, flexible querying, and evolving schemas. Modern PostgreSQL with JSONB provides NoSQL flexibility where needed.
  • Hybrid approaches work well. Use DynamoDB for sessions, carts, and real-time feeds; use PostgreSQL for core business data with relationships.
  • The cost of choosing the wrong database compounds over time. Developer productivity, operational complexity, and eventual migration costs far exceed any initial performance gains.
  • When in doubt, choose PostgreSQL (or another SQL database). It's easier to optimize a relational database than to retrofit relational capabilities onto NoSQL.

The AWS database landscape continues to evolve. New offerings like Aurora DSQL aim to bridge some gaps between distributed scale and relational semantics. However, understanding fundamental database paradigms remains essential. No technology can eliminate the need for thoughtful data modeling.

The emergence of Aurora Serverless v2 also demonstrates that "relational databases don't scale" is increasingly a myth. Modern managed relational databases can handle massive scale while maintaining the developer experience and query flexibility that makes applications maintainable.

Before your next database decision, take these steps:

  1. Map out your application's data relationships on paper or a whiteboard
  2. List your top 10 most common query patterns
  3. Identify whether those queries involve joins, aggregations, or hierarchies
  4. If you can't clearly define all access patterns OR they require complex relationships, choose PostgreSQL
  5. If your patterns are simple, predictable, and well-defined, consider DynamoDB

Remember "Months of coding can save hours of planning." — Unknown (but deeply relevant to database selection)
