Serverless Microservices Architecture

Peak RPS handled: ~500 (degrading) → 5,000+ (stable)

A monolithic backend on EC2 was struggling with burst traffic — during peak hours it fell over, and during off-hours it sat idle burning money on unused capacity.

AWS · Serverless · Backend

What Was Broken

  • EC2 instances running at <10% utilization 80% of the time
  • Burst traffic caused 503s and timeout errors for users
  • Deployment required SSH-based manual steps — no CI/CD
  • Monthly infrastructure costs disproportionate to actual load

The Fix

  • Replace EC2 with Lambda + API Gateway for on-demand scaling
  • Implement DynamoDB for sub-millisecond data access
  • Add SQS for async processing to smooth out burst peaks
  • Deploy via Serverless Framework for unified infrastructure-as-code

How It Was Built

Decomposed the monolith into purpose-built Lambda functions, wired API Gateway routing, implemented DLQ for failed events, and optimized cold starts for critical paths.

Event-Driven Architecture with SQS

Heavy async operations moved to SQS-triggered Lambdas. API stays responsive — requests are accepted immediately, processing happens asynchronously. Dead-letter queues catch and preserve failed events.

serverless.yml

```yaml
functions:
  processOrder:
    handler: src/orders/process.handler
    events:
      - sqs:
          arn: !GetAtt OrderQueue.Arn
          batchSize: 10
          functionResponseType: ReportBatchItemFailures

resources:
  Resources:
    OrderQueue:
      Type: AWS::SQS::Queue
      Properties:
        RedrivePolicy:
          deadLetterTargetArn: !GetAtt OrderDLQ.Arn
          maxReceiveCount: 3
    OrderDLQ:
      Type: AWS::SQS::Queue
```
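A consumer matching this config might look like the sketch below (the `processOrder` body is illustrative). With `ReportBatchItemFailures`, the handler returns only the message IDs that failed, so successfully processed messages in the same batch are not redelivered; failed ones retry up to `maxReceiveCount` times before landing in the DLQ.

```typescript
// Minimal local types mirroring the aws-lambda package's SQS shapes,
// so the sketch is self-contained.
interface SQSRecord { messageId: string; body: string; }
interface SQSEvent { Records: SQSRecord[]; }
interface SQSBatchResponse { batchItemFailures: { itemIdentifier: string }[]; }

// Illustrative work function; throws on malformed payloads.
async function processOrder(body: string): Promise<void> {
  const order = JSON.parse(body);
  // ... fulfil the order (writes, notifications, etc.)
}

export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of event.Records) {
    try {
      await processOrder(record.body);
    } catch {
      // Only failed messages return to the queue; after maxReceiveCount
      // retries they move to the dead-letter queue defined above.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
};
```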

Cold Start Optimization

Profiled cold start times on critical auth paths — 800ms was unacceptable. Optimized package sizes with webpack tree-shaking and added provisioned concurrency for the top three critical functions.

What Changed

Scaled seamlessly from 0 to 5,000+ RPS during peak. Monthly costs down 65% vs EC2. Zero 503s during traffic spikes.

  • Peak RPS handled: ~500 (degrading) → 5,000+ (10× capacity)
  • Monthly infra cost: baseline → pay-per-use (65% reduction)
  • Off-hours idle cost: 100% of EC2 rate → scale to zero
"The system now costs nothing when idle and handles any spike without pre-planned capacity. Operational overhead dropped to near zero."

Common Questions

Why serverless instead of just scaling EC2?
The traffic was highly bursty. Scaling EC2 takes time (even with ASGs), and we were paying for idle capacity during off-hours. Serverless gave us near-instant scaling and scale-to-zero pricing.

What about database connections under high Lambda concurrency?
We migrated to DynamoDB, which is designed for connectionless, high-throughput serverless environments. We avoided traditional RDS connection pooling issues entirely.

How did you manage cold starts?
We aggressively tree-shook our Node.js bundles using Webpack, reducing deployment package size. For the most critical user-facing endpoints, we enabled Provisioned Concurrency to keep instances warm.