AWS Lambda in Production: Patterns for Serverless APIs

When Serverless Makes Sense

AWS Lambda is a strong fit for workloads with highly variable traffic: event processing, webhook handlers, scheduled jobs, and APIs with traffic spikes separated by quiet periods. You pay only for execution time, scaling is automatic, and there are no servers to provision or patch. It is a poor fit for long-running processes (a single invocation is capped at 15 minutes), applications with consistently high traffic (EC2 is cheaper above a certain request rate), or workloads requiring persistent connections.
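To make the "cheaper above a certain request rate" point concrete, the following is a rough sketch of a break-even calculation. It assumes Lambda cost is the sum of a duration charge and a request charge and ignores the free tier; all function and field names are illustrative, and every price is an input you would take from the current AWS pricing page.

```typescript
// All prices are caller-supplied inputs; nothing here is a quoted AWS price.
interface LambdaPricing {
  pricePerGbSecond: number        // compute price per GB-second
  pricePerMillionRequests: number // request price per 1M invocations
}

interface Workload {
  avgDurationMs: number // average billed duration per request
  memoryGb: number      // configured function memory in GB
}

const SECONDS_PER_MONTH = 60 * 60 * 24 * 30

// Cost of a single invocation: duration charge plus request charge.
export function costPerRequest(w: Workload, p: LambdaPricing): number {
  const computeCost = (w.avgDurationMs / 1000) * w.memoryGb * p.pricePerGbSecond
  const requestCost = p.pricePerMillionRequests / 1_000_000
  return computeCost + requestCost
}

// Monthly Lambda bill at a sustained request rate (free tier ignored).
export function monthlyLambdaCost(rps: number, w: Workload, p: LambdaPricing): number {
  return rps * SECONDS_PER_MONTH * costPerRequest(w, p)
}

// Sustained requests per second above which a fixed-price alternative is cheaper.
export function breakEvenRps(fixedMonthlyCost: number, w: Workload, p: LambdaPricing): number {
  return fixedMonthlyCost / (SECONDS_PER_MONTH * costPerRequest(w, p))
}
```

Pass the monthly cost of the instances you would otherwise run as `fixedMonthlyCost`. If your sustained traffic sits well above the returned rate, the fixed-price option is likely cheaper; if traffic is bursty and averages far below it, Lambda likely wins.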

Function Structure for Production

Move all initialisation code (database connections, SDK clients, loaded configuration) outside the handler function. Lambda reuses the execution environment across multiple invocations — code outside the handler runs once on cold start, not on every invocation.

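import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda' // types from @types/aws-lambda
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb'

// Initialised once per execution environment, reused across invocations
const client = new DynamoDBClient({ region: 'us-east-1' })
const db = DynamoDBDocumentClient.from(client)

export const handler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  try {
    const { userId } = JSON.parse(event.body ?? '{}')
    // DynamoDBDocumentClient has no get() method; send a GetCommand instead
    const user = await db.send(new GetCommand({ TableName: 'Users', Key: { pk: userId } }))
    return {
      statusCode: 200,
      // Item is undefined when the key is not found; fall back to null so body stays a string
      body: JSON.stringify(user.Item ?? null),
      headers: { 'Content-Type': 'application/json' },
    }
  } catch (error) {
    // Error objects serialise to {} under JSON.stringify, so log the fields explicitly
    const message = error instanceof Error ? error.message : String(error)
    console.error(JSON.stringify({ message, requestId: event.requestContext?.requestId }))
    return { statusCode: 500, body: JSON.stringify({ error: 'Internal error' }) }
  }
}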

Cold Start Optimisation

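Cold starts (spinning up a new execution environment) typically add 200ms–2s of latency to the first invocation, depending on runtime, bundle size, and how much initialisation work the function does. Minimise them with these techniques:

Reduce bundle size: Use esbuild to tree-shake and minify. As a rough rule of thumb, each MB of bundle adds ~10ms of cold start time. Target under 5MB.
Use ES modules: Native ESM on the Node.js 20 Lambda runtime can load faster than CommonJS, particularly when the bundle has been tree-shaken.
Lazy-load heavy dependencies: Import large libraries inside the handler on first use, not at module level, so requests that never need them do not pay for loading them.
Provisioned concurrency: Keep a pool of pre-initialised instances for latency-sensitive endpoints. Requests served from the pool avoid cold starts, at a fixed cost; traffic beyond the pool size can still hit a cold start.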

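# SAM template (AWS::Serverless::Function)
Properties:
  FunctionName: critical-api
  AutoPublishAlias: live  # provisioned concurrency attaches to an alias or version, not $LATEST
  ProvisionedConcurrencyConfig:
    ProvisionedConcurrentExecutions: 5  # 5 always-warm instances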
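The lazy-loading technique from the list above can be sketched as follows. This is a minimal illustration, not a prescribed pattern: the `lazy` helper, the `compress` flag, and the event shape are all hypothetical, and `node:zlib` merely stands in for a heavy third-party dependency so that the example runs without extra packages.

```typescript
// Wrap a loader so it runs at most once per execution environment.
// If loading fails, the cache is cleared so a later invocation can retry.
export function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined
  return () => {
    if (cached === undefined) {
      cached = load().catch((err) => {
        cached = undefined
        throw err
      })
    }
    return cached
  }
}

// Stand-in for a heavy dependency; substitute your own large library.
const loadZlib = lazy(() => import('node:zlib'))

interface CompressEvent {
  compress?: boolean // hypothetical flag: only some requests need the heavy path
  payload: string
}

interface Result {
  statusCode: number
  body: string
  isBase64Encoded?: boolean
}

export const handler = async (event: CompressEvent): Promise<Result> => {
  if (!event.compress) {
    // Fast path: the heavy module is never loaded
    return { statusCode: 200, body: event.payload }
  }
  const zlib = await loadZlib() // loaded on first use, cached afterwards
  const compressed = zlib.gzipSync(event.payload).toString('base64')
  return { statusCode: 200, body: compressed, isBase64Encoded: true }
}
```

The trade-off is that the first request taking the heavy path pays the load cost instead of the cold start, so this helps most when only a minority of requests need the dependency.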

Lambda Layers for Shared Dependencies

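Lambda Layers let multiple functions share common dependencies (the AWS SDK, your internal libraries) without bundling them into each function. This reduces deployment package size and lets you publish a shared library once instead of rebuilding every function's bundle. Layer versions are immutable and each function pins a specific version, so picking up a new version still requires updating the function's configuration.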

# Create a layer with shared dependencies
mkdir -p layer/nodejs
cd layer/nodejs && npm init -y && npm install @anthropic-ai/sdk axios
# The zip root must contain nodejs/node_modules for the Node.js runtime to find the packages
cd .. && zip -r ../shared-layer.zip nodejs

cd .. && aws lambda publish-layer-version \
  --layer-name shared-deps \
  --zip-file fileb://shared-layer.zip \
  --compatible-runtimes nodejs20.x

# Reference in your function configuration
Layers:
  - !Sub arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:layer:shared-deps:3

Error Handling and Dead Letter Queues

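For asynchronous invocations (SNS, S3, or EventBridge triggers), Lambda retries failed invocations automatically, twice by default. Configure a Dead Letter Queue to capture events that still fail after all retries, so no data is silently lost. SQS triggers work differently: Lambda polls the queue, failed messages return to it, and the dead-letter queue is configured as a redrive policy on the source queue rather than on the function.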

Properties:
  FunctionName: event-processor
  DeadLetterConfig:
    TargetArn: !GetAtt DLQ.Arn
  ReservedConcurrentExecutions: 10  # prevent runaway scaling

Resources:
  DLQ:
    Type: AWS::SQS::Queue
    Properties:
      MessageRetentionPeriod: 1209600  # 14 days

Set up a CloudWatch alarm on the DLQ's ApproximateNumberOfMessagesVisible metric (threshold > 0) to get paged when events start failing.
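Retries and the Dead Letter Queue only come into play when the invocation actually fails, so an asynchronous handler has to let errors propagate rather than catch them and return a 500-style response as an API handler would. A minimal sketch, assuming an SNS trigger; `processOrder` and the message shape are hypothetical, and the event interface is a cut-down local stand-in for the `SNSEvent` type from `@types/aws-lambda`:

```typescript
// Minimal local shape of an SNS event (only the fields used here).
interface SnsEventLike {
  Records: { Sns: { MessageId: string; Message: string } }[]
}

// Hypothetical business logic: stands in for whatever your processor does.
async function processOrder(order: { orderId?: string }): Promise<void> {
  if (!order.orderId) {
    throw new Error('orderId is required')
  }
  // ... write to a database, call a downstream service, etc.
}

export const handler = async (event: SnsEventLike): Promise<void> => {
  for (const record of event.Records) {
    try {
      await processOrder(JSON.parse(record.Sns.Message))
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error)
      console.error(JSON.stringify({ message, messageId: record.Sns.MessageId }))
      // Rethrow so the invocation is marked as failed and becomes eligible
      // for Lambda's retries and, after those, the Dead Letter Queue
      throw error
    }
  }
}
```

Because a retried event may be processed more than once, the work inside `processOrder` should be idempotent.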

Cost Control

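Set ReservedConcurrentExecutions on functions that could scale out unexpectedly, to prevent one runaway function from consuming your entire Lambda concurrency quota. Reserved concurrency is subtracted from the account's shared pool, so budget the values rather than setting large limits everywhere.
Use ARM64 (Graviton) architecture: comparable performance to x86 for most workloads, and about 20% cheaper per GB-second.
Right-size memory: Lambda allocates CPU proportionally to memory. Run the AWS Lambda Power Tuning tool to find the memory setting that minimises cost for your function's execution pattern.
Use SQS batching for high-throughput event processing: processing 10 records per invocation means 10x fewer invocations than 1 record per invocation, which cuts request charges and per-invocation overhead. Duration charges still scale with the work done per record.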

EventSourceMapping:
  EventSourceArn: !GetAtt Queue.Arn
  BatchSize: 10
  MaximumBatchingWindowInSeconds: 5
  FunctionResponseTypes:
    - ReportBatchItemFailures  # partial batch success — don't retry successful records
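With ReportBatchItemFailures enabled, the handler has to tell Lambda which records failed by returning their message IDs in a `batchItemFailures` array; records not listed are treated as processed and deleted from the queue. The following is a sketch under stated assumptions: `processRecord` and the payload shape are hypothetical, and the interfaces are cut-down local stand-ins for `SQSEvent` and `SQSBatchResponse` from `@types/aws-lambda`.

```typescript
// Minimal local shapes (only the fields used here).
interface SqsRecordLike {
  messageId: string
  body: string
}
interface SqsEventLike {
  Records: SqsRecordLike[]
}
interface BatchResponse {
  batchItemFailures: { itemIdentifier: string }[]
}

// Hypothetical per-record work; throws on malformed or incomplete payloads.
async function processRecord(record: SqsRecordLike): Promise<void> {
  const payload = JSON.parse(record.body)
  if (typeof payload?.id !== 'string') {
    throw new Error('missing id')
  }
  // ... do the real work here
}

export const handler = async (event: SqsEventLike): Promise<BatchResponse> => {
  const batchItemFailures: { itemIdentifier: string }[] = []
  for (const record of event.Records) {
    try {
      await processRecord(record)
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error)
      console.error(JSON.stringify({ message, messageId: record.messageId }))
      // Only this record is returned to the queue for another attempt
      batchItemFailures.push({ itemIdentifier: record.messageId })
    }
  }
  return { batchItemFailures }
}
```

Returning an empty `batchItemFailures` array marks the whole batch as successful. Throwing from the handler instead causes the entire batch to be retried, which is what partial batch responses are meant to avoid.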
