Web Development | 9 min read | September 5, 2025

Integrating Claude API into Your Next.js Application

Step-by-step guide to integrating the Anthropic Claude API into a Next.js 15 app. Covers streaming, tool use, error handling, and rate limit management in production.

Claude API, Next.js, Anthropic, AI Integration, TypeScript

Azam

DevOps & AI Consultant

Why Claude for Your Next.js App?

Anthropic's Claude API offers some distinct advantages over OpenAI for web applications: a 200k-token context window, strong instruction-following, refusals that are stated plainly rather than disguised as answers, and competitive pricing on Claude 3.5 Haiku for latency-sensitive use cases. The Anthropic TypeScript SDK is well-typed and straightforward to integrate into a Next.js 15 App Router project.

This guide covers everything from initial setup through production-ready streaming, tool use, and error handling.

Setup and Installation

npm install @anthropic-ai/sdk

# .env.local
ANTHROPIC_API_KEY=sk-ant-...

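Never expose your API key to the client. All Claude API calls must go through Next.js Route Handlers or Server Actions, which run only on the server. The SDK reads ANTHROPIC_API_KEY from the environment by default, so new Anthropic() needs no arguments there.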
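One way to make the server-only rule harder to break by accident is a small guard around secret lookup. The sketch below is a hypothetical helper (requireServerSecret is not part of Next.js or the Anthropic SDK). It relies on the Next.js convention that only variables prefixed with NEXT_PUBLIC_ are inlined into the browser bundle, and it takes the environment as a parameter so it can be exercised without a running app.

```typescript
// Hypothetical helper, not part of Next.js or the Anthropic SDK.
type Env = Record<string, string | undefined>

function requireServerSecret(name: string, env: Env): string {
  // Next.js inlines NEXT_PUBLIC_* variables into client bundles,
  // so a secret must never use that prefix.
  if (name.startsWith('NEXT_PUBLIC_')) {
    throw new Error(`${name} would be exposed to the browser; use an unprefixed name`)
  }
  const value = env[name]
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`)
  }
  return value
}

// Usage inside a Route Handler or Server Action (assumed wiring):
// const apiKey = requireServerSecret('ANTHROPIC_API_KEY', process.env)
// const client = new Anthropic({ apiKey })
```

Failing fast at startup with a clear message tends to be easier to debug than an authentication error from the first API call.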

Basic API Route with Streaming

For chat applications, use an App Router Route Handler that returns a streaming response. The key is to feed events from the Anthropic SDK's streaming API into a ReadableStream.

// app/api/chat/route.ts
import Anthropic from '@anthropic-ai/sdk'

// Reads ANTHROPIC_API_KEY from the environment by default
const client = new Anthropic()
const encoder = new TextEncoder()

export async function POST(request: Request) {
  const { messages } = await request.json()

  // messages.stream() returns a stream object that can be iterated for events
  const stream = client.messages.stream({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    system: 'You are a helpful assistant.',
    messages,
  })

  return new Response(
    new ReadableStream({
      async start(controller) {
        for await (const chunk of stream) {
          // Forward only text deltas; other event types carry metadata
          if (
            chunk.type === 'content_block_delta' &&
            chunk.delta.type === 'text_delta'
          ) {
            controller.enqueue(encoder.encode(chunk.delta.text))
          }
        }
        controller.close()
      },
      // Stop the upstream request if the client disconnects
      cancel() {
        stream.abort()
      },
    }),
    { headers: { 'Content-Type': 'text/plain; charset=utf-8' } }
  )
}
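The route above sends plain text chunks, so the browser side needs to read the response body incrementally. Here is a minimal sketch of that reader, assuming a hypothetical helper named readTextStream and a React-style setReply state setter in the usage comment; neither comes from the SDK or Next.js.

```typescript
// Hypothetical client-side helper; the name and callback shape are illustrative.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onText: (text: string) => void
): Promise<string> {
  const reader = body.getReader()
  const decoder = new TextDecoder()
  let full = ''
  try {
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      // stream: true keeps multi-byte characters intact across chunk boundaries
      const text = decoder.decode(value, { stream: true })
      if (text) {
        full += text
        onText(text)
      }
    }
    // Flush any bytes the decoder was still holding
    const tail = decoder.decode()
    if (tail) {
      full += tail
      onText(tail)
    }
  } finally {
    reader.releaseLock()
  }
  return full
}

// Usage in a client component (assumes the /api/chat route above;
// setReply is a hypothetical state setter):
// const res = await fetch('/api/chat', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({ messages }),
// })
// if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`)
// await readTextStream(res.body, (text) => setReply((prev) => prev + text))
```

Decoding with a single TextDecoder in streaming mode matters because a chunk boundary can fall in the middle of a multi-byte UTF-8 character.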

Tool Use (Function Calling)

Claude's tool use API lets the model call functions you define. Define tools with a JSON Schema for their input, then handle the tool call in your route handler.

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [
    {
      name: 'search_products',
      description: 'Search the product catalog by query',
      input_schema: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'Search query' },
          category: { type: 'string', enum: ['electronics', 'clothing', 'books'] }
        },
        required: ['query']
      }
    }
  ],
  messages: [{ role: 'user', content: 'Find me a laptop under $1000' }]
})

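if (response.stop_reason === 'tool_use') {
  // The type guard narrows the block so toolUse.id and toolUse.input are accessible
  const toolUse = response.content.find((b): b is Anthropic.ToolUseBlock => b.type === 'tool_use')
  if (toolUse) {
    // searchProducts is your own function; input is typed as unknown, so validate it before use
    const results = await searchProducts(toolUse.input as { query: string; category?: string })
    // Continue conversation with tool result
  }
}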
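The "continue conversation" step means sending the model's tool call back as an assistant turn, followed by a user turn containing a tool_result block that references the call's id. The sketch below builds that message list with local types that mirror the Messages API block shapes; appendToolResult is a hypothetical helper, and a real project would use the SDK's own types rather than these stand-ins.

```typescript
// Local stand-ins for the SDK's block types, to keep the sketch self-contained.
type TextBlock = { type: 'text'; text: string }
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown }
type ToolResultBlock = { type: 'tool_result'; tool_use_id: string; content: string }
type AssistantBlock = TextBlock | ToolUseBlock
type ChatMessage = {
  role: 'user' | 'assistant'
  content: string | Array<AssistantBlock | ToolResultBlock>
}

// Hypothetical helper: returns a new history ready for the follow-up request.
function appendToolResult(
  history: ChatMessage[],
  assistantContent: AssistantBlock[],
  result: unknown
): ChatMessage[] {
  const toolUse = assistantContent.find(
    (b): b is ToolUseBlock => b.type === 'tool_use'
  )
  if (!toolUse) {
    throw new Error('No tool_use block in the assistant response')
  }
  const assistantTurn: ChatMessage = { role: 'assistant', content: assistantContent }
  const toolResultTurn: ChatMessage = {
    role: 'user',
    content: [
      {
        type: 'tool_result',
        tool_use_id: toolUse.id, // must match the id of the tool_use block
        content: JSON.stringify(result),
      },
    ],
  }
  return [...history, assistantTurn, toolResultTurn]
}

// Usage (assumed wiring): pass the returned array as `messages` in a second
// client.messages.create() call, with the same `tools` definition.
```

The second request lets the model turn the raw search results into a natural-language answer for the user.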

Error Handling and Rate Limits

Wrap all API calls in try/catch and handle Anthropic-specific error types. Implement exponential backoff for rate limit errors.

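import Anthropic from '@anthropic-ai/sdk'

// The SDK also retries some failures itself (see its maxRetries option),
// so this loop adds to those retries rather than replacing them.
const client = new Anthropic()

export async function callClaude(messages: Anthropic.MessageParam[]) {
  const maxRetries = 3
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.messages.create({ ... })
    } catch (error) {
      const isLastAttempt = attempt === maxRetries - 1
      if (error instanceof Anthropic.RateLimitError && !isLastAttempt) {
        // Exponential backoff: wait 1s, then 2s, before trying again
        const delay = Math.pow(2, attempt) * 1000
        await new Promise(r => setTimeout(r, delay))
        continue
      }
      if (error instanceof Anthropic.APIError) {
        throw new Error(`Claude API error: ${error.status} ${error.message}`)
      }
      throw error
    }
  }
  // Not reached: the final attempt either returns or throws
  throw new Error('Claude API retries exhausted')
}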
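Plain exponential backoff can cause many clients to retry at the same instant, and the delay grows without bound as attempts increase. A common refinement is to cap the delay and add random jitter. The sketch below is one such variant ("equal jitter"), offered as an optional swap for the Math.pow line in the retry loop; the function name, defaults, and injectable random source are assumptions for illustration.

```typescript
// Hypothetical helper: capped exponential backoff with "equal jitter".
// Half of the delay is fixed and half is random, so retries spread out
// without ever dropping close to zero.
function retryDelayMs(
  attempt: number,
  baseMs: number = 1000,
  maxMs: number = 30000,
  random: () => number = Math.random // injectable for deterministic tests
): number {
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt))
  const half = exponential / 2
  return Math.floor(half + random() * half)
}

// Usage inside a retry loop:
// await new Promise((r) => setTimeout(r, retryDelayMs(attempt)))
```

With the defaults, attempt 0 waits between 500 ms and 1 s, attempt 1 between 1 s and 2 s, and any attempt whose uncapped delay exceeds 30 s waits between 15 s and 30 s.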

Production Best Practices

  • Use claude-3-5-haiku-20241022 for latency-sensitive features (chat, autocomplete)
  • Use claude-3-5-sonnet-20241022 for quality-sensitive tasks (analysis, generation)
  • Set max_tokens conservatively: you pay for output tokens, and runaway generation is expensive
  • Cache long, repeated system prompts with Anthropic's prompt caching feature; cached input tokens are billed at a reduced rate (Anthropic cites up to 90% cost reduction on the cached portion)
  • Log every request with token counts and response time for cost tracking
  • Implement a per-user rate limit in your application layer before hitting the Anthropic API
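For the per-user rate limit in the last bullet, a fixed-window counter is about the simplest approach that works. The sketch below is an in-memory version with an injectable clock; the class name and limits are assumptions, and because the state lives in one process, a serverless or multi-instance deployment would need a shared store instead.

```typescript
// Hypothetical in-memory limiter: at most `limit` requests per user per window.
// State is per-process, so this only suits a single long-running server.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number
  private readonly now: () => number

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true if the request may proceed, false if the user is over the limit.
  allow(userId: string): boolean {
    const t = this.now()
    let current = this.windows.get(userId)
    if (!current || t - current.start >= this.windowMs) {
      // Start a fresh window for this user
      current = { start: t, count: 0 }
      this.windows.set(userId, current)
    }
    if (current.count >= this.limit) return false
    current.count += 1
    return true
  }
}

// Usage in a Route Handler, before calling the Anthropic API (assumed wiring):
// const limiter = new FixedWindowLimiter(20, 60_000) // 20 requests per minute
// if (!limiter.allow(userId)) {
//   return new Response('Too many requests', { status: 429 })
// }
```

Rejecting the request before it reaches the Anthropic API keeps one heavy user from exhausting the rate limit shared by everyone else.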

Claude integrates cleanly into Next.js 15's server-first architecture. Server Actions and Route Handlers keep your API key secure while enabling streaming UIs that feel instant even on longer responses.
