Integration tests

Status: 🟩 COMPLETE
Last updated: 2026-06-19
Plain-English tagline: Tests that wire two or more pieces together (a function plus a real database, an API endpoint plus its request validation, a React component plus its data fetching) and check they cooperate correctly, not just that each works alone.

In plain English

A unit test verifies one function in isolation. A full E2E test drives the whole app from a real browser. An integration test sits in the middle: it exercises a small SLICE of the system end-to-end, with real (or realistic) dependencies, without spinning up the entire app.

Concrete examples:

A test that POSTs to your /api/posts endpoint with real JSON and checks the database actually has the new row
A test that calls a server action and verifies it sets the right cookie + writes the right row
A test that renders a React component that loads data — including its real useEffect + fetch — and verifies what shows up
A test that invokes a function which calls another function which calls the database — all together — to verify they wire up correctly

In each case, multiple pieces talk to each other. A unit test can’t catch wiring bugs because unit tests mock the wiring away. An E2E test catches them but is expensive. Integration tests are the sweet spot.

Why “integration” matters: most real-world bugs aren’t in individual functions; they’re in HOW functions are connected. Wrong types passed across a boundary. Mismatched assumptions about what null means. Forgetting to update a related record. A library upgrade that changed a return shape. These bugs are invisible to isolated unit tests.
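Here is a minimal sketch of a wiring bug of the "what does null mean" kind. The names (findNickname, greeting, greetUser) are hypothetical and not from any real codebase. Each function is reasonable on its own; the bug only appears when they are connected, because one side uses null for "no value" while the other only guards against undefined.

```typescript
// Hypothetical lookup: returns null when the user has no nickname.
const nicknames: Record<string, string | null> = { u1: "Ace", u2: null };

function findNickname(userId: string): string | null {
  return nicknames[userId] ?? null;
}

// Hypothetical formatter written against a different assumption:
// "missing" means undefined, and a default parameter only covers undefined.
function greeting(nickname: string | null | undefined = "friend"): string {
  return `Hello, ${nickname}!`;
}

// Fixed wiring: convert null to undefined at the boundary.
function greetUser(userId: string): string {
  return greeting(findNickname(userId) ?? undefined);
}

// Unit-level checks pass for each piece in isolation:
console.log(greeting("Ace"));     // Hello, Ace!
console.log(greeting(undefined)); // Hello, friend!

// Wiring them together naively exposes the mismatch:
console.log(greeting(findNickname("u2"))); // Hello, null!

// An integration-level check exercises the real connection:
console.log(greetUser("u2")); // Hello, friend!
```

A unit test that mocks findNickname to return undefined would never see the "Hello, null!" output. Only a test that runs both functions together does.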

The name “integration test” is fuzzy — different teams draw the line differently. Some call any test that touches a database “integration.” Some reserve the term for cross-service tests. Pragmatically: it’s anything broader than a unit test and narrower than a full E2E.

Why it matters

Three concrete payoffs:

They catch wiring bugs. A unit test mocks the database; the integration test uses a real one. Mismatch in expected schema, types, ordering — all caught here.
They verify contracts at boundaries. Your API handler validates input with Zod, calls the database, returns JSON. An integration test exercises that whole pipeline. If you change Zod and forget to update the handler, the test fails.
They’re cheaper than E2E. No browser. No deploys. No real user simulation. Often run in 100ms-2s per test — slower than unit tests but vastly faster than Playwright spinning up Chrome.

The trade-off: integration tests are more expensive than unit tests (set up state, real connections, more code). They can be slower and flakier (real services have real moods). When a slice gets large enough, E2E tests start making sense instead.

Kent C. Dodds’ “Testing Trophy” argues integration tests should be the LARGEST tier, not the middle one — because they catch the most real bugs per unit of effort. Whether you adopt that view fully or stick with the pyramid, integration tests deserve serious investment in any non-trivial codebase.

A concrete example: testing an API route

Suppose you have a Next.js API route that creates posts:

// app/api/posts/route.ts
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { db } from "@/lib/db";
 
const CreatePostSchema = z.object({
  title: z.string().min(1),
  body: z.string().min(1),
});
 
export async function POST(req: NextRequest) {
  const parsed = CreatePostSchema.safeParse(await req.json());
  if (!parsed.success) {
    return NextResponse.json(
      { error: parsed.error.issues[0]?.message },
      { status: 400 }
    );
  }
 
  const post = await db.posts.create({ data: parsed.data });
  return NextResponse.json(post, { status: 201 });
}

A unit test for this would mock db.posts.create. That’s fine for verifying input validation, but says nothing about whether the database call works.

An integration test uses the real database (or a realistic one):

// app/api/posts/route.test.ts
import { describe, it, expect, beforeEach } from "vitest";
import { POST } from "./route";
import { db } from "@/lib/db";
 
describe("POST /api/posts", () => {
  beforeEach(async () => {
    await db.posts.deleteMany();  // Reset state
  });
 
  it("creates a post given valid input", async () => {
    const req = new Request("http://localhost/api/posts", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ title: "Hello", body: "World" }),
    });
 
    const res = await POST(req as any);
    expect(res.status).toBe(201);
 
    const body = await res.json();
    expect(body).toMatchObject({ title: "Hello", body: "World" });
    expect(body.id).toBeDefined();
 
    const stored = await db.posts.findUnique({ where: { id: body.id } });
    expect(stored?.title).toBe("Hello");
  });
 
  it("rejects empty titles with a 400", async () => {
    const req = new Request("http://localhost/api/posts", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ title: "", body: "World" }),
    });
 
    const res = await POST(req as any);
    expect(res.status).toBe(400);
 
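    const body = await res.json();
    // Zod's default message for min(1) does not name the field,
    // so assert that an error message came back rather than matching on "title".
    expect(body.error).toEqual(expect.any(String));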
  });
});

This test verifies the whole stack: Zod validation, database call, response shape, status codes. If ANY part of that pipeline breaks, this single test catches it.
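Both tests above build nearly the same Request by hand. A small helper keeps that wiring in one place. This is only a sketch: jsonPost is a hypothetical name, and it relies on nothing but the standard Request class (global in Node 18+), so it is not tied to Next.js.

```typescript
// Hypothetical test helper: builds a JSON POST request for a route handler.
// Uses the standard Fetch API Request class, available globally in Node 18+.
function jsonPost(path: string, payload: unknown): Request {
  return new Request(new URL(path, "http://localhost").toString(), {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  });
}

const req = jsonPost("/api/posts", { title: "Hello", body: "World" });
console.log(req.method, req.url); // POST http://localhost/api/posts
```

In the tests above, POST(jsonPost("/api/posts", { title: "", body: "World" }) as any) would replace the inline construction, leaving each test with just the payload that makes it different.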

What database to use in integration tests

Three patterns:

1. A real production-shaped database, per test

Spin up a real Postgres (Docker, Testcontainers, ephemeral Supabase project). Each test resets to a known state. Slowest but highest-fidelity. The integration tests reflect reality.

2. A shared dev/test database, with cleanup

One Postgres instance for all tests. Each test cleans up after itself (or uses transactions that get rolled back). Faster than spinning up new DBs but tests must be careful not to interfere.

3. An in-memory or local SQLite

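An in-memory or file-based SQLite database can stand in for Postgres if your ORM and schema support both. It avoids the network round trip, so it is much faster than a real cloud Postgres. Caveat: feature mismatches between SQLite and Postgres (column types, JSON support, constraint behavior) can hide bugs.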

For Supabase projects: use a dedicated Supabase project as your test database, or run Postgres in Docker locally. Don’t run integration tests against production. Ever.
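One way to make "never against production" hard to violate by accident is a guard that runs before any test touches the database. The sketch below uses a hypothetical helper name (assertSafeTestDatabase) and an assumed allow-list of local hosts; adapt both to wherever your test database actually lives.

```typescript
// Hypothetical guard for a test setup file: refuse to run unless the
// database URL points at an allowed host. The allow-list is an assumption.
const ALLOWED_TEST_HOSTS = ["localhost", "127.0.0.1"];

function assertSafeTestDatabase(databaseUrl: string): void {
  const host = new URL(databaseUrl).hostname;
  if (!ALLOWED_TEST_HOSTS.includes(host)) {
    throw new Error(
      `Refusing to run integration tests against "${host}". ` +
        `Allowed hosts: ${ALLOWED_TEST_HOSTS.join(", ")}`
    );
  }
}

// Passes silently for a local test database.
assertSafeTestDatabase("postgresql://postgres:test@localhost:5433/postgres");
```

Calling this once at the top of the test setup, with the DATABASE_URL the tests are about to use, turns a misconfigured environment into an immediate, readable failure instead of deleted production rows.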

A common production-realistic pattern with Supabase:

# Run tests against a local Postgres
docker run -d --name testdb -e POSTGRES_PASSWORD=test -p 5433:5432 postgres:16
DATABASE_URL=postgresql://postgres:test@localhost:5433/postgres npx vitest

Each test starts a transaction and rolls back at the end → near-instant state reset.

Component integration tests with real data fetching

A common pattern in modern React testing: render a component INCLUDING its data fetching, mock the network layer (not the component), and verify the rendered output.

// components/PostList.tsx
"use client";
import { useEffect, useState } from "react";
 
export function PostList() {
  const [posts, setPosts] = useState<{ id: string; title: string }[] | null>(null);
 
  useEffect(() => {
    fetch("/api/posts").then(r => r.json()).then(setPosts);
  }, []);
 
  if (posts === null) return <p>Loading…</p>;
  if (posts.length === 0) return <p>No posts yet</p>;
  return (
    <ul>{posts.map(p => <li key={p.id}>{p.title}</li>)}</ul>
  );
}

// components/PostList.test.tsx
import { describe, it, expect, beforeAll, afterAll, afterEach } from "vitest";
import { render, screen, waitFor } from "@testing-library/react";
import { http, HttpResponse } from "msw";
import { setupServer } from "msw/node";
import { PostList } from "./PostList";
 
const server = setupServer();
beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());
 
describe("PostList", () => {
  it("renders posts from the API", async () => {
    server.use(
      http.get("/api/posts", () => HttpResponse.json([
        { id: "1", title: "Hello" },
        { id: "2", title: "World" },
      ])),
    );
 
    render(<PostList />);
    expect(screen.getByText("Loading…")).toBeInTheDocument();
 
    await waitFor(() => {
      expect(screen.getByText("Hello")).toBeInTheDocument();
      expect(screen.getByText("World")).toBeInTheDocument();
    });
  });
 
  it("shows empty state when the API returns no posts", async () => {
    server.use(http.get("/api/posts", () => HttpResponse.json([])));
 
    render(<PostList />);
    await waitFor(() => {
      expect(screen.getByText("No posts yet")).toBeInTheDocument();
    });
  });
});

The key tool: MSW (Mock Service Worker). It intercepts fetch calls at the network layer and returns canned responses. The component’s actual fetch runs; MSW just answers it. This catches bugs that mocking the component itself would miss (wrong endpoint, wrong body shape, race conditions).

For a Next.js + Supabase project, MSW plus React Testing Library (RTL) is a common component integration testing stack.
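A "wrong endpoint" bug is easiest to spot when unhandled requests fail loudly. The sketch below is a shared setup file, assuming MSW 2.x and Vitest. The file name vitest.setup.ts is hypothetical, and it assumes the file is registered through Vitest's setupFiles option.

```typescript
// vitest.setup.ts (hypothetical file name, registered via `setupFiles`)
import { beforeAll, afterEach, afterAll } from "vitest";
import { setupServer } from "msw/node";

// No default handlers: each test declares the endpoints it expects.
export const server = setupServer();

// "error" makes any request without a matching handler reject and log an
// error instead of silently going out to the real network.
beforeAll(() => server.listen({ onUnhandledRequest: "error" }));
afterEach(() => server.resetHandlers());
afterAll(() => server.close());
```

With this in place, a component that calls /api/post instead of /api/posts produces an explicit "no matching handler" error rather than a test that sits on "Loading…" until waitFor times out.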

Tools and frameworks

Most integration testing in JS uses the SAME runner as your unit tests, just with broader scope:

Tool	Purpose
Vitest / Jest	Test runner (same as unit)
React Testing Library	Component rendering
MSW (Mock Service Worker)	Mock HTTP at the network layer
Testcontainers	Programmatically spin up real databases / services in Docker
Prisma / Drizzle / Supabase test fixtures	Database setup + teardown helpers
Supertest (Node)	Hit Express/Fastify endpoints in-memory
fetch against a running dev server	Simple integration approach for HTTP APIs

For Bible Quest-style projects (Next.js + Supabase), the standard stack is Vitest + RTL + MSW + a real local Postgres or test-only Supabase project.
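Because the runner is the same, separating integration tests from unit tests is mostly a configuration question. The fragment below is a sketch of a dedicated Vitest config. The file names and the .integration.test naming convention are assumptions, not something this stack requires.

```typescript
// vitest.integration.config.ts (hypothetical file name)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Assumed naming convention for integration tests.
    include: ["**/*.integration.test.{ts,tsx}"],
    // Component tests need a DOM; API-only suites can stay on the default "node".
    environment: "jsdom",
    // Hypothetical setup file (MSW server, database guard, and so on).
    setupFiles: ["./vitest.setup.ts"],
    // More headroom than Vitest's 5-second default for slow CI machines.
    testTimeout: 15_000,
    // Run test files one at a time so they do not race on a shared database.
    fileParallelism: false,
  },
});
```

It would be run with npx vitest run --config vitest.integration.config.ts. Path aliases such as @/ still need to be configured separately for the imports in the examples above to resolve.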

Setup, teardown, and isolation

Integration tests need state. State accumulates. Bad state from one test corrupts the next. Three patterns:

Per-test transactions (cleanest)

Wrap each test in a database transaction that’s rolled back at the end. No data leaks; resets are free.

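// Caveat: BEGIN, the test's queries, and ROLLBACK must all run on the same
// connection. A pooled client may hand each call a different connection, so
// pin the pool to one connection in tests or use a single dedicated client.
beforeEach(async () => {
  await db.$executeRaw`BEGIN`;
});
 
afterEach(async () => {
  await db.$executeRaw`ROLLBACK`;
});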

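Works well with Postgres + Prisma/Drizzle, provided every query in the test runs on the same connection. Some frameworks automate this (Django’s TestCase and Rails’ transactional tests wrap each test in a transaction); in JS it’s manual but worth setting up.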
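The manual setup can be as small as one wrapper. The sketch below assumes a client pinned to a single connection; the SqlClient interface is a hypothetical minimal shape (loosely like a node-postgres Client), and the fake client exists only to show the order of statements without needing a database.

```typescript
// Minimal interface for a client pinned to ONE connection (an assumption;
// a pooled client would not guarantee that all statements share a connection).
interface SqlClient {
  query(sql: string): Promise<unknown>;
}

// Runs `body` inside a transaction and always rolls back, even if it throws.
async function withRollback<T>(
  client: SqlClient,
  body: (client: SqlClient) => Promise<T>
): Promise<T> {
  await client.query("BEGIN");
  try {
    return await body(client);
  } finally {
    await client.query("ROLLBACK");
  }
}

// Fake client that records statements, used here only to show the call order.
const statements: string[] = [];
const fakeClient: SqlClient = {
  query: async (sql) => {
    statements.push(sql);
    return undefined;
  },
};

withRollback(fakeClient, (c) =>
  c.query("INSERT INTO posts (title) VALUES ('Hello')")
).then(() => console.log(statements.join(" | ")));
// BEGIN | INSERT INTO posts (title) VALUES ('Hello') | ROLLBACK
```

The try/finally is the important part: a failing assertion inside the body still triggers the ROLLBACK, so a red test does not leave rows behind for the next one.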

TRUNCATE between tests

Delete all rows in relevant tables before each test:

beforeEach(async () => {
  await db.$executeRaw`TRUNCATE TABLE posts, comments, users RESTART IDENTITY CASCADE`;
});

Slower than transactions but simpler to reason about.
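As the schema grows, the table list in that statement tends to drift. One option is to build the statement from a single list. This is a sketch with a hypothetical helper name (buildTruncateSql); table names are validated and quoted because identifiers cannot be passed as bind parameters.

```typescript
// Hypothetical helper: builds one TRUNCATE statement for a list of tables.
// Names are checked against a conservative pattern and double-quoted.
function buildTruncateSql(tables: string[]): string {
  if (tables.length === 0) {
    throw new Error("buildTruncateSql needs at least one table");
  }
  const quoted = tables.map((name) => {
    if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
      throw new Error(`Unexpected table name: ${name}`);
    }
    return `"${name}"`;
  });
  return `TRUNCATE TABLE ${quoted.join(", ")} RESTART IDENTITY CASCADE`;
}

console.log(buildTruncateSql(["posts", "comments", "users"]));
// TRUNCATE TABLE "posts", "comments", "users" RESTART IDENTITY CASCADE
```

With Prisma, a dynamically built string like this has to go through $executeRawUnsafe rather than the $executeRaw tagged template, which is why the name validation matters.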

Per-suite fresh database

Spin up a new database for each test file. Costly but maximally isolated.

Pick based on speed and complexity. For most projects, per-test transactions are the right default.

Integration vs E2E — the line

Aspect	Integration	E2E
What’s running	Code + DB + maybe HTTP layer	Real browser + real server + real DB
What’s mocked	External services (Stripe, Anthropic, email)	Minimal — closest to real
Speed	100ms–2s per test	5s–60s per test
Brittleness	Moderate	High (UI flakes, timing)
Where bugs caught	API contracts, DB queries, component data flow	UI flows, browser-specific issues
When run	Every commit	On PR + before deploy

Use integration tests for “does this slice work end-to-end inside my own code?” Use E2E for “does this actually load and click correctly in a real browser?”

In practice, an integration test that exercises POST /api/posts and verifies the database state catches roughly 80% of the bugs that an equivalent E2E test would, at around 1/30th the cost. Treat both numbers as rules of thumb, not measurements.

A specific pattern: testing server actions

For Next.js server actions, integration testing is the right level (E2E is overkill, unit tests need too much mocking):

// app/actions/posts.test.ts
import { describe, it, expect, beforeEach, vi } from "vitest";
import { createPost } from "./posts";
import { db } from "@/lib/db";
import * as auth from "@/lib/auth";
 
describe("createPost server action", () => {
  beforeEach(async () => {
    await db.posts.deleteMany();
    vi.spyOn(auth, "getCurrentUser").mockResolvedValue({ id: "u1", email: "g@example.com" });
  });
 
  it("creates a post for authenticated users", async () => {
    const formData = new FormData();
    formData.set("title", "Test post");
    formData.set("body", "Body");
 
    await createPost(null, formData);
 
    const posts = await db.posts.findMany({ where: { authorId: "u1" } });
    expect(posts).toHaveLength(1);
    expect(posts[0].title).toBe("Test post");
  });
 
  it("returns an error when title is empty", async () => {
    const formData = new FormData();
    formData.set("title", "");
    formData.set("body", "Body");
 
    const result = await createPost(null, formData);
 
    expect(result).toMatchObject({ error: expect.any(String) });
 
    const posts = await db.posts.findMany();
    expect(posts).toHaveLength(0);
  });
});

Mock the auth boundary (the caller’s identity). Use the real DB. Verify both the return value AND the database state.
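Building FormData field by field gets repetitive once an action has more than two inputs. A small helper is one option; the sketch below uses a hypothetical name (toFormData) and assumes all fields are strings, which matches the example action above but not file uploads.

```typescript
// Hypothetical helper: turns a plain object into the FormData a server
// action receives. FormData is a global in Node 18+ and in browsers.
function toFormData(fields: Record<string, string>): FormData {
  const formData = new FormData();
  for (const [name, value] of Object.entries(fields)) {
    formData.set(name, value);
  }
  return formData;
}

const formData = toFormData({ title: "Test post", body: "Body" });
console.log(formData.get("title")); // Test post
```

A test would then read createPost(null, toFormData({ title: "", body: "Body" })), which keeps the interesting input visible on one line.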

Common gotchas

Tests against shared databases drift. Two test runs simultaneously hitting the same database race each other. Use per-test transactions, isolated dev DBs per developer, or run tests serially in CI.
Cleanup that fails leaves rotting data. A test crash before cleanup leaves rows that the next run trips over. Always clean up in beforeEach, not afterEach, so each test starts from a known state even if the previous one crashed.
External APIs in integration tests = flaky tests. Real Stripe in your tests means real network calls, real rate limits, real outages. Mock external APIs (MSW), use sandbox environments, or run those tests separately (E2E pipeline).
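MSW intercepts at the network boundary of the process it runs in. Requests made elsewhere are not intercepted (e.g. server components rendered by a separately running Next.js server fetch from that server’s Node process and bypass the MSW instance in your test process). Verify what’s actually being intercepted.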
Timing is hard. waitFor polls until a condition is true; default timeout is 1000ms. A slow CI machine can blow past it. Set explicit timeouts.
Test data must be REALISTIC. A test that uses { title: "test" } may pass when production data has unicode, emoji, RTL text, very long strings. Cover edge cases.
Foreign key constraints make cleanup tricky. TRUNCATE posts fails if comments references it. Use CASCADE or delete in dependency order.
Transactions break some database features. Postgres advisory locks, sequences, certain extensions behave differently inside long-running transactions. Mostly fine but be aware.
beforeAll for shared expensive setup; beforeEach for per-test state. Setting up MSW happens in beforeAll; resetting database state happens in beforeEach. Don’t conflate them.
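Component tests with real fetch may need a fetch polyfill. jsdom itself does not implement fetch; tests rely on the global fetch that Node 18+ provides. Older Node versions, or test environments that hide Node’s globals, may need undici or cross-fetch.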
Server components are hard to integration-test in isolation. A Next.js Server Component runs in the framework’s render pipeline. Direct invocation is awkward. Either E2E test the page or extract logic into testable utilities.
Don’t mock your own code. Mock external boundaries (database, third-party APIs, file system). Mocking your own functions defeats the purpose of integration testing.
Database connection limits matter. Each parallel test runner opens connections. 10 parallel runners × 10 tests each can exhaust a small Postgres connection pool. Use connection pooling or run tests serially.
process.env changes mid-test affect global state. If a test mutates env vars, it affects subsequent tests. Restore in afterEach.
Time-sensitive integration tests are fragile. “Created in the last 5 seconds” passes locally and fails in CI when the test runs slower. Use fake timers or generous tolerances.
CI machines have different performance characteristics. A test that’s fast on your M3 MacBook can be 5x slower on a small CI runner. Set generous timeouts but with upper bounds.
Errors in setup are catastrophic. A failure in beforeAll skips all tests in the suite, often with a confusing error. Log loudly in setup; assert prerequisites explicitly.
Snapshot tests for component output are usually a smell. A 100-line DOM snapshot fails on any tiny change. Prefer behavioral assertions (getByText, getByRole).
Test isolation means no shared module state. A module-level singleton (DB client, fetch instance, cache) survives across tests and leaks state. Reset or recreate between tests.
The integration test that’s actually a unit test in disguise. If you mock everything and just call one function, it’s a unit test. Real integration tests touch multiple boundaries.
Don’t integration-test what unit tests cover. If formatCurrency has 7 unit tests, your integration test doesn’t need to cover every currency case again. Test that it’s INVOKED correctly.
Failure messages should help debug. A test that asserts expect(result).toBeTruthy() and fails tells you nothing. Use specific assertions: expect(result).toBe(42), expect(post.title).toEqual("Hello").
Don’t catch exceptions in integration tests. Letting a real error throw to the test runner gives you the stack trace. Catching it and wrapping in your own error obscures the bug.
Run integration tests in a separate CI job. They’re slower than units. Splitting them lets unit tests fail fast and gives integration tests their own log surface.
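For the time-sensitive gotcha above, a third option besides fake timers and generous tolerances is to pass the current time in as a parameter. The sketch below uses a hypothetical function (isRecent) and the 5-second window from that example.

```typescript
// Hypothetical example: the function takes "now" as a parameter instead of
// calling Date.now() itself, so a test can supply a fixed instant.
function isRecent(createdAt: Date, now: Date, windowMs = 5_000): boolean {
  const ageMs = now.getTime() - createdAt.getTime();
  return ageMs >= 0 && ageMs <= windowMs;
}

const createdAt = new Date("2026-01-01T00:00:00.000Z");

// Deterministic: the answer does not depend on how fast the machine is.
console.log(isRecent(createdAt, new Date("2026-01-01T00:00:04.000Z"))); // true
console.log(isRecent(createdAt, new Date("2026-01-01T00:00:06.000Z"))); // false
```

Production code passes new Date() for now; the test passes a fixed value, so a slow CI runner cannot change the result.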

When integration tests are the wrong tier

Sometimes a unit test is enough:

Pure function with no external dependencies → unit
Component that doesn’t fetch data → unit
Single-purpose utility (formatter, validator) → unit

Sometimes E2E is required:

Multi-page user flow (signup → email confirmation → first action) → E2E
Browser-specific behavior (clipboard, geolocation, file upload UX) → E2E
“Does the deployed app work?” → E2E

Most cases that look like “I need to test how things fit together” are integration territory.

Sources

Kent C. Dodds — The Testing Trophy — argues for integration as the largest tier
Mock Service Worker (MSW) — network-layer mocking
Testing Library docs — component testing in any framework
Vitest docs
Testcontainers — programmatic Docker-based real dependencies
Next.js — Testing — framework-specific patterns
Martin Fowler — IntegrationTest
