Testing Python

Async Testing

  • pytest-asyncio==1.3.0 is already listed in pyproject.toml, so test functions can be declared async and use await directly.
  • Prefer explicit marks for readability:
import pytest
from app.routes import metadata_router_v1

@pytest.mark.asyncio
async def test_health_handler_returns_ok():
    body = await metadata_router_v1.health_check()
    assert body == {"status": "ok"}
  • When testing coroutine utilities (e.g., async validators or repository calls), keep the Arrange/Act/Assert flow inside the async context to avoid event-loop clashes. pytest-asyncio's auto mode can pick up async tests without marks, but an explicit @pytest.mark.asyncio communicates intent.
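
Where the unit under test awaits a collaborator, unittest.mock.AsyncMock keeps such tests deterministic; a minimal sketch (the repository object and its fetch_items method are hypothetical):

import pytest
from unittest.mock import AsyncMock


@pytest.mark.asyncio
async def test_service_awaits_repository():
    # AsyncMock returns awaitable results and records how they were awaited.
    repository = AsyncMock()
    repository.fetch_items.return_value = [{"id": "abc"}]

    items = await repository.fetch_items(entity_type="server")

    repository.fetch_items.assert_awaited_once_with(entity_type="server")
    assert items == [{"id": "abc"}]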

FastAPI Testing

Choose the correct client based on how the endpoint is implemented:

  • TestClient (fastapi.testclient.TestClient): Ideal when the test function itself is synchronous. Provides a requests-like interface, runs the app in-process, and also supports WebSocket testing via websocket_connect.
  • AsyncClient (httpx.AsyncClient): Use when the test function is async, so you can await .get()/.post() alongside other coroutines on the same event loop (pair it with httpx.ASGITransport to call the app in-process).

The project provides shared fixtures in tests/conftest.py that follow a Composition Root approach: they bypass app/main.py and instantiate FastAPI via create_app in app/core/factory.py. Tests automatically pick up the test_settings, test_app, client, and async_client fixtures without explicit imports, so prefer them over creating copies in individual modules.
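
A rough sketch of what those fixtures can look like, assuming create_app accepts a Settings instance and that Settings is importable from app/core/config (adjust the imports and constructor call to the real APIs):

# tests/conftest.py (illustrative sketch only)
import pytest
import pytest_asyncio
from fastapi.testclient import TestClient
from httpx import ASGITransport, AsyncClient

from app.core.config import Settings  # assumed location
from app.core.factory import create_app


@pytest.fixture
def test_settings() -> Settings:
    return Settings(allow_missing_config=True)


@pytest.fixture
def test_app(test_settings):
    return create_app(test_settings)


@pytest.fixture
def client(test_app):
    with TestClient(test_app) as c:
        yield c


@pytest_asyncio.fixture
async def async_client(test_app):
    # ASGITransport calls the app in-process, so no server needs to run.
    transport = ASGITransport(app=test_app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac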

Mocking patterns:

  • Override dependencies exposed via get_settings in app/core/dependencies.py to supply in-memory settings: app.dependency_overrides[get_settings] = lambda: Settings([ELIDED]) (see the sketch after the note below).
  • Use unittest.mock or pytest monkeypatching to intercept future MCP server calls or outbound HTTP requests so component tests remain deterministic.

Note: The shared test_settings fixture already passes allow_missing_config=True, so only override get_settings when you need to test specific configuration permutations.
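
For example, a hedged sketch of a per-test override that cleans up after itself (the Settings arguments shown are only the ones mentioned above):

def test_health_with_overridden_settings(test_app, client):
    from app.core.config import Settings  # assumed location
    from app.core.dependencies import get_settings

    # Override for this test only, then remove it so other tests see the defaults.
    test_app.dependency_overrides[get_settings] = lambda: Settings(allow_missing_config=True)
    try:
        response = client.get("/api/metadata/v1/health")
        assert response.status_code == 200
    finally:
        test_app.dependency_overrides.pop(get_settings, None)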

With these fixtures in place you can assert the JSON payload of /api/metadata/v1/health or verify that /api/metadata/v1/fetch returns serialized MetadataItem records that honor the contract defined in app/contracts/metadata_contract.py.

Using the Fixtures

# Sync endpoint test
def test_health_endpoint(client):
    response = client.get("/api/metadata/v1/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}


# Async endpoint test
import pytest

@pytest.mark.asyncio
async def test_sample_endpoint(async_client):
    response = await async_client.get("/api/metadata/v1/sample")
    assert response.status_code == 200
    data = response.json()
    assert data["type"] == "server"

Coverage & Reporting

pytest --cov=app
pytest --cov=app --cov-report=html        # produces htmlcov/index.html
pytest --cov=app --cov-report=term-missing

Store shared coverage defaults in pyproject.toml under [tool.coverage.run] and [tool.coverage.report] (e.g., omit = ["tests/*"], fail_under = 80). Failing the CI build when coverage dips below an agreed threshold keeps regressions visible.
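
A possible starting point for those sections (the values are examples, not project decisions):

# pyproject.toml
[tool.coverage.run]
source = ["app"]
omit = ["tests/*"]

[tool.coverage.report]
fail_under = 80
show_missing = true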

Additional Best Practices

  • Reusable fixtures: For Settings tests, spin up temporary TOML files with tmp_path and point Settings at them, avoiding duplication (see the sketch after this list). For metadata contracts, prebuild sample payload dictionaries reused across tests.
  • Lint before tests: Run uv run ruff check . ahead of pytest to catch import or style regressions before they fail assertions; see README.md for the full lint/format workflow, and rely on .pre-commit-config.yaml if you prefer automated checks on every commit.
  • Parametrization: Apply @pytest.mark.parametrize to cover multiple entity types in FetchRequest or edge cases for MetadataItem field validation without writing separate test functions.
  • Mirrored structure: Keep helper modules in tests/conftest.py or nested conftest.py files that shadow the runtime package layout, making it obvious where to extend fixture logic as the service grows.
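
A hedged sketch of the tmp_path-based Settings fixture mentioned above (how Settings consumes the file depends on its real API, so that step is left as a comment):

import pytest


@pytest.fixture
def settings_toml_path(tmp_path):
    # One throwaway TOML file per test; the contents are illustrative only.
    config_file = tmp_path / "settings.toml"
    config_file.write_text('service_name = "metadata-test"\n')
    # Tests then build Settings from this path via the project's loader,
    # e.g. Settings(config_path=config_file)  # argument name is hypothetical.
    return config_file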

Invest in these patterns now so future MCP skills can land with tests that demonstrate expected behavior across async workflows and FastAPI interactions.

Overview

Note: For async tests and FastAPI integration (e.g., using httpx.AsyncClient or fastapi.testclient.TestClient), follow the FastAPI testing patterns and fixtures defined in this repo (e.g., shared test_app or client fixtures, event loop fixtures, etc.).

Writing Tests

AAA + Traversal Rules

Adopt Arrange–Act–Assert with soft-style assertions and extracted traversals/conditions:

  • Arrange, Act, Assert in order; keep test bodies linear and readable.
  • Extract traversal and conditions into helpers (generators or pure functions).
  • Use “soft-style” assertions for multiple checks by collecting failures and asserting once at the end of the test; keep all assertion logic in the test body.
  • Avoid if statements in the test body; encode branching inside traversal helpers.
  • Use loops in the test only to iterate over traversal outputs (no ad-hoc iteration over raw nested structures).
  • Prefer parameterized tests (@pytest.mark.parametrize) to cover scenarios.
  • Group initialization/related tests with nested classes or modules; use fixtures (@pytest.fixture) and autouse fixtures instead of per-test setup where possible.

Example — traversal + soft-style assertions (Python/pytest-idiomatic):

import pytest

class NumberGeneratorService:
    def __init__(self, *, count: int, size: int, min: int, max: int) -> None:
        self.count = count
        self.size = size
        self.min = min
        self.max = max

    def generate_arrays(self) -> list[list[int]]:
        # Dummy implementation for illustration only
        import random

        result: list[list[int]] = []
        for _ in range(self.count):
            arr: list[int] = []
            while len(arr) < self.size:
                value = random.randint(self.min, self.max)
                if value not in arr:
                    arr.append(value)
            result.append(arr)
        return result


def number_generator_test_cases() -> list[tuple[int, int, int, int]]:
    return [
        (1, 5, 0, 10),
        (5, 5, 0, 10),
        (3, 1, 0, 10),
        (3, 5, 0, 10),
        (3, 1, 0, 0),
        (3, 5, 5, 10),
    ]


from typing import Generator, Iterable


def array_stream(result_arrays: Iterable[list[int]]) -> Generator[dict, None, None]:
    for array_index, array in enumerate(result_arrays):
        yield {"array_index": array_index, "array": array}


def value_stream(result_arrays: Iterable[list[int]]) -> Generator[dict, None, None]:
    for array_index, arr in enumerate(result_arrays):
        for value_index, value in enumerate(arr):
            yield {
                "array_index": array_index,
                "value_index": value_index,
                "value": value,
            }


@pytest.mark.parametrize(
    "count, size, range_min, range_max",
    number_generator_test_cases(),
    ids=lambda p: str(p),
)
def test_number_generator_service(count: int, size: int, range_min: int, range_max: int) -> None:
    # Arrange
    service = NumberGeneratorService(
        count=count,
        size=size,
        min=range_min,
        max=range_max,
    )

    # Act
    result = service.generate_arrays()

    # Assert (soft-style: collect all failures, assert once)
    errors: list[str] = []

    if len(result) != count:
        errors.append(f"expected {count} arrays, got {len(result)}")

    for item in array_stream(result):
        array_index = item["array_index"]
        array = item["array"]

        if len(array) != size:
            errors.append(f"array[{array_index}] length expected {size}, got {len(array)}")

    for item in value_stream(result):
        array_index = item["array_index"]
        value_index = item["value_index"]
        value = item["value"]

        if value < range_min:
            errors.append(
                f"min violation at [{array_index}][{value_index}]: "
                f"value {value} < {range_min}"
            )
        if value > range_max:
            errors.append(
                f"max violation at [{array_index}][{value_index}]: "
                f"value {value} > {range_max}"
            )

    for item in array_stream(result):
        array_index = item["array_index"]
        array = item["array"]
        distinct = len(set(array))
        if distinct != len(array):
            errors.append(
                f"duplicate values in array[{array_index}]: "
                f"{array} (distinct={distinct}, len={len(array)})"
            )

    assert not errors, ";\n".join(errors)

Notes:

  • Assertions remain in the test; traversal helpers only expose structure (array_stream, value_stream).
  • “Soft-style” behavior is implemented by collecting all failures into errors and asserting once; this surfaces all violations in one test run instead of failing fast on the first mismatch.
  • Use @pytest.mark.parametrize when each tuple should produce a separate test; use shared fixtures when multiple tests need the same setup or FastAPI app/client.
  • Prefer generator functions (def [ELIDED] -> Generator[ELIDED]) for traversals over building large intermediate lists to keep memory usage low and intent clear.
  • For FastAPI routes, follow the same AAA and traversal principles when asserting on JSON responses, headers, and status codes (e.g., traverse response payloads via helpers instead of inline nested loops/ifs in the test body).
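
A sketch of that idea against the fetch endpoint mentioned earlier (the HTTP method and response shape are assumptions; adapt them to the real contract):

from typing import Generator


def record_stream(payload: list[dict]) -> Generator[dict, None, None]:
    # Helper exposes structure only; assertions stay in the test body.
    for index, record in enumerate(payload):
        yield {"index": index, "record": record}


def test_fetch_returns_well_formed_records(client):
    response = client.get("/api/metadata/v1/fetch")  # method and shape assumed for illustration
    errors: list[str] = []

    if response.status_code != 200:
        errors.append(f"expected 200, got {response.status_code}")

    for entry in record_stream(response.json()):
        if "id" not in entry["record"]:
            errors.append(f"record[{entry['index']}] missing 'id'")

    assert not errors, ";\n".join(errors)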

Soft vs. Hard Assertions (pytest)

Prefer “soft-style” assertions (aggregate failures) for pytest tests in this project.

  • Soft-style assertions collect all failures and report them together at the end of the test, improving triage.
  • Hard assertions (single assert / pytest.fail) are allowed only when an immediate fail-fast is essential (e.g., validating a precondition before an expensive or destructive step).
  • Recommended pattern for soft-style assertions: collect error messages in a list and assert once at the end.

Example — preferred soft-style pattern:

def test_user_model_soft_style() -> None:
    user = get_user_from_db()

    errors: list[str] = []

    if user.name != "Jane":
        errors.append(f"expected name 'Jane', got {user.name!r}")

    if user.age <= 18:
        errors.append(f"expected age > 18, got {user.age}")

    assert not errors, ";\n".join(errors)

Example — allowed but use sparingly (hard fail-fast):

def test_user_model_precondition() -> None:
    user = get_user_from_db()

    # Fail-fast if user is missing entirely; following checks depend on this.
    assert user is not None

    # The rest can use soft-style if multiple conditions are checked.

Static analysis / lint rule nuance:

  • Some linters (e.g., ruff, flake8 plugins, Sonar) may expect at least one direct assert or pytest assertion per test and may not “see” your soft-style pattern if you hide everything behind helpers.
  • If a false positive appears for a test that clearly uses the soft-style pattern correctly, add a targeted disable comment at the top of the file or near the test:
# noqa: <rule code>  (use the specific code the tool reports, e.g. a ruff rule or Sonar rule ID)
  • Use this sparingly and only when the soft-style aggregation is used correctly and intentionally.

Async Test Patterns (FastAPI + pytest + httpx/clients)

For async tests (FastAPI endpoints, async services, async DB calls), distinguish between:

  1. Async operations that produce values (e.g., await client.get([ELIDED]), await service.do_work()).
  2. Plain value assertions (synchronous assert on the result).

Good patterns:

import pytest
from httpx import AsyncClient
from myapp.main import app  # FastAPI app


@pytest.mark.asyncio
async def test_healthcheck(async_client: AsyncClient) -> None:
    # Good: await the HTTP call, then assert synchronously
    response = await async_client.get("/health")
    errors: list[str] = []

    if response.status_code != 200:
        errors.append(f"expected 200, got {response.status_code}")

    data = response.json()
    if data.get("status") != "ok":
        errors.append(f"expected status 'ok', got {data.get('status')!r}")

    assert not errors, ";\n".join(errors)

@pytest.mark.asyncio
async def test_service_async_call(service: "MyService") -> None:
    # Await the producer, not the assertion
    result = await service.compute()

    errors: list[str] = []
    if result.total <= 0:
        errors.append(f"expected positive total, got {result.total}")
    if "summary" not in result.metadata:
        errors.append("missing 'summary' in metadata")

    assert not errors, ";\n".join(errors)

Avoid these patterns:

@pytest.mark.asyncio
async def test_bad_unawaited_call(async_client: AsyncClient) -> None:
    # ❌ Forgetting to await async operation: response is a coroutine, not a Response
    response = async_client.get("/health")  # missing await
    # assert response.status_code == 200  # will fail in confusing ways

@pytest.mark.asyncio
async def test_hiding_coroutines(async_client: AsyncClient) -> None:
    # ❌ Passing coroutines into helpers without awaiting inside the helper
    def check_response(resp) -> list[str]:
        # resp is a coroutine here, not the actual Response
        errors: list[str] = []
        # Any attribute access will be wrong
        return errors

    response = async_client.get("/health")  # missing await
    errors = check_response(response)
    assert not errors

Rule of thumb:

  • If the subject is an async operation (HTTP call, DB call, background task, etc.), always await it before asserting:

    • response = await async_client.get("/path")
    • result = await service.compute()
  • Assertions themselves are synchronous: assert on values, not on coroutines:

    • assert response.status_code == 200
    • Use the soft-style error aggregation pattern when you have multiple conditions.

Common pitfalls that lead to “hanging” or flaky async tests:

  • Forgetting to await async calls (e.g., client.get([ELIDED]) without await).
  • Spawning background tasks (asyncio.create_task, FastAPI background tasks, websockets) that keep running after the test ends.
  • Long/never-resolving waits (await asyncio.sleep with large values, await queue.get() without a producer); a bounded-wait sketch follows this list.
  • Leaving open resources (unclosed AsyncClient, DB connections, server processes) when not using properly scoped fixtures.
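
For the never-resolving-wait pitfall, bound the await so a missing producer fails the test instead of hanging the suite; a minimal sketch (queue_under_test is a hypothetical fixture):

import asyncio

import pytest


@pytest.mark.asyncio
async def test_consumer_receives_message(queue_under_test):
    # Fail after 2 seconds rather than waiting forever if nothing is produced.
    message = await asyncio.wait_for(queue_under_test.get(), timeout=2.0)
    assert message is not None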

Quick debugging tips:

  • Temporarily reduce timeouts in your app or client configuration for tests (e.g., HTTP client timeout).

  • Add explicit timeouts for awaits that depend on external systems or background work.

  • Search for:

    • Unawaited coroutines (often visible as warnings in test output).
    • Long sleeps / waits and queues without producers.
  • Use pytest’s verbose mode (-vv) and, if available, logging in your FastAPI app to see which request/operation was last started before the test stalled.

Integration Tests

Integration tests validate behavior across multiple layers of the application: FastAPI routes, service layer, persistence/DB, background tasks, and infrastructure boundaries.

Categories:

  • API-level tests: FastAPI route handlers + service layer + DB + dependencies.
  • Service-level integration tests: service logic + DB or external components (e.g., cache, message queue).
  • Middleware integration tests: custom middleware for auth, rate limiting, request shaping, ID injection, locale inference, feature flags, etc.

Directory structure (example):

tests/integration/api/
tests/integration/services/
tests/integration/middleware/

Tools:

  • pytest
  • httpx.AsyncClient or fastapi.testclient.TestClient
  • SQLAlchemy test session / transaction rollbacks (see the rollback fixture sketch after this list)
  • Testcontainers for real DBs or an in-memory SQLite DB for lightweight tests
  • Fixtures for app, DB, and clients
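
A common shape for the transaction-rollback session fixture, assuming SQLAlchemy 2.x and an existing engine fixture:

import pytest
from sqlalchemy.orm import Session


@pytest.fixture
def db_session(engine):
    # Run each test inside a transaction that is rolled back afterwards,
    # so no test leaves rows behind for the next one.
    connection = engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)
    try:
        yield session
    finally:
        session.close()
        transaction.rollback()
        connection.close()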

Naming: use a suffix pytest can still collect, e.g. test_*_int.py (pytest discovers test_*.py / *_test.py by default, and dotted names such as *.int.test.py are not importable as modules).

Target performance: < 200ms per test (when using in-memory DB or Testcontainers with reuse). Coverage target: ~70% (lower than unit tests).

See tests/integration/README.md for patterns and examples if present.

Automated Migration (from hard to soft-style assertions)

A codemod may convert direct assert statements into “soft-style” aggregated checks by:

  • Wrapping multiple individual conditions into a list of errors.
  • Preserving await for async operations.
  • Ensuring the final line asserts once on the aggregated errors.

Run across unit, integration, and E2E tests:

uv run python tools/codemod_expect_soft.py

After running, manually review async tests to ensure:

  • All HTTP calls, DB calls, and async operations are awaited.

  • Assertions on plain values use the soft-style pattern:

    • value = await fn()
    • errors.append([ELIDED]) for each condition
    • assert not errors, ";\n".join(errors)
  • Async functions never wrap assertions inside unawaited coroutines.

Nested blocks for initialization and related tests:

import pytest
import pytest_asyncio
from httpx import AsyncClient
from fastapi import FastAPI

@pytest.mark.asyncio
class TestDatabase:
    # Async fixtures need pytest_asyncio.fixture (or asyncio_mode = "auto");
    # a class-scoped async fixture may also need a matching event-loop scope.
    @pytest_asyncio.fixture(scope="class", autouse=True)
    async def setup_db(self):
        # Establish connection
        client = await create_db_client()
        yield client
        await client.close()

    class TestUserRepository:
        @pytest.mark.asyncio
        async def test_creates_and_fetches_user(self, setup_db):
            errors: list[str] = []

            # Arrange / Act via helpers[ELIDED]
            # push to errors for any violations

            assert not errors, ";\n".join(errors)

Test Structure (AAA)

Follow the Arrange–Act–Assert sequence strictly:

import pytest
from myapp.logic import my_function

def test_my_function_returns_expected_result():
    # Arrange
    input_value = "test"
    expected = "TEST"

    # Act
    result = my_function(input_value)

    # Assert
    assert result == expected

For integration tests, expand the same structure:

@pytest.mark.asyncio
async def test_get_user_success(async_client: "AsyncClient"):
    # Arrange
    user_id = await seed_user(async_client, name="Jane")

    # Act
    response = await async_client.get(f"/users/{user_id}")
    data = response.json()

    # Assert (soft-style)
    errors: list[str] = []

    if response.status_code != 200:
        errors.append(f"expected 200, got {response.status_code}")

    if data.get("name") != "Jane":
        errors.append(f"expected name 'Jane', got {data.get('name')!r}")

    assert not errors, ";\n".join(errors)

Patterns:

  • Tests remain linear and readable.
  • Traversals or cross-record checks should be extracted into helpers (generators or pure functions).
  • Assertions stay in the test body; helpers return structures or iterables, not pass/fail results.
  • Prefer parameterized integration scenarios when validating multiple cross-layer flows.

This aligns integration testing for FastAPI/pytest with the same AAA, soft-style, and traversal principles used across the test suite.

Test naming (pytest)

Use descriptive names that explain:

  • What is being tested
  • Under what conditions
  • What the expected outcome is
# ✅ Good
def test_raises_error_when_input_is_none():
    [ELIDED]

def test_returns_empty_list_when_no_items_match_filter():
    [ELIDED]
# ❌ Bad
def test_works():
    [ELIDED]

def test_test_1():
    [ELIDED]

General patterns that work well in Python:

  • test_<method>_when_<condition>_then_<expected>()
  • test_<thing_under_test>_<expected_behavior>()

Test organization

Rough equivalent of nested describe blocks is:

  • Test modules per unit (e.g. test_user_service.py)
  • Classes per method or feature group
  • Test functions per behavior
# tests/unit/test_user_service.py

class TestCreateUser:
    def test_creates_user_with_valid_data(self):
        [ELIDED]

    def test_raises_error_when_email_invalid(self):
        [ELIDED]

    def test_hashes_password_before_saving(self):
        [ELIDED]


class TestDeleteUser:
    def test_deletes_user_by_id(self):
        [ELIDED]

    def test_raises_error_when_user_not_found(self):
        [ELIDED]

You can also keep it flat if you prefer:

def test_create_user_with_valid_data():
    [ELIDED]

def test_create_user_raises_error_when_email_invalid():
    [ELIDED]

def test_create_user_hashes_password_before_saving():
    [ELIDED]

def test_delete_user_by_id():
    [ELIDED]

def test_delete_user_raises_error_when_user_not_found():
    [ELIDED]

Running tests

Assuming a standard layout like:

  • src/your_app/ or your_app/
  • tests/unit/
  • tests/integration/
  • tests/e2e/

and using pytest (+ pytest-cov for coverage).

Unit, integration, and all tests

# Unit tests (co-located or under tests/unit/)
pytest tests/unit

# Integration tests
pytest tests/integration

# All tests
pytest

Watch mode is not built into pytest, but you can use ptw (pytest-watch) or pytest-testmon if you want that behavior.
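
For example, with pytest-watch installed as a dev dependency (flags vary by version):

ptw -- tests/unit    # rerun the unit suite whenever files change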

With coverage (pytest-cov)

# Unit test coverage
pytest tests/unit --cov=your_app --cov-report=term-missing

# Integration test coverage
pytest tests/integration --cov=your_app --cov-report=term-missing

# All tests with coverage
pytest --cov=your_app --cov-report=term-missing

Run a specific test file

pytest tests/unit/test_user_service.py

Run a specific test or class

# Single test function
pytest tests/unit/test_user_service.py::test_creates_user_with_valid_data

# Single test class
pytest tests/unit/test_user_service.py::TestCreateUser

Run tests matching a pattern (similar to --grep)

# Match by test name substring / expression
pytest -k "UserService"
pytest -k "create_user and error"

Run tests for a specific package / submodule

# If tests are organized by package
pytest tests/unit/your_app/shared_types
# Or by file pattern
pytest tests -k "shared_types"

E2E tests

If you keep E2E tests under tests/e2e/ (e.g. using Playwright for Python or another E2E tool):

pytest tests/e2e

Or use the specific runner for your E2E framework if it is not pytest-based; just mirror the structure:

  • E2E tests live at tests/e2e/
  • E2E config in the root (e.g. playwright.config.py or equivalent)

Test filtering (focus / skip)

Pytest’s equivalents to it.only, it.skip, etc.:

Run only this test (focus)

Simplest approach: use node ids or -k:

pytest tests/unit/test_user_service.py::TestCreateUser::test_creates_user_with_valid_data

or

pytest -k "creates_user_with_valid_data"

You can also use markers like @pytest.mark.focus and run pytest -m focus, if you define such a convention.
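
If you adopt that convention, register the marker so pytest does not warn about an unknown mark; a sketch (marker name and description are a local choice):

# pyproject.toml
# [tool.pytest.ini_options]
# markers = ["focus: temporarily focus on a subset of tests"]

import pytest


@pytest.mark.focus
def test_currently_under_investigation():
    ...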

Skip a test

import pytest


@pytest.mark.skip(reason="not implemented yet")
def test_should_skip_this_test():
    [ELIDED]

Conditional skip:

@pytest.mark.skipif(condition, reason="explanation")
def test_skipped_on_condition():
    [ELIDED]

Mark a test as expected to fail

Roughly analogous to “this is currently broken”:

@pytest.mark.xfail(reason="known bug, tracking in ISSUE-123")
def test_currently_failing_behavior():
    [ELIDED]

Skip or focus groups of tests (describe-level equivalent)

Use class-level decorators:

import pytest


@pytest.mark.skip(reason="UserService tests temporarily disabled")
class TestUserService:
    def test_creates_user_with_valid_data(self):
        [ELIDED]

    def test_raises_error_when_email_invalid(self):
        [ELIDED]

Or run a specific class via node id as shown earlier instead of describe.only:

pytest tests/unit/test_user_service.py::TestUserService

Test coverage (Python)

Viewing coverage

# Generate coverage report (all tests)
coverage run -m pytest

# HTML coverage report
coverage html

# Open HTML coverage report (macOS)
open htmlcov/index.html
# Linux (example)
xdg-open htmlcov/index.html
# Windows (PowerShell)
start htmlcov\index.html

LLM coverage input: JSON per package/module

coverage.py can emit JSON directly:

# For a web app package
coverage run -m pytest apps/web
coverage json -o apps/web/coverage/coverage-final.json

# For shared types package
coverage run -m pytest packages/shared_types
coverage json -o packages/shared_types/coverage/coverage-final.json

# For query package
coverage run -m pytest packages/query
coverage json -o packages/query/coverage/coverage-final.json

Quick sanity check (similar to the jq no-op):

cat apps/web/coverage/coverage-final.json | jq . > /dev/null
cat packages/shared_types/coverage/coverage-final.json | jq . > /dev/null
cat packages/query/coverage/coverage-final.json | jq . > /dev/null

Optionally merge coverage for a single LLM input

Preferred: use coverage combine and then export JSON:

# Run coverage separately and keep .coverage files, e.g.
# apps/web/.coverage
# packages/shared_types/.coverage
# packages/query/.coverage

coverage combine \
  apps/web \
  packages/shared_types \
  packages/query

coverage json -o coverage/coverage-final-merged.json

Or merge JSONs yourself (example with jq, similar to your original):

jq -s 'reduce .[] as $item ({}; . * $item)' \
  apps/web/coverage/coverage-final.json \
  packages/shared_types/coverage/coverage-final.json \
  packages/query/coverage/coverage-final.json \
  > coverage/coverage-final-merged.json

Coverage goals

  • Unit tests: aim for ≥ 80% (branches, functions, lines, statements)
  • Integration tests: aim for ≥ 70%
  • Overall target: 80%
  • Prioritize critical business logic; don’t chase 100% if it adds little value.
  • Even “types-first” or “schema-first” packages should meet the same thresholds when they contain runtime constructs (enums, helpers, validators) that show up in coverage.

What to test

✅ Do test:

  • Business logic and algorithms
  • Edge cases and error conditions
  • Public APIs and interfaces
  • Data transformations
  • Validation logic

❌ Don’t test:

  • Third-party libraries
  • Trivial getters/setters or dataclass boilerplate
  • Framework internals (Django/Flask/FastAPI internals, etc.)
  • Pure configuration files

Mocking (functions, modules, timers) in Python

Using unittest.mock (works with pytest and unittest).

Mocking functions

from unittest.mock import Mock

def test_mock_function():
    # Basic mock
    mock_fn = Mock()
    mock_fn.return_value = "mocked value"

    result = mock_fn("arg")

    mock_fn.assert_called_with("arg")
    mock_fn.assert_called_once()
    assert result == "mocked value"


def test_mock_with_implementation():
    mock_fn = Mock(side_effect=lambda x: x * 2)

    assert mock_fn(3) == 6
    mock_fn.assert_called_with(3)

If you use pytest-mock, you can also do:

def test_with_mocker(mocker):
    mock_fn = mocker.Mock(return_value="mocked")
    mock_fn("arg")
    mock_fn.assert_called_once_with("arg")

Mocking modules / functions in modules

from unittest.mock import patch

# api.py
# def fetch_user(user_id): [ELIDED]

@patch("myapp.api.fetch_user")
def test_fetch_user(mock_fetch_user):
    mock_fetch_user.return_value = {"id": 1, "name": "John"}

    from myapp.service import get_user  # imports inside test to avoid import-time patch issues

    user = get_user(1)

    mock_fetch_user.assert_called_once_with(1)
    assert user["name"] == "John"

Partial mock (keep most behavior, override one function):

from unittest.mock import patch

# utils.py
# def some_function(): [ELIDED]
# def other_function(): [ELIDED]

def test_partial_mock_utils():
    import myapp.utils as utils

    with patch.object(utils, "some_function") as mock_some_function:
        mock_some_function.return_value = "mocked"

        result = utils.some_function()
        assert result == "mocked"

        mock_some_function.assert_called_once()

Or with pytest-mock:

def test_partial_mock_utils(mocker):
    import myapp.utils as utils

    mock_some_function = mocker.patch.object(utils, "some_function", return_value="mocked")
    assert utils.some_function() == "mocked"
    mock_some_function.assert_called_once()

Mocking timers / time-dependent behavior

Python doesn’t have fake timers built in, but you can patch time APIs or use helper libraries.

Basic patch using pytest’s monkeypatch:

import time

def do_after_delay(callback, delay):
    time.sleep(delay)
    callback()

def test_do_after_delay(monkeypatch):
    calls = []

    def fake_sleep(seconds):
        # Skip real waiting, just record the call
        calls.append(seconds)

    monkeypatch.setattr(time, "sleep", fake_sleep)

    callback_called = []

    def callback():
        callback_called.append(True)

    do_after_delay(callback, 1.0)

    assert calls == [1.0]
    assert callback_called == [True]

Using freezegun (or similar) for time-based logic:

from freezegun import freeze_time
import datetime

def is_expired(now, expires_at):
    return now >= expires_at

def test_is_expired():
    with freeze_time("2025-01-01 10:00:00"):
        now = datetime.datetime.now()
        expires_at = datetime.datetime(2025, 1, 1, 9, 0, 0)
        assert is_expired(now, expires_at) is True

This gives you Python-native equivalents of your Vitest/bun workflow: coverage reports, JSON for tooling/LLMs, clear coverage goals, and structured mocking patterns for functions, modules, and time.

Here’s a direct Python/pytest equivalent.

# test_fetch_user.py
import pytest

# --- Code under test (examples) ---

import asyncio

async def fetch_user_async(user_id: int):
    await asyncio.sleep(0)  # simulate async work
    if user_id < 0:
        raise ValueError("User not found")
    return {"id": 1, "name": "John"}


def fetch_user_callback(user_id: int, callback):
    # simulate async work with asyncio
    async def _work():
        await asyncio.sleep(0)
        if user_id < 0:
            callback(ValueError("User not found"), None)
        else:
            callback(None, {"id": 1, "name": "John"})

    asyncio.create_task(_work())


# --- Async testing (Promises → async/await) ---

@pytest.mark.asyncio
async def test_should_resolve_with_user_data():
    user = await fetch_user_async(1)
    assert user == {"id": 1, "name": "John"}


@pytest.mark.asyncio
async def test_should_reject_with_error():
    with pytest.raises(ValueError, match="User not found"):
        await fetch_user_async(-1)


# --- Testing callbacks (Jest-style done) ---

@pytest.mark.asyncio
async def test_should_call_callback_with_result():
    future: asyncio.Future = asyncio.get_running_loop().create_future()

    def callback(error, user):
        if error:
            future.set_exception(error)
        else:
            future.set_result(user)

    fetch_user_callback(1, callback)

    user = await future
    assert user == {"id": 1, "name": "John"}

Notes:

  • Uses pytest plus pytest-asyncio for @pytest.mark.asyncio.
  • fetch_user_async mirrors the Promise-based version.
  • The callback test uses an asyncio.Future to emulate Jest’s done callback.

You’re right that “E2E” is usually used for UI flows, but for an API-centric service, “E2E” is often just “hit the real HTTP API in something close to a production environment (docker-compose, real DB, etc.).” UI is optional; the key is that you’re exercising the full stack across a network boundary.

Here’s how I’d adapt those Playwright/hydration ideas to FastAPI + Uvicorn + pytest for API/E2E tests.


Terminology: what’s what

For a FastAPI service, you’ll typically see:

  • Unit tests Call pure Python functions, maybe override dependencies. No HTTP.

  • Integration tests Use FastAPI’s TestClient or httpx.AsyncClient against the app object. May use real DB/test DB, but often still in-process (no Uvicorn, no docker).

  • API E2E tests

    • App runs as a real process (Uvicorn or gunicorn) – often via docker-compose.
    • Tests talk to it via HTTP (e.g., http://api:8000) using httpx/requests.
    • Real-ish backing services (DB, cache, broker) are present.

So yes, “E2E via API only” is a thing and very common in backend-heavy systems.


Core idea translated from Playwright: avoid sleeps, use explicit readiness markers

In UI tests you “wait for hydration marker, then assert”. In API tests you “wait for readiness/health marker, then assert.”

1. Prefer a health/readiness endpoint over time.sleep

Expose something like:

# app/main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/healthz")
def healthz():
    return {"status": "ok"}

Then in tests, instead of sleeping 10 seconds hoping the container is ready, poll /healthz until you get a 200/expected payload.

Example pytest helper:

# tests/utils.py
import time
import httpx

def wait_for_service(base_url: str, timeout: float = 30.0, interval: float = 0.5) -> None:
    deadline = time.time() + timeout
    last_exc = None

    while time.time() < deadline:
        try:
            resp = httpx.get(f"{base_url}/healthz", timeout=5.0)
            if resp.status_code == 200 and resp.json().get("status") == "ok":
                return
        except Exception as exc:  # connection refused, etc.
            last_exc = exc
        time.sleep(interval)

    raise TimeoutError(f"Service at {base_url} not ready (last error: {last_exc})")

This is the API analogue of “web-first assertions” instead of waitForTimeout.

2. pytest fixture for “app is ready”

If your app is started by docker-compose, you usually just need the base URL and a “ready” check:

# tests/conftest.py
import os
import pytest
from .utils import wait_for_service

@pytest.fixture(scope="session")
def api_base_url() -> str:
    # e.g. "http://localhost:8000" or docker-compose service host
    return os.getenv("API_BASE_URL", "http://localhost:8000")

@pytest.fixture(scope="session", autouse=True)
def wait_for_api(api_base_url: str):
    wait_for_service(api_base_url)
    yield  # tests run after this point

Now all tests can safely call the API without random sleeps.


Example “E2E API” test with eventual consistency

Say you have an endpoint that kicks off some async work (e.g., background task) and later makes results available:

# app/main.py (simplified)
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

_jobs = {}

class JobRequest(BaseModel):
    payload: str

@app.post("/jobs")
def create_job(req: JobRequest):
    job_id = "some-id"  # generated in real code
    _jobs[job_id] = {"status": "pending", "result": None}
    # enqueue background work here[ELIDED]
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
def get_job(job_id: str):
    return _jobs[job_id]

API-level E2E test that avoids sleep by polling with a bounded timeout:

# tests/e2e/test_jobs.py
import time
import httpx

def wait_for_job_completion(base_url: str, job_id: str, timeout: float = 30.0, interval: float = 0.5):
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = httpx.get(f"{base_url}/jobs/{job_id}", timeout=5.0)
        resp.raise_for_status()
        body = resp.json()

        if body.get("status") == "completed":
            return body

        time.sleep(interval)

    raise TimeoutError(f"Job {job_id} did not complete within {timeout} seconds")

def test_job_flow_end_to_end(api_base_url: str):
    # Create job
    create_resp = httpx.post(
        f"{api_base_url}/jobs",
        json={"payload": "test-data"},
        timeout=5.0,
    )
    create_resp.raise_for_status()
    job_id = create_resp.json()["job_id"]

    # Wait for the asynchronous processing to finish
    job_body = wait_for_job_completion(api_base_url, job_id)

    # Assert on final result
    assert job_body["status"] == "completed"
    assert job_body["result"] == "expected-result"

This mirrors the “wait for hydration marker, then assert on dynamic elements” idea, but for background work / eventual consistency at the API layer.


When to use TestClient vs real Uvicorn

  • Integration tests (fast, in-process):

    from fastapi.testclient import TestClient
    from app.main import app
    
    client = TestClient(app)
    
    def test_create_job_integration():
        resp = client.post("/jobs", json={"payload": "x"})
        assert resp.status_code == 200

    Good for most logic; no Uvicorn, no real network.

  • E2E tests (slower, but realistic):

    • App started separately (docker-compose or a pytest fixture starting Uvicorn; sketched after this list).
    • Tests use httpx against a URL.
    • Include health checks and bounded polling instead of global sleeps.
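
A sketch of the Uvicorn-in-a-fixture option (assumes app.main:app is the ASGI path, port 8000 is free, and the wait_for_service helper from earlier is importable):

import subprocess
import sys

import pytest

from .utils import wait_for_service  # polling helper shown earlier


@pytest.fixture(scope="session")
def live_server():
    base_url = "http://localhost:8000"
    proc = subprocess.Popen(
        [sys.executable, "-m", "uvicorn", "app.main:app", "--port", "8000"]
    )
    try:
        wait_for_service(base_url)
        yield base_url
    finally:
        proc.terminate()
        proc.wait(timeout=10)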

Summary

  • API-only E2E is valid: if your system’s primary interface is HTTP, “E2E” can just be “through the real API in a realistic environment.”

  • Avoid time.sleep() and guessing when the app or async work is done.

  • Use:

    • A health/readiness endpoint as your “hydration marker”.
    • Polling helpers with timeouts for async flows/background jobs.
    • httpx + pytest fixtures hitting a real Uvicorn process or docker-compose stack for true E2E.

  1. Keep tests fast
  • Avoid unnecessary async/event-loop usage
  • Mock external dependencies (DB, HTTP, filesystem, etc.)
  • Don’t test implementation details (private methods, internal calls); test observable behavior
# example: mocking an HTTP client
from unittest.mock import Mock

def test_fetch_user_uses_client():
    client = Mock()
    client.get.return_value = {"email": "john@example.com"}

    user = fetch_user(client, user_id=1)

    client.get.assert_called_once_with("/users/1")
    assert user.email == "john@example.com"

  2. Make tests independent
  • Each test should be runnable alone
  • Do not rely on test ordering
  • Clean up state in fixtures or teardown
import pytest
import tempfile
import shutil
from myapp import create_user_db

@pytest.fixture
def temp_db():
    tmp_dir = tempfile.mkdtemp()
    db = create_user_db(tmp_dir)
    yield db
    shutil.rmtree(tmp_dir)  # cleanup

def test_create_user(temp_db):
    user = temp_db.create_user(email="a@example.com")
    assert user.id is not None

def test_delete_user(temp_db):
    user = temp_db.create_user(email="b@example.com")
    temp_db.delete_user(user.id)
    assert temp_db.get_user(user.id) is None

  3. Use descriptive assertions

Good:

def test_user_email_is_set():
    user = User(email="john@example.com")
    assert user.email == "john@example.com"

Bad:

def test_user_email_is_truthy():
    user = User(email="john@example.com")
    assert user.email  # too vague

Add custom messages if it helps:

assert user.email == "john@example.com", "User email should match input value"

  4. Test one thing at a time

Good:

def test_valid_email_format():
    assert is_valid_email("john@example.com") is True
    assert is_valid_email("invalid-email") is False

def test_email_length_limit():
    long_email = "a" * 250 + "@example.com"
    assert is_valid_email(long_email) is False

Bad:

def test_email_validation():
    # format, length, domain, blacklist, etc. all in one
    assert is_valid_email("john@example.com") is True
    # several unrelated concerns mixed together

Each test should have a clear, narrow responsibility.


  5. Avoid test duplication (use fixtures / setup)

Using pytest fixtures instead of repeating setup:

import pytest
from myapp import UserService

@pytest.fixture
def user_service():
    return UserService()

def test_create_user(user_service):
    user = user_service.create_user(email="a@example.com")
    assert user.id is not None

def test_delete_user(user_service):
    user = user_service.create_user(email="b@example.com")
    user_service.delete_user(user.id)
    assert user_service.get_user(user.id) is None

If using unittest style:

import unittest
from myapp import UserService

class TestUserService(unittest.TestCase):
    def setUp(self):
        self.service = UserService()

    def test_create_user(self):
        user = self.service.create_user(email="a@example.com")
        self.assertIsNotNone(user.id)

    def test_delete_user(self):
        user = self.service.create_user(email="b@example.com")
        self.service.delete_user(user.id)
        self.assertIsNone(self.service.get_user(user.id))

Debugging tests (Python / pytest)

Using print/logs

def test_should_process_data():
    input_data = [ELIDED]
    expected = [ELIDED]

    data = process_data(input_data)
    print("Processed data:", data)  # or use logging
    assert data == expected

Using debugger

def test_should_process_data():
    input_data = [ELIDED]
    expected = [ELIDED]

    breakpoint()  # or: import pdb; pdb.set_trace()
    data = process_data(input_data)
    assert data == expected

Running a single test

Assuming tests live under tests/ with tests/unit, tests/component, tests/integration:

# Run tests matching a pattern in name or -k expression
pytest -k "should_process_data"

# Run a specific test file
pytest tests/unit/test_user_service.py

# Run a specific test function in a file
pytest tests/unit/test_user_service.py::test_should_process_data

# Run a specific class method (if you use test classes)
pytest tests/unit/test_user_service.py::TestUserService::test_should_process_data

Common patterns

Testing error handling

import pytest

def test_should_throw_error_for_invalid_input():
    with pytest.raises(ValueError, match="Invalid email"):
        validate_email("invalid")

Testing “type guards” / predicates

def test_is_user_returns_true_for_valid_user_object():
    obj = {"id": 1, "name": "John"}
    assert is_user(obj) is True

def test_is_user_returns_false_for_invalid_object():
    obj = {"foo": "bar"}
    assert is_user(obj) is False

Testing transformations

def test_should_transform_user_data_correctly():
    input_data = {"first_name": "John", "last_name": "Doe"}
    output = transform_user(input_data)
    assert output == {"full_name": "John Doe"}

Resources

  • pytest documentation
  • Coverage and pytest-cov documentation
  • General Python testing best practices (fixture usage, parametrization, test naming, etc.)

Test placement strategy (Python)

  • All tests are under tests/ at the project root.

  • Unit tests:

    • tests/unit/

    • Mirror the application package/module structure where it helps:

      • src/myapp/user.py → tests/unit/test_user.py
      • src/myapp/features/foo.py → tests/unit/features/test_foo.py
  • Component tests:

    • tests/component/

    • Grouped by component or feature boundary:

      • tests/component/api/
      • tests/component/services/
  • Integration tests:

    • tests/integration/

    • Optionally organized by category (similar idea to client/, server/, middleware/):

      • tests/integration/client/
      • tests/integration/server/
      • tests/integration/middleware/
    • Shared fixtures:

      • tests/fixtures/ (DB setup, external service mocks, common data builders, etc.)
  • E2E tests (if you have them):

    • tests/e2e/
  • Coverage expectations:

    • Include all runtime packages in coverage, e.g. shared_types (or equivalent), and enforce 80%+ coverage via pytest-cov/coverage configuration.