
3. Pydantic for Data Validation

Date: 2025-11-04
Status: Accepted
Deciders: Core Team
Related: ADR-005 (spot-sdk Package)

Context

In a microservices architecture, services exchange data through APIs and message queues. We need:

  • Strong data validation at service boundaries
  • Clear data schemas that serve as contracts
  • Automatic API documentation
  • Type safety in Python code
  • Serialization/deserialization of complex objects
  • JSON Schema generation for cross-language interop

Requirements:

  • Runtime validation of incoming data
  • IDE autocomplete and type checking support
  • Integration with FastAPI for API docs
  • Performance (high throughput validation)
  • Clear validation error messages
  • Support for complex nested structures
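
As a rough illustration of the runtime-validation, error-message, and nested-structure requirements, the sketch below validates a nested payload and collects the location of each failure. The Attachment and Message models and the validate_payload helper are hypothetical names chosen for this example, not part of any existing service contract.

```python
from pydantic import BaseModel, ValidationError


class Attachment(BaseModel):
    # Hypothetical nested model, used only for illustration.
    filename: str
    size_bytes: int


class Message(BaseModel):
    # Hypothetical top-level model containing a list of nested models.
    id: str
    attachments: list[Attachment]


def validate_payload(payload: dict) -> list[tuple]:
    """Return the location of each validation error (empty list if valid)."""
    try:
        Message.model_validate(payload)
    except ValidationError as exc:
        # Each error dict carries a "loc" tuple that points into the payload.
        return [err["loc"] for err in exc.errors()]
    return []


bad_payload = {
    "id": "msg_1",
    "attachments": [{"filename": "a.txt", "size_bytes": "not a number"}],
}
error_locations = validate_payload(bad_payload)
```

Here error_locations points at the offending field inside the nested list, which is the kind of detail a client needs to correct a request.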

Decision

Use Pydantic (v2) for all data validation and schema definitions across the platform.

Rationale

  1. Type Safety: Pydantic models provide runtime validation + static type checking
  2. FastAPI Integration: Native integration with FastAPI for automatic API docs
  3. Performance: Pydantic v2's validation core (pydantic-core) is written in Rust, which makes validation fast
  4. Developer Experience: Excellent IDE support, clear error messages
  5. JSON Schema: Automatic generation of JSON schemas for OpenAPI specs
  6. Serialization: Built-in JSON serialization with proper type handling
  7. Validation: Rich validation rules (regex, ranges, custom validators)
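
The validation and serialization points above can be sketched as follows. This is a minimal example under assumed names: the QueueJob model, its fields, and the specific constraints are hypothetical and only show how a regex pattern, a numeric range, and a custom validator might be combined.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class QueueJob(BaseModel):
    # Hypothetical message-queue payload, used only for illustration.
    job_id: str = Field(pattern=r"^job_[0-9]+$")  # regex constraint
    priority: int = Field(ge=0, le=9)             # range constraint
    tags: list[str] = Field(default_factory=list)

    @field_validator("tags")
    @classmethod
    def tags_must_be_lowercase(cls, value: list[str]) -> list[str]:
        # Custom rule: reject any tag that contains uppercase characters.
        if any(tag != tag.lower() for tag in value):
            raise ValueError("tags must be lowercase")
        return value


# A valid job serializes to JSON and can be parsed back into an equal model.
job = QueueJob(job_id="job_42", priority=3, tags=["billing"])
job_json = job.model_dump_json()

# An invalid job reports every failing field in a single ValidationError.
try:
    QueueJob(job_id="42", priority=12, tags=["Billing"])
    error_count = 0
except ValidationError as exc:
    error_count = exc.error_count()
```

All three failing fields are reported together rather than one at a time, which keeps error responses useful for API clients.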

Consequences

Positive

  • Strong contracts between services prevent invalid data
  • Automatic API documentation via OpenAPI/Swagger
  • Early detection of data issues at boundaries
  • Excellent IDE autocomplete and type hints
  • Clear, actionable error messages for clients
  • Type-safe code reduces bugs
  • JSON Schema export for language-agnostic contracts
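
To illustrate the JSON Schema export mentioned above, the sketch below generates a schema from a model with model_json_schema(). The Notification model is a hypothetical example; how the exported schema is published to non-Python consumers is left open here.

```python
import json

from pydantic import BaseModel, Field


class Notification(BaseModel):
    # Hypothetical contract model, used only for illustration.
    id: str = Field(description="Unique notification identifier")
    channel: str
    retries: int = 0


# model_json_schema() returns a plain dict that follows JSON Schema.
schema = Notification.model_json_schema()

# The dict can be written out as JSON for consumers in other languages.
schema_text = json.dumps(schema, indent=2)
```

Fields without defaults appear under "required", and field descriptions are carried into the schema.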

Negative

  • Learning curve for team members unfamiliar with Pydantic
  • Validation overhead (though minimal with v2)
  • Need to maintain model definitions alongside code
  • Pydantic-specific patterns may not translate to other languages
  • Breaking changes in Pydantic updates require migration

Alternatives Considered

Alternative 1: Python dataclasses

  • Pros:
      • Built into Python standard library
      • Simple and lightweight
      • Good IDE support
  • Cons:
      • No runtime validation
      • No built-in JSON serialization/deserialization
      • No JSON Schema generation
      • No FastAPI integration
      • Manual validation code needed
  • Why rejected: Lacks the validation and serialization we need
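
The "no runtime validation" point can be seen in a short standard-library sketch. The EmailRecord class is a hypothetical example; the type annotations are visible to static checkers but are not enforced when the object is constructed.

```python
from dataclasses import dataclass


@dataclass
class EmailRecord:
    # Hypothetical record, used only for illustration.
    id: str
    subject: str


# A static type checker would flag these arguments, but at runtime the
# dataclass stores them without any validation or coercion.
record = EmailRecord(id=123, subject=None)
accepted_without_error = isinstance(record.id, int)
```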

Alternative 2: marshmallow

  • Pros:
      • Mature library with large community
      • Flexible validation and serialization
      • Good documentation
  • Cons:
      • Slower than Pydantic v2
      • Looser FastAPI integration
      • Separate schema and model classes
      • Verbose syntax
      • Weaker IDE support for type hints
  • Why rejected: Pydantic offers better performance and FastAPI integration

Alternative 3: JSON Schema + jsonschema library

  • Pros:
      • Language-agnostic schemas
      • Standard format for API contracts
      • Widely supported
  • Cons:
      • Schemas separate from code
      • No Python type hints
      • Manual serialization code
      • Verbose schema definitions
      • Poor IDE support
  • Why rejected: Doesn't provide the developer experience we want

Implementation Notes

All service contracts use Pydantic models:

from pydantic import BaseModel, ConfigDict, EmailStr, Field

class Email(BaseModel):
    # Pydantic v2 configuration: ConfigDict replaces the v1 inner `class Config`.
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {
                    "id": "email_123",
                    "sender": "user@example.com",
                    "recipients": ["recipient@example.com"],
                    "subject": "Test Email",
                    "body": "Email body text",
                }
            ]
        }
    )

    id: str = Field(..., description="Unique email identifier")
    # EmailStr requires the optional email-validator package.
    sender: EmailStr
    recipients: list[EmailStr]
    subject: str
    body: str

Key practices:

  • Use Pydantic v2 syntax (Field, ConfigDict)
  • Define clear field descriptions for API docs
  • Provide example data for documentation
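  • Use appropriate validators (EmailStr, Annotated constraints such as StringConstraints, etc.); constr still works in v2 but is discouraged in favor of Annotated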
  • Keep models in spot-sdk package for sharing

References