4. Separate Analyzer Repositories

Date: 2025-11-04
Status: Accepted
Deciders: Core Team
Related: ADR-001 (Microservices Architecture), ADR-005 (spot-sdk Package)

Context

Analyzers are specialized services that use different technologies:

  • spot-analyzer-nlp: Uses DistilBERT and NER models (Python ML libraries)
  • spot-analyzer-llm: Uses Ollama for LLM-based analysis (large model downloads)
  • spot-analyzer-context: Rule-based analysis (lightweight logic)

Each analyzer has:

  • Different dependencies and ML frameworks
  • Different resource requirements (CPU, memory, GPU)
  • Different development teams and expertise
  • Different release cycles
  • Different testing requirements

We need to decide between a monorepo and separate repositories for the analyzers.

Decision

Use separate Git repositories for each analyzer type:

  • spot-platform/ - Core orchestration platform
  • spot-sdk/ - Shared contracts and interfaces
  • spot-analyzer-nlp/ - NLP-based analyzer
  • spot-analyzer-llm/ - LLM-based analyzer
  • spot-analyzer-context/ - Rule-based analyzer
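
As an illustration of what "shared contracts and interfaces" in spot-sdk could look like, here is a minimal sketch. The names `Analyzer`, `AnalysisResult`, and `KeywordContextAnalyzer` are hypothetical and are not taken from the actual spot-sdk. The sketch only shows that an analyzer repository can depend on a small shared interface rather than on another analyzer's code.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol, runtime_checkable


# Hypothetical contract types; the real spot-sdk API may look different.
@dataclass(frozen=True)
class AnalysisResult:
    analyzer: str               # which analyzer produced the result
    findings: tuple[str, ...]   # analyzer-specific findings, kept simple here


@runtime_checkable
class Analyzer(Protocol):
    """Interface that every analyzer repository would implement."""

    def analyze(self, text: str) -> AnalysisResult: ...


# Would live in spot-analyzer-context: rule-based, no ML dependencies.
class KeywordContextAnalyzer:
    def __init__(self, keywords: Iterable[str]) -> None:
        self._keywords = tuple(k.lower() for k in keywords)

    def analyze(self, text: str) -> AnalysisResult:
        lowered = text.lower()
        # Keep only the configured keywords that occur in the text.
        hits = tuple(k for k in self._keywords if k in lowered)
        return AnalysisResult(analyzer="context", findings=hits)
```

Because `Analyzer` is a structural `Protocol`, an analyzer satisfies it by providing a matching `analyze` method and does not need to inherit from an SDK base class. That keeps the coupling between repositories limited to the published contract.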

Rationale

  1. Clear Ownership: Each repository has a clear owner team
  2. Independent Releases: Analyzers can be released independently
  3. Focused Dependencies: Each analyzer only includes its required dependencies
  4. Repository Size: Keeps repos small and fast to clone
  5. CI/CD Simplicity: Each analyzer has its own pipeline
  6. Team Autonomy: Teams can work without affecting other analyzers
  7. Technology Isolation: NLP team doesn't need to understand LLM code

Consequences

Positive

  • Smaller repositories are faster to clone and easier to navigate
  • Independent release cycles and version numbers
  • Focused CI/CD pipelines (only test what changed)
  • Clear ownership and responsibility boundaries
  • Easier onboarding (new developers only learn relevant repos)
  • Can use different CI tools per analyzer if needed
  • Dependency conflicts isolated per analyzer

Negative

  • Need to maintain multiple repositories
  • Cross-repo changes require coordination
  • Harder to make atomic changes across analyzers
  • Duplicate CI/CD configuration across repos
  • Need version management across repos
  • Code search within a single repository no longer covers all analyzers
  • Need process for keeping contracts in sync

Alternatives Considered

Alternative 1: Monorepo

  • Pros:
      • Single place for all code
      • Atomic commits across services
      • Easier refactoring across services
      • Single CI/CD pipeline
      • Repo-wide search and refactoring tools
  • Cons:
      • Large repository slow to clone
      • All dependencies in one place (huge node_modules/venv)
      • Changes to one analyzer trigger CI for all
      • Merge conflicts more common
      • Harder to enforce ownership
      • Mixed ML frameworks in single repo
  • Why rejected: The analyzers differ too much for the monorepo's benefits to outweigh its costs

Alternative 2: Monorepo with build tools (Nx, Turborepo)

  • Pros:
      • Monorepo benefits with selective builds
      • Incremental testing
      • Dependency graph management
  • Cons:
      • Requires additional tooling and learning
      • Complexity overhead for small team
      • Still large repository
      • Build tool lock-in
  • Why rejected: Over-engineered for our team size and structure

Implementation Notes

Repository structure:

GitHub/GitLab Organization: spot-platform
├── spot-platform/          (main platform)
├── spot-sdk/               (shared contracts)
├── spot-analyzer-nlp/      (NLP analyzer)
├── spot-analyzer-llm/      (LLM analyzer)
└── spot-analyzer-context/  (context analyzer)

Coordination mechanisms:

  • Contracts: spot-sdk package versioned and published
  • Communication: Changes to contracts discussed in main platform repo issues
  • Documentation: Central docs in spot-platform repo
  • Testing: Integration tests in spot-platform verify analyzer compatibility
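
One way the integration tests in spot-platform could verify analyzer compatibility is to compare the spot-sdk version each analyzer pins against the version the platform uses. The sketch below assumes that spot-sdk follows a plain MAJOR.MINOR.PATCH scheme and that a matching major version means compatible contracts. This ADR specifies neither, and the function names are hypothetical.

```python
from typing import Mapping


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release suffixes)."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def is_compatible(platform_sdk: str, analyzer_sdk: str) -> bool:
    """Assumed rule: versions are compatible when the major numbers match."""
    return parse_version(platform_sdk)[0] == parse_version(analyzer_sdk)[0]


def incompatible_analyzers(
    platform_sdk: str, analyzer_pins: Mapping[str, str]
) -> list[str]:
    """Return the analyzers whose pinned spot-sdk version fails the check."""
    return sorted(
        name
        for name, pinned in analyzer_pins.items()
        if not is_compatible(platform_sdk, pinned)
    )


# Made-up version numbers, for illustration only.
pins = {
    "spot-analyzer-nlp": "2.1.0",
    "spot-analyzer-llm": "1.4.2",
    "spot-analyzer-context": "2.3.0",
}
print(incompatible_analyzers("2.3.0", pins))
```

A check like this turns "keeping contracts in sync" into a failing test instead of a manual review step, though the real compatibility rule would need to be agreed on alongside the spot-sdk versioning policy.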

Analyzer repository template:

spot-analyzer-xxx/
├── src/              # Analyzer implementation
├── tests/            # Unit and integration tests
├── Dockerfile        # Container definition
├── pyproject.toml    # Dependencies (references spot-sdk)
└── README.md         # Analyzer-specific docs
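
Since the template is duplicated across repositories instead of being enforced by a shared root, a small check can help keep analyzer repos aligned with it. This is a sketch under the assumption that every entry in the template is mandatory. `TEMPLATE_ENTRIES` and `missing_template_entries` are hypothetical names and not part of any existing SPOT tooling.

```python
from pathlib import Path

# Entries from the template above; True marks a directory, False a file.
TEMPLATE_ENTRIES = {
    "src": True,
    "tests": True,
    "Dockerfile": False,
    "pyproject.toml": False,
    "README.md": False,
}


def missing_template_entries(repo_root: Path) -> list[str]:
    """List template entries that are absent or of the wrong kind."""
    missing = []
    for name, is_dir in TEMPLATE_ENTRIES.items():
        path = repo_root / name
        present = path.is_dir() if is_dir else path.is_file()
        if not present:
            missing.append(name)
    return missing
```

Running `missing_template_entries(Path("spot-analyzer-nlp"))` in each analyzer's pipeline would report any drift from the template without requiring a shared CI configuration.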

References

  • Monorepo vs Polyrepo
  • SPOT Analyzer Development Guide: spot-sdk/docs/ANALYZER_DEVELOPMENT.md
  • Repository structure example: spot-analyzer-nlp/