4. Separate Analyzer Repositories
Date: 2025-11-04
Status: Accepted
Deciders: Core Team
Related: ADR-001 (Microservices Architecture), ADR-005 (spot-sdk Package)
Context
Analyzers are specialized services that use different technologies:
- spot-analyzer-nlp: Uses DistilBERT and NER models (Python ML libraries)
- spot-analyzer-llm: Uses Ollama for LLM-based analysis (large model downloads)
- spot-analyzer-context: Rule-based analysis (lightweight logic)
Each analyzer has:
- Different dependencies and ML frameworks
- Different resource requirements (CPU, memory, GPU)
- Different development teams and expertise
- Different release cycles
- Different testing requirements
We need to decide: monorepo vs separate repositories for analyzers.
Decision
Use a separate Git repository for each analyzer, alongside the platform and SDK repositories:
- spot-platform/: Core orchestration platform
- spot-sdk/: Shared contracts and interfaces
- spot-analyzer-nlp/: NLP-based analyzer
- spot-analyzer-llm/: LLM-based analyzer
- spot-analyzer-context/: Rule-based analyzer
Rationale
- Clear Ownership: Each repository has a clear owner team
- Independent Releases: Analyzers can be released independently
- Focused Dependencies: Each analyzer only includes its required dependencies
- Repository Size: Keeps repos small and fast to clone
- CI/CD Simplicity: Each analyzer has its own pipeline
- Team Autonomy: Teams can work without affecting other analyzers
- Technology Isolation: The NLP team doesn't need to understand the LLM code, and vice versa
Consequences
Positive
- Smaller repositories are faster to clone and easier to navigate
- Independent release cycles and version numbers
- Focused CI/CD pipelines (only test what changed)
- Clear ownership and responsibility boundaries
- Easier onboarding (new developers only learn relevant repos)
- Can use different CI tools per analyzer if needed
- Dependency conflicts isolated per analyzer
Negative
- Need to maintain multiple repositories
- Cross-repo changes require coordination
- Harder to make atomic changes across analyzers
- Duplicate CI/CD configuration across repos
- Need version management across repos
- No single-repository code search across all analyzers
- Need process for keeping contracts in sync
Alternatives Considered
Alternative 1: Monorepo
- Pros:
    - Single place for all code
    - Atomic commits across services
    - Easier refactoring across services
    - Single CI/CD pipeline
    - Repo-wide search and refactoring tools
- Cons:
    - Large repository that is slow to clone
    - All dependencies in one place (huge node_modules/venv)
    - Changes to one analyzer trigger CI for all
    - Merge conflicts more common
    - Harder to enforce ownership
    - Mixed ML frameworks in a single repo
- Why rejected: The analyzers differ too much in dependencies, resources, and release cycles; the monorepo's benefits don't outweigh its costs
Alternative 2: Monorepo with build tools (Nx, Turborepo)
- Pros:
    - Monorepo benefits with selective builds
    - Incremental testing
    - Dependency graph management
- Cons:
    - Requires additional tooling and learning
    - Complexity overhead for a small team
    - Still a large repository
    - Build tool lock-in
- Why rejected: Over-engineered for our team size and structure
Implementation Notes
Repository structure:
GitHub/GitLab Organization: spot-platform
├── spot-platform/ (main platform)
├── spot-sdk/ (shared contracts)
├── spot-analyzer-nlp/ (NLP analyzer)
├── spot-analyzer-llm/ (LLM analyzer)
└── spot-analyzer-context/ (context analyzer)
Coordination mechanisms:
- Contracts: spot-sdk package versioned and published
- Communication: Changes to contracts discussed in main platform repo issues
- Documentation: Central docs in spot-platform repo
- Testing: Integration tests in spot-platform verify analyzer compatibility
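As a rough illustration of the contract-sync concern above, the sketch below flags analyzers whose declared spot-sdk version does not share a major version with the one the platform uses. It assumes spot-sdk follows semantic versioning (a breaking contract change bumps the major version), which this ADR does not state. The function names and version numbers are hypothetical, not part of the actual spot-platform test suite.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A plain MAJOR.MINOR.PATCH version (no pre-release or build tags)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_compatible(platform_sdk: Version, analyzer_sdk: Version) -> bool:
    # Assumption: contracts only break when the major version changes.
    return platform_sdk.major == analyzer_sdk.major


def incompatible_analyzers(platform_sdk: str, analyzer_sdks: dict[str, str]) -> list[str]:
    """Return the names of analyzers pinned to an incompatible spot-sdk version."""
    platform = Version.parse(platform_sdk)
    return sorted(
        name
        for name, version in analyzer_sdks.items()
        if not is_compatible(platform, Version.parse(version))
    )


if __name__ == "__main__":
    # Hypothetical version pins, for illustration only.
    pins = {
        "spot-analyzer-nlp": "2.1.4",
        "spot-analyzer-llm": "1.9.0",
        "spot-analyzer-context": "2.3.0",
    }
    print(incompatible_analyzers("2.3.0", pins))
```

A check along these lines could run as part of the integration tests in spot-platform, so that a contract mismatch surfaces before an analyzer is deployed.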
Analyzer repository template:
spot-analyzer-xxx/
├── src/ # Analyzer implementation
├── tests/ # Unit and integration tests
├── Dockerfile # Container definition
├── pyproject.toml # Dependencies (references spot-sdk)
└── README.md # Analyzer-specific docs
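To make the role of the shared contract more concrete, here is a minimal sketch of how code under src/ might implement an interface published by spot-sdk. The `Analyzer` protocol, the `Finding` type, and the keyword rules are all hypothetical stand-ins defined inline; the real spot-sdk contract is described in its own documentation and may look quite different.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Finding:
    """Hypothetical result type; not the actual spot-sdk schema."""
    analyzer: str
    label: str
    score: float


class Analyzer(Protocol):
    """Hypothetical stand-in for an interface that spot-sdk could publish."""
    name: str

    def analyze(self, text: str) -> list[Finding]: ...


class KeywordContextAnalyzer:
    """Toy rule-based analyzer, loosely in the spirit of spot-analyzer-context."""
    name = "spot-analyzer-context"

    def __init__(self, keywords: dict[str, str]) -> None:
        # Map each lowercase keyword to the label emitted when it appears.
        self._keywords = {key.lower(): label for key, label in keywords.items()}

    def analyze(self, text: str) -> list[Finding]:
        lowered = text.lower()
        return [
            Finding(self.name, label, 1.0)
            for keyword, label in self._keywords.items()
            if keyword in lowered
        ]


def run_all(analyzers: list[Analyzer], text: str) -> list[Finding]:
    """Call every analyzer through the shared interface and merge the results."""
    findings: list[Finding] = []
    for analyzer in analyzers:
        findings.extend(analyzer.analyze(text))
    return findings
```

Because the caller depends only on the shared interface, each analyzer repository can choose its own libraries and release schedule as long as it keeps satisfying the contract.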
References
- Monorepo vs Polyrepo
- SPOT Analyzer Development Guide: spot-sdk/docs/ANALYZER_DEVELOPMENT.md
- Repository structure example: spot-analyzer-nlp/