Knowledge Store

The SPOT Knowledge Store is the platform's RAG (retrieval-augmented generation) layer. Context providers deposit tagged company documents; analyzers fetch them on demand by semantic similarity plus a tag expression. Neither side knows the other exists; they share only the tag vocabulary.

Two phases

Ingestion (out-of-band, scheduled): the api-gateway runs a cron-style scheduler that calls each enabled context provider's POST /internal/sync on a cadence declared in spot.yaml. The provider reads from its source system and bulk-upserts KnowledgeDocuments to the store.

Consumption (per-email, on-demand): an analyzer calls KnowledgeClient.for_analysis(email).fetch(...) whenever it needs context during analysis. The store embeds the query, filters by tag expression, and returns the top-K documents by cosine similarity.

Data model

Every document has the same shape:

class KnowledgeDocument(BaseModel):
    id: str                    # stable unique id
    content: str               # text payload an LLM or analyzer reads
    tags: list[str] = []       # AND/OR-filterable at query time
    metadata: dict = {}        # structured fields (email, title, url, ...)
    source: str = ""           # provider that produced it (audit)
    updated_at: datetime | None = None
    expires_at: datetime | None = None   # optional TTL, cleaned up hourly
    score: float | None = None # populated on query results, 0.0 – 1.0

tags is the single categorisation axis; there is no type field. Use the shared-vocabulary constants from spot_sdk.knowledge_tags (EMPLOYEE, WIKI_PAGE, POLICY, EXECUTIVE, FINANCE, ...); custom tags like acme:jira_issue are fine.

Tag expressions

Queries filter on a tiny mini-language:

Expression                      Semantics
employee                        has tag employee
employee+executive              has BOTH
wiki_page|policy                has EITHER
employee+executive|director     (employee AND executive) OR director
"" or None                      no tag filter

+ (AND) binds tighter than | (OR). No parentheses in v1. Tokens must match [a-z0-9][a-z0-9_\-:]*; anything else is rejected, so the expression is always safe to splice into SQL.

Writing an ingestion provider

import os

from spot_sdk.knowledge import KnowledgeClient, KnowledgeDocument, chunk_text
from spot_sdk.knowledge_tags import KnowledgeTag

kb = KnowledgeClient(
    url=os.environ["SPOT_KNOWLEDGE_URL"],        # injected by installer
    api_key=os.environ["SPOT_INTERNAL_API_KEY"], # likewise
)

@app.post("/internal/sync")
async def sync() -> dict[str, int]:
    employees = fetch_from_ldap()
    await kb.bulk_upsert([
        KnowledgeDocument(
            id=f"employee:{e.email}",
            content=f"{e.name}, {e.title}, {e.department}. Email: {e.email}.",
            tags=[
                KnowledgeTag.EMPLOYEE,
                *([KnowledgeTag.EXECUTIVE] if e.is_exec else []),
                e.department.lower(),
            ],
            metadata={"email": e.email, "title": e.title},
            source="provider-employee-dir",
        )
        for e in employees
    ])
    return {"upserted": len(employees)}

For long wikis or policy docs, split into chunks with spot_sdk.knowledge.chunk_text(text, max_chars=2000, overlap=200) and upsert each chunk with metadata["parent_id"] pointing at the source doc; embeddings are better per-chunk and retrieval is finer-grained.
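chunk_text's real implementation lives in the SDK; a sliding-window sketch with the same max_chars/overlap semantics (an assumption about its behavior, not a copy of it):

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Sliding-window chunker: chunks are at most max_chars long and each
    chunk repeats the final `overlap` characters of the previous one."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    step = max_chars - overlap
    chunks: list[str] = []
    for start in range(0, max(len(text), 1), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.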

Querying from an analyzer

from spot_sdk.knowledge import KnowledgeClient

@app.post("/internal/analyze")
async def analyze(email: Email) -> AnalysisResult:
    kb = KnowledgeClient.for_analysis(email)

    # "Does this email claim to be from an executive?"
    hits = await kb.fetch(
        tags="employee+executive|director",
        text=email.headers.sender,
        top_k=1,
    )
    if hits and hits[0].metadata.get("email") != email.headers.sender:
        return AnalysisResult(is_phishing=True, confidence=0.9, ...)
    ...

Use KnowledgeClient.for_analysis(email); it reads the URL from SPOT_KNOWLEDGE_URL and picks up the stage-level retrieval_limits that the orchestrator passes in on email.retrieval_limits.

Every retrieved document can be folded into an AnalysisIndicator's evidence so the analyst UI can explain why a verdict was reached.
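The real AnalysisIndicator shape comes from the SDK; the folding step can be sketched with a stand-in evidence type (Evidence and to_evidence here are hypothetical names for illustration):

```python
from dataclasses import dataclass

# Stand-in for the SDK's evidence shape; the real AnalysisIndicator
# lives in spot_sdk and may differ.
@dataclass
class Evidence:
    doc_id: str
    score: float
    excerpt: str

def to_evidence(hits) -> list[Evidence]:
    """Fold query hits (KnowledgeDocuments with `score` populated) into
    evidence entries the analyst UI can display."""
    return [
        Evidence(doc_id=h.id, score=h.score or 0.0, excerpt=h.content[:200])
        for h in hits
    ]
```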

Retrieval limits (operator policy)

Workflow stages can declare caps in spot.yaml:

stages:
  - name: enrichment
    retrieval_limits:
      max_top_k: 10          # cap top_k requested by any analyzer here
      min_score_floor: 0.3   # raise the floor even if analyzer asked lower
    analyzers:
      - id: analyzer-llm
      - id: analyzer-rules

The orchestrator attaches them to the analyzer-call payload and the SDK enforces them transparently; limits only cap requests, they never raise what the analyzer asked for.

Provider ingestion schedule

Declare it on the provider entry:

plugins:
  context_providers:
    employee-dir:
      enabled: true
      url: http://provider-employee-dir:8000
      sync_schedule: "0 */6 * * *"   # every 6 hours
      sync_timeout_ms: 120000

Supported cron syntax: fixed values, *, comma lists (0,15,30,45), ranges (3-6), steps (*/10), and the @hourly / @daily / @weekly / @monthly aliases. Absent or blank means "no scheduled sync; trigger manually only".
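Matching one cron field against a value covers most of that syntax. An illustrative sketch (not the scheduler's actual code; aliases like @hourly would be expanded to five-field form before this step):

```python
def cron_field_matches(spec: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field (e.g. '*/10', '0,15,30,45', '3-6') against a
    value, where [lo, hi] is the field's legal range (0-59 for minutes)."""
    for part in spec.split(","):
        base, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(base)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False
```

A full schedule match ANDs this check across the five fields (minute, hour, day-of-month, month, day-of-week).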

Manual triggers (admin only):

POST /api/v1/plugins/context_provider/{id}/sync
GET  /api/v1/plugins/context_provider/{id}/sync   # last-run state

The web dashboard surfaces both on the provider's detail page, with a "Sync now" button.

Stats

GET /api/v1/knowledge/stats returns:

{
  "total": 1234,
  "per_tag": {"employee": 500, "wiki_page": 700, "policy": 34}
}

Embedding backend (Ollama)

Embeddings are computed by Ollama (configurable via the Embedder protocol if you need a different backend). The knowledge service expects it at OLLAMA_URL (default http://ollama:11434) with the model named in EMBEDDING_MODEL (default bge-m3).
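A sketch of what an Embedder implementation against Ollama's /api/embeddings endpoint could look like (the Protocol shown here is an assumed shape, not the service's actual definition):

```python
from typing import Protocol

class Embedder(Protocol):
    """Assumed shape of the pluggable embedding backend."""
    async def embed(self, text: str) -> list[float]: ...

class OllamaEmbedder:
    """Embedder backed by Ollama's /api/embeddings endpoint."""

    def __init__(self, url: str, model: str):
        self.url = url.rstrip("/")
        self.model = model

    async def embed(self, text: str) -> list[float]:
        import httpx  # lazy import so the class is constructible without httpx
        async with httpx.AsyncClient() as client:
            r = await client.post(
                f"{self.url}/api/embeddings",
                json={"model": self.model, "prompt": text},
            )
            r.raise_for_status()
            return r.json()["embedding"]
```

Any object satisfying the Protocol could be swapped in for a different backend.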

Operators have two options:

  1. External Ollama. Run Ollama anywhere reachable from the spot_spot-network Docker network and set OLLAMA_URL in /opt/spot/.env.
  2. Bundled side-car. Add ollama to COMPOSE_PROFILES in /opt/spot/.env; the deploy repo ships an ollama + ollama-init service pair under that profile. ollama-init runs once on first boot, pulls EMBEDDING_MODEL (and anything in OLLAMA_EXTRA_MODELS, space-separated), and exits.

Readiness

The knowledge service exposes two endpoints for this:

  • GET /health: liveness only (returns {"status": "ok"}).
  • GET /readiness: probes Ollama with a lightweight GET /api/version and returns:
{
  "status": "ok",   // or "degraded"
  "embedding": {
    "url": "http://ollama:11434",
    "model": "bge-m3",
    "reachable": true
  }
}

The api-gateway proxy at GET /api/v1/knowledge/readiness (admin-only) combines that probe with installed-context-provider state so the dashboard can render a single banner; see the next section.

Failure mode

When Ollama is unreachable, any request that touches embeddings (/upsert, /bulk-upsert, /query) returns 503 with a payload like:

{
  "detail": "Embedding backend unreachable at http://ollama:11434. Ensure Ollama is running and reachable, and that the 'bge-m3' model has been pulled.",
  "embedding": {
    "url": "http://ollama:11434",
    "model": "bge-m3",
    "reachable": false
  }
}

Raw httpx.ConnectErrors are caught by a FastAPI exception handler and translated into that shape; callers never see an opaque 500.
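The 503 body can be produced by a small helper that the exception handler returns; a sketch (embedding_unreachable_payload is a hypothetical name):

```python
def embedding_unreachable_payload(url: str, model: str) -> dict:
    """Build the 503 body shown above. A FastAPI handler registered via
    @app.exception_handler(httpx.ConnectError) can return
    JSONResponse(status_code=503, content=embedding_unreachable_payload(...))."""
    return {
        "detail": (
            f"Embedding backend unreachable at {url}. Ensure Ollama is "
            f"running and reachable, and that the '{model}' model has "
            "been pulled."
        ),
        "embedding": {"url": url, "model": model, "reachable": False},
    }
```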

Web dashboard

The Knowledge Store is managed from the Knowledge top-level page (admin-only). Four surfaces:

  • Readiness banner: shown (amber) whenever a required component is missing (knowledge service unreachable, embedding backend down, or zero enabled context providers). Each row explains how to fix the specific gap (point OLLAMA_URL somewhere real or enable the bundled side-car; install a context_provider plugin). Disappears once all three components report green.
  • Summary cards: total docs, distinct sources, top tags (each tag chip is a filter link).
  • Semantic search panel: text + optional tag expression + top_k + min_score. Mimics an analyzer's retrieval and lists ranked hits with cosine-similarity scores.
  • Browse table: paginated list with filters for tag expression, source, and a substring match on id/content. Admin users can delete a document inline (confirmation required).

/knowledge/{id} (document inspector)

Metadata grid, clickable tag chips (each filters back to /knowledge?tags=...), raw content, and pretty-printed JSON metadata. Admin users get a Delete button.

Context-provider integration

  • /plugins/context_provider/{id} shows sync info (last run, next run, doc count, last error) and links to the Knowledge Store pre-filtered by source=<provider_id>.
  • /config/plugin/context_provider/{id} has a Scheduling section to edit the sync_schedule cron and sync_timeout_ms from the dashboard. Saving there reconciles the running scheduler immediately; no process restart is needed.

All dashboard operations go through the api-gateway proxies at /api/v1/knowledge/* (documents list, query, delete, stats, sources); the dashboard never talks to the knowledge service directly.

Architecture (short)

  • core/services/knowledge: in-core microservice (port 8000). FastAPI, asyncpg, pgvector. Embedding via Ollama (pluggable via an Embedder protocol). Redis caches embedding lookups and full query results.
  • Postgres: knowledge_documents table with HNSW index on embedding vector(1024), GIN index on tags, partial index on expires_at. Alembic migration 008 creates it.
  • api-gateway: KnowledgeScheduler (one asyncio task per provider) and three admin endpoints for trigger / status / stats.
  • SDK: KnowledgeClient, KnowledgeDocument, KnowledgeTag, chunk_text, content_hash, plus FakeKnowledgeClient for plugin unit tests.

Not in scope (v1)

  • Multi-tenancy / per-org isolation; the store is global.
  • Replacing pgvector with Qdrant / Weaviate; KnowledgeStore keeps the backend abstract enough to swap later.
  • LLM-driven adaptive retrieval; the analyzer decides what to query.