Knowledge Store¶
The SPOT Knowledge Store is the platform's RAG (retrieval-augmented generation) layer. Context providers deposit tagged company documents; analyzers fetch them on demand by semantic similarity plus a tag expression. Neither side knows the other exists; they share only the tag vocabulary.
Two phases¶
Ingestion (out-of-band, scheduled): the api-gateway runs a cron-style scheduler that calls each enabled context provider's `POST /internal/sync` on a cadence declared in `spot.yaml`. The provider reads from its source system and bulk-upserts `KnowledgeDocument`s to the store.
Consumption (per-email, on-demand): an analyzer calls `KnowledgeClient.for_analysis(email).fetch(...)` whenever it needs context during analysis. The store embeds the query, filters by tag expression, and returns the top-K documents by cosine similarity.
Data model¶
Every document has the same shape:
```python
class KnowledgeDocument(BaseModel):
    id: str                             # stable unique id
    content: str                        # text payload an LLM or analyzer reads
    tags: list[str] = []                # AND/OR-filterable at query time
    metadata: dict = {}                 # structured fields (email, title, url, ...)
    source: str = ""                    # provider that produced it (audit)
    updated_at: datetime | None = None
    expires_at: datetime | None = None  # optional TTL, cleaned up hourly
    score: float | None = None          # populated on query results, 0.0–1.0
```
`tags` is the single categorisation axis; there is no `type` field. Use the shared-vocabulary constants from `spot_sdk.knowledge_tags` (`EMPLOYEE`, `WIKI_PAGE`, `POLICY`, `EXECUTIVE`, `FINANCE`, ...); custom tags like `acme:jira_issue` are fine.
Tag expressions¶
Query tag filters use a tiny mini-language:
| Expression | Semantics |
|---|---|
| `employee` | has tag `employee` |
| `employee+executive` | has BOTH |
| `wiki_page\|policy` | has EITHER |
| `employee+executive\|director` | (`employee` AND `executive`) OR `director` |
| `""` or `None` | no tag filter |
`+` (AND) binds tighter than `|` (OR). No parentheses in v1. Tokens must match `[a-z0-9][a-z0-9_\-:]*`; anything else is rejected, so the expression is always safe to splice into SQL.
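The grammar is small enough to evaluate by hand. A minimal sketch of the semantics (not the service's actual implementation): split on `|` first, then on `+`, which is exactly why AND binds tighter.

```python
import re

TOKEN = re.compile(r"[a-z0-9][a-z0-9_\-:]*")

def matches(expr: str | None, tags: set[str]) -> bool:
    """True iff `tags` satisfies `expr` under the v1 grammar (+ binds tighter than |)."""
    if not expr:
        return True                                        # "" or None: no tag filter
    branches = [b.split("+") for b in expr.split("|")]     # OR of AND-groups
    for branch in branches:
        for tok in branch:
            if not TOKEN.fullmatch(tok):
                raise ValueError(f"invalid token in tag expression: {tok!r}")
    return any(all(t in tags for t in branch) for branch in branches)

assert matches("employee+executive|director", {"employee", "executive"})
assert matches("employee+executive|director", {"director"})
assert not matches("employee+executive|director", {"employee"})
```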
Writing an ingestion provider¶
```python
import os

from fastapi import FastAPI

from spot_sdk.knowledge import KnowledgeClient, KnowledgeDocument, chunk_text
from spot_sdk.knowledge_tags import KnowledgeTag

app = FastAPI()

kb = KnowledgeClient(
    url=os.environ["SPOT_KNOWLEDGE_URL"],         # injected by installer
    api_key=os.environ["SPOT_INTERNAL_API_KEY"],  # likewise
)

@app.post("/internal/sync")
async def sync() -> dict[str, int]:
    employees = fetch_from_ldap()  # provider-specific read from the source system
    await kb.bulk_upsert([
        KnowledgeDocument(
            id=f"employee:{e.email}",
            content=f"{e.name}, {e.title}, {e.department}. Email: {e.email}.",
            tags=[
                KnowledgeTag.EMPLOYEE,
                *([KnowledgeTag.EXECUTIVE] if e.is_exec else []),
                e.department.lower(),
            ],
            metadata={"email": e.email, "title": e.title},
            source="provider-employee-dir",
        )
        for e in employees
    ])
    return {"upserted": len(employees)}
```
For long wikis or policy docs, split into chunks with `spot_sdk.knowledge.chunk_text(text, max_chars=2000, overlap=200)` and upsert each chunk with `metadata["parent_id"]` pointing at the source doc; embeddings are better per-chunk and retrieval is finer-grained.
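A sketch of that chunking pattern, reusing the `kb` client from the provider above; `fetch_wiki_pages()` is a hypothetical stand-in for your source read:

```python
# Inside an async sync handler. fetch_wiki_pages() is hypothetical and stands
# in for whatever the source system exposes (.id, .title, .url, .body here).
docs = []
for page in fetch_wiki_pages():
    for i, chunk in enumerate(chunk_text(page.body, max_chars=2000, overlap=200)):
        docs.append(KnowledgeDocument(
            id=f"wiki:{page.id}:chunk-{i}",            # stable per-chunk id
            content=chunk,
            tags=[KnowledgeTag.WIKI_PAGE],
            metadata={
                "parent_id": f"wiki:{page.id}",        # points back at the source doc
                "title": page.title,
                "url": page.url,
            },
            source="provider-wiki",
        ))
await kb.bulk_upsert(docs)
```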
Querying from an analyzer¶
```python
from fastapi import FastAPI

from spot_sdk.knowledge import KnowledgeClient

app = FastAPI()

@app.post("/internal/analyze")
async def analyze(email: Email) -> AnalysisResult:  # Email / AnalysisResult: analyzer SDK models
    kb = KnowledgeClient.for_analysis(email)
    # "Does this email claim to be from an executive?"
    hits = await kb.fetch(
        tags="employee+executive|director",
        text=email.headers.sender,
        top_k=1,
    )
    # The sender resembles a known executive but the address doesn't match:
    # classic display-name impersonation.
    if hits and hits[0].metadata.get("email") != email.headers.sender:
        return AnalysisResult(is_phishing=True, confidence=0.9, ...)
    ...
```
Use `KnowledgeClient.for_analysis(email)`; it reads the URL from `SPOT_KNOWLEDGE_URL` and picks up the stage-level `retrieval_limits` that the orchestrator passes in on `email.retrieval_limits`.
Every retrieved document can be folded into an `AnalysisIndicator`'s evidence so the analyst UI can explain why a verdict was reached.
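For instance, the executive lookup above could surface its hit as evidence. The `AnalysisIndicator` field names here (`name`, `evidence`) are assumptions about the SDK model, not a confirmed signature:

```python
# Assumed AnalysisIndicator fields; check the SDK model for the real ones.
indicator = AnalysisIndicator(
    name="executive-impersonation",
    evidence=[
        f"Matched knowledge doc {hit.id} (score {hit.score:.2f}): {hit.content}"
        for hit in hits
    ],
)
```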
Retrieval limits (operator policy)¶
Workflow stages can declare caps in spot.yaml:
```yaml
stages:
  - name: enrichment
    retrieval_limits:
      max_top_k: 10         # cap top_k requested by any analyzer here
      min_score_floor: 0.3  # raise the floor even if analyzer asked lower
    analyzers:
      - id: analyzer-llm
      - id: analyzer-rules
```
The orchestrator attaches them to the analyzer-call payload and the SDK enforces them transparently; limits only ever tighten what the analyzer asked for, never loosen it.
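The enforcement rule is a clamp in each direction. A minimal sketch of the behavior (illustrative, not the SDK's actual code):

```python
def apply_retrieval_limits(top_k: int, min_score: float, limits: dict) -> tuple[int, float]:
    """Tighten, never loosen: cap top_k, raise the score floor."""
    effective_top_k = min(top_k, limits.get("max_top_k", top_k))
    effective_min_score = max(min_score, limits.get("min_score_floor", min_score))
    return effective_top_k, effective_min_score

# An analyzer asking for top_k=50, min_score=0.1 in the enrichment stage above
# is actually served top_k=10, min_score=0.3.
assert apply_retrieval_limits(50, 0.1, {"max_top_k": 10, "min_score_floor": 0.3}) == (10, 0.3)
```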
Provider ingestion schedule¶
Declare it on the provider entry:
```yaml
plugins:
  context_providers:
    employee-dir:
      enabled: true
      url: http://provider-employee-dir:8000
      sync_schedule: "0 */6 * * *"  # every 6 hours
      sync_timeout_ms: 120000
```
Supported cron syntax: fixed values, `*`, comma lists (`0,15,30,45`), ranges (`3-6`), steps (`*/10`), and the `@hourly` / `@daily` / `@weekly` / `@monthly` aliases. Absent or blank means "no scheduled sync; trigger manually only".
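A few values the stated syntax accepts (annotations are illustrative):

```
"*/10 * * * *"    every 10 minutes (step)
"0 9-17 * * 1-5"  on the hour, 09:00-17:00, Mon-Fri (ranges)
"0,30 * * * *"    on the hour and half-hour (comma list)
"@daily"          alias for midnight every day
```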
Manual triggers (admin only):
```
POST /api/v1/plugins/context_provider/{id}/sync   # trigger a sync now
GET  /api/v1/plugins/context_provider/{id}/sync   # last-run state
```
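Scripting the trigger might look like this; the gateway base URL, the `X-API-Key` header name, and the `SPOT_ADMIN_API_KEY` variable are assumptions, so adapt them to your deployment:

```python
import asyncio
import os

import httpx

async def trigger_sync(provider_id: str) -> dict:
    # Header name and gateway URL are illustrative, not confirmed by the docs.
    headers = {"X-API-Key": os.environ["SPOT_ADMIN_API_KEY"]}
    async with httpx.AsyncClient(base_url="http://api-gateway:8080") as client:
        resp = await client.post(
            f"/api/v1/plugins/context_provider/{provider_id}/sync", headers=headers
        )
        resp.raise_for_status()
        # The same path answers GET with last-run state.
        status = await client.get(
            f"/api/v1/plugins/context_provider/{provider_id}/sync", headers=headers
        )
        return status.json()

print(asyncio.run(trigger_sync("employee-dir")))
```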
The web dashboard surfaces both on the provider's detail page, with a "Sync now" button.
Stats¶
`GET /api/v1/knowledge/stats` returns the aggregate figures that back the dashboard's summary cards (total docs, distinct sources, top tags).
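The exact payload isn't reproduced on this page; a plausible shape, inferred from what the summary cards display rather than from the service itself, would be:

```json
{
  "total_documents": 1482,
  "sources": ["provider-employee-dir", "provider-wiki"],
  "top_tags": [
    {"tag": "employee", "count": 640},
    {"tag": "wiki_page", "count": 512}
  ]
}
```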
Embedding backend (Ollama)¶
Embeddings are computed by Ollama (configurable via the `Embedder` protocol if you need a different backend). The knowledge service expects it at `OLLAMA_URL` (default `http://ollama:11434`) with the model named in `EMBEDDING_MODEL` (default `bge-m3`).
Operators have two options:
- External Ollama. Run Ollama anywhere reachable from the `spot_spot-network` Docker network and set `OLLAMA_URL` in `/opt/spot/.env`.
- Bundled side-car. Add `ollama` to `COMPOSE_PROFILES` in `/opt/spot/.env`; the deploy repo ships an `ollama` + `ollama-init` service pair under that profile. `ollama-init` runs once on first boot, pulls `EMBEDDING_MODEL` (and anything in `OLLAMA_EXTRA_MODELS`, space-separated), and exits.
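Concretely, `/opt/spot/.env` might carry one of these two setups (hosts and extra model names are illustrative):

```bash
# Option 1: external Ollama
OLLAMA_URL=http://my-gpu-box.internal:11434
EMBEDDING_MODEL=bge-m3

# Option 2: bundled side-car
COMPOSE_PROFILES=ollama
EMBEDDING_MODEL=bge-m3
OLLAMA_EXTRA_MODELS="llama3.1 nomic-embed-text"
```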
Readiness¶
The knowledge service exposes two endpoints for this:
- `GET /health`: liveness only (returns `{"status": "ok"}`).
- `GET /readiness`: probes Ollama with a lightweight `GET /api/version` and returns:
```json
{
  "status": "ok",  // or "degraded"
  "embedding": {
    "url": "http://ollama:11434",
    "model": "bge-m3",
    "reachable": true
  }
}
```
The api-gateway proxy at `GET /api/v1/knowledge/readiness` (admin-only) combines that probe with installed-context-provider state so the dashboard can render a single banner; see the Web dashboard section below.
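These endpoints also slot naturally into container health checks. A hedged docker-compose sketch, assuming `curl` exists in the image and the service listens on its default port 8000:

```yaml
services:
  knowledge:
    # ...image, env, networks as in your deploy repo...
    healthcheck:
      # Liveness only. /readiness also covers the Ollama probe and is better
      # suited to dashboards than to restart policies.
      test: ["CMD", "curl", "-fsS", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```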
Failure mode¶
When Ollama is unreachable, any request that touches embeddings (`/upsert`, `/bulk-upsert`, `/query`) returns 503 with a payload like:
```json
{
  "detail": "Embedding backend unreachable at http://ollama:11434. Ensure Ollama is running and reachable, and that the 'bge-m3' model has been pulled.",
  "embedding": {
    "url": "http://ollama:11434",
    "model": "bge-m3",
    "reachable": false
  }
}
```
Raw `httpx.ConnectError`s are caught by a FastAPI exception handler and translated into that shape; callers never see an opaque 500.
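The pattern is ordinary FastAPI machinery. A minimal sketch of such a handler (the settings constants are stand-ins for the service's real configuration):

```python
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

# Stand-ins for the service's real settings.
OLLAMA_URL = "http://ollama:11434"
EMBEDDING_MODEL = "bge-m3"

@app.exception_handler(httpx.ConnectError)
async def embedding_unreachable(request: Request, exc: httpx.ConnectError) -> JSONResponse:
    # Translate the raw connection failure into the documented 503 shape.
    return JSONResponse(
        status_code=503,
        content={
            "detail": (
                f"Embedding backend unreachable at {OLLAMA_URL}. Ensure Ollama is "
                f"running and reachable, and that the '{EMBEDDING_MODEL}' model has been pulled."
            ),
            "embedding": {"url": OLLAMA_URL, "model": EMBEDDING_MODEL, "reachable": False},
        },
    )
```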
Web dashboard¶
The Knowledge Store is managed from the Knowledge top-level page (admin-only). Three surfaces:
/knowledge: browse & search¶
- Readiness banner: shown (amber) whenever any required component is missing; knowledge service unreachable, embedding backend down, or zero enabled context providers. Each row explains how to fix the specific gap (point `OLLAMA_URL` somewhere real or enable the bundled side-car; install a `context_provider` plugin). Disappears once all three components report green.
- Summary cards: total docs, distinct sources, top tags (each tag chip is a filter link).
- Semantic search panel: `text` + optional tag expression + `top_k` + `min_score`. Mimics an analyzer's retrieval and lists ranked hits with cosine-similarity scores.
- Browse table: paginated list with filters for tag expression, source, and a substring match on id/content. Admin users can delete a document inline (confirmation required).
/knowledge/{id}: document inspector¶
Metadata grid, clickable tag chips (each filters back to /knowledge?tags=...), raw content, and pretty-printed JSON metadata. Admin users get a Delete button.
Context-provider integration¶
- `/plugins/context_provider/{id}` shows sync info (last run, next run, doc count, last error) and links to the Knowledge Store pre-filtered by `source=<provider_id>`.
- `/config/plugin/context_provider/{id}` has a Scheduling section to edit the `sync_schedule` cron and `sync_timeout_ms` from the dashboard. Saving there reconciles the running scheduler immediately; no process restart.
All dashboard operations go through the api-gateway proxies at `/api/v1/knowledge/*` (documents list, query, delete, stats, sources); the dashboard never talks to the knowledge service directly.
Architecture (short)¶
- `core/services/knowledge`: in-core microservice (port 8000). FastAPI, asyncpg, pgvector. Embedding via Ollama (pluggable via an `Embedder` protocol). Redis caches the embed lookups and the full-query results.
- Postgres: `knowledge_documents` table with HNSW index on `embedding vector(1024)`, GIN index on `tags`, partial index on `expires_at`. Alembic migration 008 creates it. (Sketched as DDL below.)
- api-gateway: `KnowledgeScheduler` (one asyncio task per provider) and three admin endpoints for trigger / status / stats.
- SDK: `KnowledgeClient`, `KnowledgeDocument`, `KnowledgeTag`, `chunk_text`, `content_hash`, plus `FakeKnowledgeClient` for plugin unit tests.
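For orientation, the schema the Postgres bullet describes would look roughly like this DDL; a sketch derived from the columns and indexes named above, not a copy of migration 008:

```sql
CREATE TABLE knowledge_documents (
    id         text PRIMARY KEY,
    content    text NOT NULL,
    tags       text[] NOT NULL DEFAULT '{}',
    metadata   jsonb NOT NULL DEFAULT '{}',
    source     text NOT NULL DEFAULT '',
    updated_at timestamptz,
    expires_at timestamptz,
    embedding  vector(1024) NOT NULL
);

-- ANN search by cosine similarity (pgvector HNSW).
CREATE INDEX ON knowledge_documents USING hnsw (embedding vector_cosine_ops);
-- Tag-expression filters.
CREATE INDEX ON knowledge_documents USING gin (tags);
-- Hourly TTL sweep only scans rows that can actually expire.
CREATE INDEX ON knowledge_documents (expires_at) WHERE expires_at IS NOT NULL;
```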
Not in scope (v1)¶
- Multi-tenancy / per-org isolation; the store is global.
- Replacing pgvector with Qdrant / Weaviate; `KnowledgeStore` keeps the backend abstract enough to swap later.
- LLM-driven adaptive retrieval; the analyzer decides what to query.