Mail integration with SPOT

SPOT analyses emails that mail-retriever plugins push into the platform. This document covers the bundled retriever-smtp plugin, which sits between an existing SMTP MTA and final delivery using the standard SMTP "after-queue content filter" pattern. SPOT does not run the MTA itself; you keep your existing Postfix, exim, sendmail or OpenSMTPD instance.

How it fits together

flowchart LR
    sender([External sender])
    mta[/MTA<br>port 25/]
    retriever[retriever-smtp<br>:10025]
    orch[spot-mail-orchestrator]
    analyzer[spot-analyzer-orchestrator]
    reinject[/MTA reinject<br>:10026/]
    delivery([Final delivery])

    sender -->|SMTP| mta
    mta -->|content_filter| retriever
    retriever -->|POST /internal/ingest| orch
    orch -->|enqueue / RPC| analyzer
    analyzer -.verdict.-> orch
    orch -.verdict.-> retriever
    retriever -->|reinject<br>+ X-SPOT-* headers| reinject
    reinject --> delivery

Two operating modes are selectable per deployment via RETRIEVER_SMTP_MODE:

blocking (default): the SMTP transaction is held until the verdict arrives. The retriever then either reinjects the message tagged with X-SPOT-* headers, or returns SMTP 4xx/5xx so Postfix applies its own queue or bounce policy. Use this when downstream rules (Sieve, milter, header-based filtering) act on the verdict at delivery time.
tag-only: the retriever reinjects every message immediately; verdict headers are added only when the verdict arrives within the SMTP window. Use this when you cannot tolerate any added latency.

1. Bring up the SPOT side

The retriever is opt-in. Add mail-smtp to COMPOSE_PROFILES in your .env and configure the variables under # MAIL RETRIEVERS (in .env.example):

COMPOSE_PROFILES=mail-smtp
RETRIEVER_SMTP_REINJECT_HOST=postfix.internal.example.com
RETRIEVER_SMTP_MODE=blocking
RETRIEVER_SMTP_ON_TIMEOUT=allow
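
A typo in one of these values is easy to miss until mail starts flowing. The sketch below is a hypothetical pre-flight check, not part of SPOT: the function name check_retriever_env is invented here, and it assumes plain unquoted KEY=value lines and that RETRIEVER_SMTP_REINJECT_HOST must be set. The accepted values are the ones named in this document.

```shell
# Hypothetical helper (not shipped with SPOT): sanity-check retriever settings.
# Usage: check_retriever_env [path-to-env-file]   (defaults to ./.env)
# Assumes plain, unquoted KEY=value lines.
check_retriever_env() {
    env_file="${1:-.env}"
    # If a key appears more than once, this sketch checks the last occurrence.
    profiles=$(sed -n 's/^COMPOSE_PROFILES=//p' "$env_file" | tail -n 1)
    mode=$(sed -n 's/^RETRIEVER_SMTP_MODE=//p' "$env_file" | tail -n 1)
    on_timeout=$(sed -n 's/^RETRIEVER_SMTP_ON_TIMEOUT=//p' "$env_file" | tail -n 1)
    host=$(sed -n 's/^RETRIEVER_SMTP_REINJECT_HOST=//p' "$env_file" | tail -n 1)

    rc=0
    # COMPOSE_PROFILES is a comma-separated list; mail-smtp must be one entry.
    case ",$profiles," in
        *,mail-smtp,*) ;;
        *) echo "mail-smtp missing from COMPOSE_PROFILES" >&2; rc=1 ;;
    esac
    # Unset values fall back to the documented defaults.
    case "${mode:-blocking}" in
        blocking|tag-only) ;;
        *) echo "invalid RETRIEVER_SMTP_MODE: $mode" >&2; rc=1 ;;
    esac
    case "${on_timeout:-allow}" in
        allow|block) ;;
        *) echo "invalid RETRIEVER_SMTP_ON_TIMEOUT: $on_timeout" >&2; rc=1 ;;
    esac
    # Assumption: a reinject host is required for the retriever to be useful.
    if [ -z "$host" ]; then
        echo "RETRIEVER_SMTP_REINJECT_HOST is not set" >&2
        rc=1
    fi
    return "$rc"
}
```

Running check_retriever_env .env prints one line per problem on stderr and returns non-zero if anything looks wrong.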

Then bring the stack up:

docker compose up -d

spot-retriever-smtp exposes:

:8000 (control plane): /health, /version, /settings/schema.
:10025 (SMTP): bound to RETRIEVER_SMTP_LISTEN_BIND (default 127.0.0.1). Change to 0.0.0.0 only if Postfix runs on a different host.

Verify:

curl -fsS http://localhost:8000/health
# {"status":"ok","mode":"blocking","smtp_listening":true,...}
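
In a deployment script it can help to wait until the retriever reports that it is ready before reloading the MTA. The following is a minimal sketch under stated assumptions: the helper names health_ok and wait_for_retriever are hypothetical, and the check relies only on the status and smtp_listening fields shown in the sample response above.

```shell
# Hypothetical helper: succeed only if a /health body read from stdin reports
# status "ok" and an active SMTP listener (field names from the sample above).
health_ok() {
    body=$(cat)
    printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"' &&
        printf '%s' "$body" | grep -Eq '"smtp_listening"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: poll the health endpoint once per second until ready.
# Usage: wait_for_retriever [url] [attempts]
wait_for_retriever() {
    url="${1:-http://localhost:8000/health}"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # curl -f turns HTTP errors into a failed pipeline stage with no body.
        if curl -fsS "$url" 2>/dev/null | health_ok; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "retriever not ready after $attempts attempts" >&2
    return 1
}
```

The pattern match is deliberately crude; a JSON-aware tool would be more robust if one is available on the host.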

2. Configure your MTA

retriever-smtp plugs into the standard SMTP "after-queue content filter" pattern that every mainstream Linux MTA implements. Only two things have to be in place, regardless of which MTA you run:

The inbound SMTP listener forwards each accepted message over SMTP to retriever-smtp:10025 before final delivery.
A second, internal-only SMTP listener accepts the reinjected message back from retriever-smtp (default :10026) and does not apply the content filter again; otherwise messages loop.

The reinject port must be unreachable from the public internet. The listener itself applies few restrictions because it is meant to accept traffic only from the trusted retriever, so network-level isolation is what protects it; treat it as part of the MTA's own internal plumbing.

The exact configuration syntax depends on the MTA. The Postfix recipe below covers the most common deployment; the same pattern translates directly to exim's transport_filter / local_interfaces, sendmail's INPUT_MAIL_FILTER, or OpenSMTPD's match ... action "relay via smtp://retriever-smtp:10025" rules. Adapt port numbers, listener names and authentication boundaries to match your MTA's conventions.

Postfix example

Add a content_filter to the smtp service in master.cf, plus a post-filter pickup port:

# /etc/postfix/master.cf

smtp      inet  n       -       y       -       -       smtpd
    -o content_filter=smtp:[127.0.0.1]:10025

# Post-filter pickup port. retriever-smtp delivers cleaned/tagged
# mail back here. NOT exposed to the internet.
127.0.0.1:10026 inet n  -       y       -       -       smtpd
    -o content_filter=
    -o smtpd_authorized_xforward_hosts=127.0.0.0/8,[::1]/128
    -o local_recipient_maps=
    -o relay_recipient_maps=
    -o smtpd_helo_restrictions=
    -o smtpd_client_restrictions=
    -o smtpd_sender_restrictions=
    -o smtpd_recipient_restrictions=permit_mynetworks,reject

Reload Postfix:

postfix reload
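
Because a missing content_filter override on the reinject listener causes a mail loop, it may be worth checking master.cf mechanically. The sketch below is a hypothetical helper (check_reinject_entry is not a Postfix or SPOT command). It assumes the layout used in the example above, with each -o option on its own indented continuation line, and it does not understand options written on the service line itself.

```shell
# Hypothetical helper: verify that the reinject listener clears content_filter.
# Usage: check_reinject_entry [master.cf path] [port]
# Returns 0 if the override is present, 1 if it is missing,
# and 2 if no listener for the port was found.
check_reinject_entry() {
    master_cf="${1:-/etc/postfix/master.cf}"
    port="${2:-10026}"
    awk -v port="$port" '
        /^[ \t]*#/ || /^[ \t]*$/ { next }          # skip comments and blank lines
        /^[^ \t]/ {                                # a new service entry starts
            in_entry = ($1 == port || $1 ~ (":" port "$"))
            if (in_entry) found = 1
            next
        }
        # Indented continuation line with an empty content_filter value.
        in_entry && /^[ \t]+-o[ \t]+content_filter=[ \t]*$/ { cleared = 1 }
        END {
            if (!found)   { print "no listener entry for port " port > "/dev/stderr"; exit 2 }
            if (!cleared) { print "port " port " entry does not clear content_filter" > "/dev/stderr"; exit 1 }
        }
    ' "$master_cf"
}
```

Calling check_reinject_entry with no arguments inspects /etc/postfix/master.cf for port 10026.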

Postfix's smtp transport is the right choice here even for delivery to a process on the same host; the alternative pipe transport is older and more fragile.

3. Smoke-test the path

Use swaks to send a test message through the MTA's public listener, which exercises the whole content_filter chain:

swaks --to user@your-domain.example \
      --from sender@external.example \
      --server postfix.example:25 \
      --header "Subject: SPOT smoke test" \
      --body "Just verifying the content_filter chain."

Look for:

spot-retriever-smtp log line reinjecting message: ....
spot-mail-orchestrator log line for POST /internal/ingest.
The delivered message containing X-SPOT-Status: analyzed (and a threat level / confidence) when running in blocking or tag-only mode with a verdict that arrived in time.
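
To inspect the third item, the delivered message can be saved to a file and its SPOT headers listed. This is a minimal sketch: spot_headers is a hypothetical helper name, and it assumes the headers are not folded across several lines.

```shell
# Hypothetical helper: print the X-SPOT-* headers of a saved message file.
# Usage: spot_headers /path/to/message.eml
spot_headers() {
    awk '
        # Headers end at the first empty line; stop there so body text that
        # merely mentions "X-SPOT-" is never reported.
        /^\r?$/ { exit }
        # Header names are case-insensitive, so compare in lower case.
        tolower($0) ~ /^x-spot-[a-z-]+:/ { sub(/\r$/, ""); print }
    ' "$1"
}
```

An empty result on a delivered message suggests it bypassed the retriever or was reinjected without headers.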

4. Choosing blocking vs tag-only

Scenario	Recommended
You want SMTP-time decisions (defer/reject phishing)	`blocking`
You can't tolerate added SMTP latency under any circumstance	`tag-only`
Most mail goes to a downstream Sieve/milter that acts on headers	`blocking`
You only use SPOT for after-the-fact dashboards	`tag-only`

In blocking mode, also choose your timeout policy:

RETRIEVER_SMTP_ON_TIMEOUT=allow (default, fail-open): when the verdict times out, the message is reinjected without SPOT headers and delivered normally. Suitable when SPOT is a defence-in-depth layer and you don't want it blocking mail when downstream services are slow.
RETRIEVER_SMTP_ON_TIMEOUT=block (fail-closed): the retriever replies SMTP 451 so the sending MTA retries later. Suitable when SPOT is the primary defence; legitimate senders retry, and the queue drains once SPOT is healthy.
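
The combination of mode and timeout policy can be summarised as a small decision function. This is only an illustration of the behaviour described in this document, not the retriever's actual code. The function name and the outcome labels are invented here, and it models the timeout case only, since the rule that maps a verdict to a 4xx/5xx reply is not spelled out above.

```shell
# Illustrative sketch (not the retriever's implementation): what happens to a
# message when no verdict arrives in time.
# Usage: spot_timeout_outcome MODE [ON_TIMEOUT]
spot_timeout_outcome() {
    mode="$1"
    on_timeout="${2:-allow}"        # allow is the documented default
    case "$mode" in
        tag-only)
            # tag-only never holds mail; ON_TIMEOUT is described for blocking only.
            echo "reinject-without-verdict" ;;
        blocking)
            case "$on_timeout" in
                allow) echo "reinject-without-verdict" ;;   # fail-open
                block) echo "defer-451" ;;                  # fail-closed, sender retries
                *) echo "unknown ON_TIMEOUT: $on_timeout" >&2; return 2 ;;
            esac ;;
        *)
            echo "unknown MODE: $mode" >&2
            return 2 ;;
    esac
}
```

For example, spot_timeout_outcome blocking block prints defer-451, which is the only combination in which a slow analysis delays delivery.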

5. What the headers mean

Every analysed message carries:

Header	Example	Meaning
`X-SPOT-Job-Id`	`9f3a...`	Stable id of the ingestion job; useful for tracing in SPOT dashboards.
`X-SPOT-Status`	`analyzed`	Terminal state: `analyzed`, `accepted`, `timeout` or `rejected`.
`X-SPOT-Is-Phishing`	`yes`/`no`	The headline verdict.
`X-SPOT-Threat-Level`	`high`	Threat level string from the workflow.
`X-SPOT-Confidence`	`0.910`	0.0–1.0 confidence score.

Only Job-Id and Status are guaranteed; verdict fields appear only when Status: analyzed.

Downstream tools can act on these headers, for example:

Sieve: require ["fileinto"]; if header :contains "X-SPOT-Threat-Level" "high" { fileinto "Junk"; stop; }
Procmail: :0H * ^X-SPOT-Is-Phishing: yes -> Spam folder.
Maildir filtering: drop or quarantine on X-SPOT-Threat-Level: high.
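
For a custom delivery hook, the same logic can be sketched in shell. Since verdict fields are only meaningful when the status is analyzed, the sketch checks the status first. Everything here is an assumption for illustration: spot_route is a hypothetical name, the folder names Junk and INBOX are placeholders, and folded headers are not handled.

```shell
# Hypothetical delivery hook: read a message on stdin, print a target folder.
# Verdict headers are trusted only when X-SPOT-Status is "analyzed".
spot_route() {
    awk '
        /^\r?$/ { exit }                                   # end of headers
        { line = tolower($0); sub(/\r$/, "", line) }       # case-insensitive match
        line ~ /^x-spot-status:[ \t]*analyzed[ \t]*$/   { analyzed = 1 }
        line ~ /^x-spot-is-phishing:[ \t]*yes[ \t]*$/   { phishing = 1 }
        line ~ /^x-spot-threat-level:[ \t]*high[ \t]*$/ { high = 1 }
        END {
            folder = (analyzed && (phishing || high)) ? "Junk" : "INBOX"
            print folder
        }
    '
}
```

A message without SPOT headers, or with a status other than analyzed, falls through to INBOX, so the hook behaves as fail-open.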

6. Operational notes

The retriever is stateless. Restarting it does not lose any in-flight mail; Postfix queues it and retries.
The :10026 post-filter port must be unreachable from the public internet; it has no client restrictions because it is meant only for the retriever's reinject step.
The internal API key (SPOT_INTERNAL_API_KEY) is the same value shared by api-gateway, the knowledge service, and the mail-orchestrator. Rotate it everywhere together.
Recent ingestion jobs and aggregate counters are visible in the dashboard at /mail-retrievers (read-only, viewer role).

7. Troubleshooting

retriever-smtp returns 451 to every message: Either the orchestrator is down (docker compose logs mail-orchestrator) or RETRIEVER_SMTP_ON_TIMEOUT=block is set and the workflow is consistently timing out. Verify the workflow completes inside RETRIEVER_SMTP_ANALYSIS_TIMEOUT_MS.

Mail loops between Postfix and retriever-smtp: The :10026 smtpd entry is missing the -o content_filter= override. Without it Postfix re-applies the filter to the reinjected message and the cycle never ends.

Connection refused on port 10025: RETRIEVER_SMTP_LISTEN_BIND defaults to 127.0.0.1, so the listener is reachable only from the retriever's own host. Either move Postfix to the same host or set the bind address to one that Postfix can reach.

401 Unauthorized in mail-orchestrator logs: SPOT_INTERNAL_API_KEY differs between the api-gateway and the retriever. They must match exactly; re-check .env and restart both services.