Natural-Language Hunt

HuntService turns a free-text question like “show me failed SSH logins from any IP that also pinged the internal DNS server in the last 24h” into a structured EventQuery executed against LogStore. The LLM never returns raw events — it returns a query, which Seerflow then runs deterministically.

Why a query, not a free-form answer

If the LLM hallucinated a row, the analyst would be acting on fiction. Because the LLM must emit a structured filter that Seerflow validates and runs against real storage, every surfaced event is a real row. The LLM’s role is to parse the question, not to answer it.

Flow

Operator question
    │
    ▼
HuntService.hunt(question)
    │
    ▼ (cache miss)
LLM prompt: "Translate this question to an EventQuery JSON"
    │
    ▼
parser.py — strict JSON parse
    │
    ▼
_to_event_query.py — coerce to validated EventQuery
    │
    ▼
LogStore.query(EventQuery) — real rows
    │
    ▼
HuntResult { question, query, events, total, model, latency_ms }
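
The parse and coerce steps in the flow can be sketched as follows. This is a minimal illustration, not the contents of parser.py or _to_event_query.py: the Filter and EventQuery dataclasses, the ALLOWED_FIELDS and ALLOWED_OPS allow-lists, and both function names are assumptions, and entity_join is left out for brevity.

```python
import json
from dataclasses import dataclass
from datetime import datetime

# Hypothetical allow-lists; the real EventQuery schema is not shown on this page.
ALLOWED_FIELDS = {"logsource.category", "logsource.service", "event.outcome"}
ALLOWED_OPS = {"eq"}


@dataclass(frozen=True)
class Filter:
    field: str
    op: str
    value: str


@dataclass(frozen=True)
class EventQuery:
    since: datetime
    until: datetime
    filters: tuple = ()


def parse_strict(raw: str) -> dict:
    """Strict parse: the whole reply must be a single JSON object."""
    data = json.loads(raw)  # raises json.JSONDecodeError on prose or trailing text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def to_event_query(data: dict) -> EventQuery:
    """Coerce a parsed dict into an EventQuery, rejecting unknown fields and ops."""
    filters = []
    for item in data.get("filters", []):
        if item.get("field") not in ALLOWED_FIELDS:
            raise ValueError(f"unsupported field: {item.get('field')!r}")
        if item.get("op") not in ALLOWED_OPS:
            raise ValueError(f"unsupported op: {item.get('op')!r}")
        filters.append(Filter(item["field"], item["op"], str(item["value"])))
    # Normalise a trailing "Z" so fromisoformat accepts it on older Python versions.
    since = datetime.fromisoformat(data["since"].replace("Z", "+00:00"))
    until = datetime.fromisoformat(data["until"].replace("Z", "+00:00"))
    if since >= until:
        raise ValueError("since must be earlier than until")
    return EventQuery(since, until, tuple(filters))
```

Because json.JSONDecodeError is a subclass of ValueError, a caller can treat any ValueError from these two steps as a failed translation and never pass an unvalidated structure to LogStore.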

REST API

Method	Path	Purpose
POST	`/api/v1/hunt`	Run a natural-language hunt. Body: `{ "query": "..." }`. Requires `llm.backend` to be configured; otherwise returns `503 Service Unavailable`.
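
A client call to this endpoint might be built as in the sketch below. The base URL is a placeholder, and the sketch assumes no authentication headers, which this page does not describe; only the path and the "query" body key come from the table above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: replace with your Seerflow host


def build_hunt_request(question: str) -> urllib.request.Request:
    """Build the POST /api/v1/hunt request; the question goes in the "query" key."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/hunt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_hunt_request("failed SSH logins in the last 24h")
# To send it: urllib.request.urlopen(req, timeout=60), then json.load() the response.
# A 503 reply surfaces as urllib.error.HTTPError with .code == 503.
```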

Hunt results are cached by the canonical form of the submitted question (whitespace collapsed, lower-cased), so two analysts who ask the same thing share one LLM call.
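
The canonical form described above can be sketched in a few lines. The function names and the in-memory dict are illustrative assumptions; this page does not say where the cache lives, how long entries last, or whether events are cached along with the translation.

```python
import re

_cache: dict = {}  # hypothetical in-process cache keyed by canonical question


def canonical_question(question: str) -> str:
    """Collapse whitespace runs to one space, trim the ends, and lower-case."""
    return re.sub(r"\s+", " ", question).strip().lower()


def cached_hunt(question: str, translate):
    """Call translate(question) only if the canonical form has not been seen."""
    key = canonical_question(question)
    if key not in _cache:
        _cache[key] = translate(question)
    return _cache[key]
```

With this key, "Failed  SSH logins" and "failed ssh logins " map to the same entry, so only the first of the two triggers a translation.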

For known-entity hunts where you already have the UUID and time window, skip the LLM entirely and call the typed endpoints directly: GET /api/v1/entities/{entity_uuid}/timeline for the entity's timeline, plus GET /api/v1/events with explicit filters.
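
As a rough sketch, the two typed URLs could be assembled as follows. The paths are the ones named above; the query-parameter names passed to events_url (since, until) are assumptions borrowed from the EventQuery fields and are not confirmed by this page, and the UUID is a placeholder.

```python
from urllib.parse import quote, urlencode


def timeline_url(base: str, entity_uuid: str) -> str:
    """URL for GET /api/v1/entities/{entity_uuid}/timeline."""
    return f"{base}/api/v1/entities/{quote(entity_uuid, safe='')}/timeline"


def events_url(base: str, **filters: str) -> str:
    """URL for GET /api/v1/events; parameter names here are hypothetical."""
    return f"{base}/api/v1/events?{urlencode(filters)}"


base = "http://localhost:8080"  # assumption: replace with your Seerflow host
entity = "00000000-0000-0000-0000-000000000000"  # placeholder UUID
urls = [
    timeline_url(base, entity),
    events_url(base, since="2026-05-12T15:00:00Z", until="2026-05-13T15:00:00Z"),
]
```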

Output shape

{
  "question": "failed SSH logins in the last 24h from any IP that pinged internal DNS",
  "query": {
    "since": "2026-05-12T15:00:00Z",
    "until": "2026-05-13T15:00:00Z",
    "filters": [
      { "field": "logsource.category", "op": "eq", "value": "authentication" },
      { "field": "logsource.service", "op": "eq", "value": "ssh" },
      { "field": "event.outcome", "op": "eq", "value": "failure" }
    ],
    "entity_join": {
      "type": "ipv4",
      "secondary_filter": { "field": "logsource.category", "op": "eq", "value": "dns" }
    }
  },
  "events": [ /* ... real rows from LogStore ... */ ],
  "total": 87,
  "model": "claude-sonnet-4-6",
  "latency_ms": 1620
}
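
When reviewing a result, it helps to read the translated query before the events. The helper below is one possible way to render the "query" object from the example above as plain text; the function names are illustrative, and the rendering makes no claim about how Seerflow combines filters or evaluates entity_join.

```python
# The "query" object from the example response above.
QUERY = {
    "since": "2026-05-12T15:00:00Z",
    "until": "2026-05-13T15:00:00Z",
    "filters": [
        {"field": "logsource.category", "op": "eq", "value": "authentication"},
        {"field": "logsource.service", "op": "eq", "value": "ssh"},
        {"field": "event.outcome", "op": "eq", "value": "failure"},
    ],
    "entity_join": {
        "type": "ipv4",
        "secondary_filter": {"field": "logsource.category", "op": "eq", "value": "dns"},
    },
}


def describe_filter(f: dict) -> str:
    """Render one filter as 'field op value'."""
    return f'{f["field"]} {f["op"]} {f["value"]!r}'


def describe_query(query: dict) -> str:
    """Render the time window, each filter, and the optional entity_join, one per line."""
    lines = [f'window: {query["since"]} to {query["until"]}']
    lines += [f"filter: {describe_filter(f)}" for f in query.get("filters", [])]
    join = query.get("entity_join")
    if join:
        lines.append(
            f'entity_join on {join["type"]}: {describe_filter(join["secondary_filter"])}'
        )
    return "\n".join(lines)


summary = describe_query(QUERY)
```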

Configuration

llm:
  backend: ollama
  ollama_model: phi4-mini
  ollama_timeout_s: 30.0

The hunt service shares the LLM backend and timeout settings with the explanation service.

Limits

The translator targets a deliberately narrow EventQuery schema. A question that needs filters outside that schema returns 400 Bad Request with hints; the translator does not guess.
Wall-clock timeouts apply (ollama_timeout_s / cloud_timeout_s). A hunt that runs over budget returns 502 Bad Gateway with a “try a tighter window” suggestion.
The dashboard shows the original question and the translated EventQuery side by side so the analyst can spot a misinterpretation before reading results.
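
The first two limits amount to a mapping from failure mode to HTTP status, which might look like the sketch below. The UnsupportedQuery exception, the run_with_budget helper, and the payload keys are all hypothetical; translate stands in for whatever callable performs the LLM translation, and the thread-based timeout is just one way to enforce a wall-clock budget.

```python
import concurrent.futures
import time


class UnsupportedQuery(Exception):
    """Hypothetical error: the question needs filters outside the EventQuery schema."""


def run_with_budget(translate, question: str, timeout_s: float):
    """Run translate(question) under a wall-clock budget; return (status, payload)."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(translate, question)
    try:
        return 200, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Over budget: report 502 with the suggestion instead of waiting longer.
        return 502, {"hint": "try a tighter window"}
    except UnsupportedQuery as exc:
        # Out-of-schema request: report 400 with hints instead of guessing.
        return 400, {"hints": [str(exc)]}
    finally:
        pool.shutdown(wait=False)  # do not block the caller on a slow worker
```

Note that the worker thread is not interrupted when the budget expires; it only stops being waited on. A production service would more likely rely on the backend client's own timeout, such as the configured ollama_timeout_s.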