FoxhoundConceptsTwo-Stage Scoring

Two-Stage Scoring

Foxhound uses two different scoring methods at different pipeline stages, each measuring something different.

PropertyDetail
Toolquery.py
MethodCosine similarity (0.0-1.0)
MeasuresTextual similarity between query and document embeddings
CostFree

Cosine similarity compares the mathematical “direction” of two text embeddings. It’s fast and free, making it ideal for casting a wide net. However, it’s a rough filter — it can miss contextually relevant results that use different wording.

# Cosine similarity scores appear in search results
uv run python query.py --semantic "budget concerns" --top-k 100

Stage 2: LLM Triage (Analysis)

PropertyDetail
Toolanalyze.py --full-pipeline
MethodLLM relevance score (1-10)
MeasuresContextual relevance to your specific analysis question
CostFree (local Mistral 7B) or paid (cloud models)

LLM triage uses a language model to actually read each document and score how relevant it is to your specific context. This catches documents that are contextually relevant but use different terminology.

# The full pipeline runs triage first, then sends high-scoring docs to deep analysis
uv run python analyze.py results.json \
  --full-pipeline \
  --context "Identify budget risks" \
  --model deepseek

How They Work Together

  1. Search (query.py) — Cosine similarity retrieves the top-K most textually similar documents
  2. Triage (analyze.py --full-pipeline) — LLM reads each candidate and scores contextual relevance 1-10
  3. Analysis — Only documents scoring 7+ are sent to the paid model for deep analysis

This two-stage approach minimises cost while maximising relevance. The free cosine similarity stage casts a wide net, and the LLM triage stage filters to the most relevant results before expensive deep analysis.

When to Use Each

ScenarioApproach
Quick explorationquery.py with cosine similarity only
Cost-sensitive analysis--full-pipeline (triage filters before paid model)
Maximum recallMultiple searches + merge, then --full-pipeline
Free-only workflowquery.py for search, --local for analysis

MIT 2026 © Docs Hub