# Two-Stage Scoring
Foxhound scores documents at two different pipeline stages, and each stage measures something different.
## Stage 1: Cosine Similarity (Search)
| Property | Detail |
|---|---|
| Tool | `query.py` |
| Method | Cosine similarity (0.0-1.0) |
| Measures | Textual similarity between query and document embeddings |
| Cost | Free |
Cosine similarity compares the mathematical “direction” of two text embeddings. It’s fast and free, making it ideal for casting a wide net. However, it’s a rough filter — it can miss contextually relevant results that use different wording.
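To make the "direction" intuition concrete, here is a minimal pure-Python sketch of cosine similarity between two embedding vectors. The toy 3-dimensional vectors are illustrative only; real text embeddings have hundreds of dimensions, and Foxhound's actual implementation almost certainly uses a vectorised library rather than this loop.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy example: two embeddings pointing in nearly the same direction
query_vec = [0.1, 0.9, 0.2]
doc_vec = [0.2, 0.8, 0.1]
print(round(cosine_similarity(query_vec, doc_vec), 3))
```

Because only vector geometry is compared, two texts about the same topic in different words can land far apart — which is exactly the gap the triage stage closes.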
```shell
# Cosine similarity scores appear in search results
uv run python query.py --semantic "budget concerns" --top-k 100
```

## Stage 2: LLM Triage (Analysis)
| Property | Detail |
|---|---|
| Tool | `analyze.py --full-pipeline` |
| Method | LLM relevance score (1-10) |
| Measures | Contextual relevance to your specific analysis question |
| Cost | Free (local Mistral 7B) or paid (cloud models) |
LLM triage uses a language model to actually read each document and score how relevant it is to your specific context. This catches documents that are contextually relevant but use different terminology.
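The shape of a triage step can be sketched as below. The prompt wording and the `parse_score` helper are hypothetical illustrations, not Foxhound's actual prompt or code; the only parts taken from this page are the 1-10 scale and the idea of scoring one document against an analysis context.

```python
import re

# Hypothetical triage prompt; Foxhound's real prompt may differ.
TRIAGE_PROMPT = (
    "On a scale of 1-10, how relevant is this document to the context?\n"
    "Context: {context}\nDocument: {document}\n"
    "Reply with a single integer."
)

def parse_score(reply: str) -> int:
    """Extract the first integer 1-10 from a model reply; default to 1 if none found."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else 1

# With a real model you would send TRIAGE_PROMPT.format(...) to Mistral 7B
# (or a cloud model) and parse the reply:
print(parse_score("Relevance: 8 - mentions budget overruns directly"))  # → 8
```

Defaulting unparseable replies to 1 is one defensive choice: a document the model could not score cleanly is dropped rather than forwarded to the paid stage.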
```shell
# The full pipeline runs triage first, then sends high-scoring docs to deep analysis
uv run python analyze.py results.json \
  --full-pipeline \
  --context "Identify budget risks" \
  --model deepseek
```

## How They Work Together
1. **Search** (`query.py`): cosine similarity retrieves the top-K most textually similar documents.
2. **Triage** (`analyze.py --full-pipeline`): the LLM reads each candidate and scores its contextual relevance from 1 to 10.
3. **Analysis**: only documents scoring 7+ are sent to the paid model for deep analysis.
This two-stage approach minimises cost while maximising relevance. The free cosine similarity stage casts a wide net, and the LLM triage stage filters to the most relevant results before expensive deep analysis.
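The triage-to-analysis handoff reduces to a threshold filter. A minimal sketch, with illustrative document IDs and scores (the threshold of 7 is taken from this page; the data shape is an assumption):

```python
TRIAGE_THRESHOLD = 7  # per this page, only docs scoring 7+ reach the paid model

# Illustrative triage output (doc IDs and scores are made up)
triaged = [
    {"doc_id": "a1", "score": 9},
    {"doc_id": "b2", "score": 4},
    {"doc_id": "c3", "score": 7},
]

# Only the documents that clear the threshold incur paid-model cost
deep_analysis_queue = [d for d in triaged if d["score"] >= TRIAGE_THRESHOLD]
print([d["doc_id"] for d in deep_analysis_queue])  # → ['a1', 'c3']
```

The cost saving is simply that the paid model sees two documents here instead of three; at corpus scale, the free stages discard the bulk of the candidates.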
## When to Use Each
| Scenario | Approach |
|---|---|
| Quick exploration | `query.py` with cosine similarity only |
| Cost-sensitive analysis | `--full-pipeline` (triage filters before paid model) |
| Maximum recall | Multiple searches + merge, then `--full-pipeline` |
| Free-only workflow | `query.py` for search, `--local` for analysis |
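For the maximum-recall row, the merge step can be sketched as de-duplicating several search result sets by document ID before triage. The `doc_id` and `score` field names are assumptions about the result format, not documented Foxhound output:

```python
def merge_results(result_sets):
    """Merge several search result lists, keeping the best-scoring copy of each document."""
    seen = {}
    for results in result_sets:
        for doc in results:
            prev = seen.get(doc["doc_id"])
            if prev is None or doc["score"] > prev["score"]:
                seen[doc["doc_id"]] = doc
    # Return best-first so the top-K cutoff stays meaningful after merging
    return sorted(seen.values(), key=lambda d: d["score"], reverse=True)

# Each search would write a JSON results file; load each with json.load,
# merge them here, and feed the merged file to analyze.py --full-pipeline.
```

Running several differently-worded searches widens the net beyond any single query's phrasing, and the triage stage then does the precision work on the merged pool.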