
Choosing a Model

Pick the right AI model for your use case and budget.

Available Models

| Flag | Model | Input Cost | Output Cost | Context | Best For |
|---|---|---|---|---|---|
| `--model deepseek` | DeepSeek V3.2 | $0.25/M | $0.38/M | 164K | Default; best value |
| `--model gemini-flash` | Gemini 2.0 Flash | $0.10/M | $0.40/M | 1M | Large document sets |
| `--model gemini-free` | Gemini 2.0 Flash Exp | Free | Free | 1M | Testing |
| `--model haiku` | Claude 3.5 Haiku | $0.80/M | $4.00/M | 200K | High-quality scoring |
| `--local` | Ollama Mistral 7B | Free | Free | Local | Maximum privacy |
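Per-call cost scales linearly with token counts at the per-million rates above. A minimal sketch of the arithmetic — the `PRICES` table and `estimate_cost` helper are hypothetical illustrations, not part of Foxhound:

```python
# Hypothetical price table mirroring the model list above.
# Values are USD per million tokens: (input rate, output rate).
PRICES = {
    "deepseek": (0.25, 0.38),
    "gemini-flash": (0.10, 0.40),
    "gemini-free": (0.0, 0.0),
    "haiku": (0.80, 4.00),
    "local": (0.0, 0.0),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call at the rates in PRICES."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# 100K input + 5K output tokens on DeepSeek:
# 100_000 * 0.25/M + 5_000 * 0.38/M = 0.025 + 0.0019 ≈ $0.027
```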

Decision Guide

Use DeepSeek (default) when:

  • You want the best balance of cost and quality
  • Your document set is under 164K tokens
  • You're running standard analysis and summarisation tasks

```bash
uv run python analyze.py results.json \
  --context "Summarise key findings" \
  --model deepseek
```

Use Gemini Flash when:

  • You have a very large document set (>500 documents)
  • You need the 1M token context window
  • Cost sensitivity is moderate
```bash
uv run python analyze.py results.json \
  --context "Analyse all documents" \
  --model gemini-flash
```

Use Ollama (local) when:

  • Privacy is the top priority — nothing leaves your machine
  • You want free analysis with no API costs
  • You accept lower quality compared to cloud models
```bash
uv run python analyze.py results.json \
  --context "Summarise findings" \
  --local
```

Use the Full Pipeline when:

  • You want to minimise costs on a large document set
  • A free local triage pass filters for high-relevance documents first
  • Only high-scoring documents are sent to the paid model

```bash
uv run python analyze.py results.json \
  --full-pipeline \
  --context "Identify key themes" \
  --model deepseek
```
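The pipeline's saving comes from the free filtering stage. A minimal sketch of the idea, with a hypothetical `triage` helper and a toy scorer standing in for the local model pass:

```python
# Sketch of the two-stage pipeline described above: a free local model
# scores every document, and only documents at or above a relevance
# threshold are forwarded to the paid model. Names are illustrative.

def triage(documents, score_fn, threshold=0.7):
    """Keep only documents the local scorer rates at or above threshold."""
    return [doc for doc in documents if score_fn(doc) >= threshold]

# Toy scorer standing in for the local Ollama pass:
docs = ["budget report", "lunch menu", "quarterly findings"]
relevant = triage(
    docs,
    lambda d: 0.9 if ("report" in d or "findings" in d) else 0.1,
)
# Only the two relevant documents would reach the paid model.
```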

Typical Costs (DeepSeek V3.2)

| Query Type | Documents | Estimated Cost |
|---|---|---|
| Quick targeted | 50-100 | $0.01-0.02 |
| Standard | 100-300 | $0.02-0.05 |
| Exploratory | 300-600 | $0.05-0.10 |
| Comprehensive | 500-1500 | $0.10-0.25 |
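These ranges follow from the per-token rates. A back-of-envelope check of the Standard row, assuming (hypothetically) about 500 input tokens per document and 5K total output tokens:

```python
# Rough check of the "Standard" (300-document) estimate at DeepSeek rates
# ($0.25/M input, $0.38/M output). Token counts are assumptions.
docs = 300
input_tokens = docs * 500      # 150,000 input tokens
output_tokens = 5_000          # total summary output

cost = (input_tokens * 0.25 + output_tokens * 0.38) / 1_000_000
# 0.0375 + 0.0019 ≈ $0.04 — inside the $0.02-0.05 range above
```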

Cost Safety

All paid API calls show a cost estimate and require y/n confirmation before proceeding. Configure limits in `config.yaml`:

```yaml
analysis:
  confirm_before_api_call: true
  max_cost_per_query: 1.00
  warn_above: 0.10
```
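One way limits like these could be applied — a hypothetical `check_cost` helper, not Foxhound's actual implementation:

```python
# Sketch: classify an estimated cost against configured limits.
# Calls over max_cost are refused, calls over the warn level prompt
# a warning before the usual y/n confirmation.

def check_cost(estimate: float, max_cost: float = 1.00, warn: float = 0.10) -> str:
    """Return 'refuse', 'warn', or 'ok' for an estimated query cost in USD."""
    if estimate > max_cost:
        return "refuse"
    if estimate > warn:
        return "warn"
    return "ok"
```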

MIT 2026 © Docs Hub