Tests and Evals Map

A static guide to what the repository's tests and evaluation suites are meant to protect.

tests

test_graph_routing.py

Protects conditional graph paths such as cache-hit routing, clarification routing, retrieval/news selection, calculator routing, and guardrail outcomes.
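
To make the idea of a conditional graph path concrete, here is a minimal sketch of a routing function of the kind such tests might exercise. The state keys and node names (`cache_hit`, `ask_clarification`, and so on) are hypothetical and chosen for illustration only; the repository's actual graph may be organized differently.

```python
from typing import TypedDict


class State(TypedDict, total=False):
    # All keys are hypothetical, for illustration only.
    cache_hit: bool
    needs_clarification: bool
    needs_calculation: bool
    source: str  # "report" or "news"


def route(state: State) -> str:
    """Return the (hypothetical) name of the next node for a given state."""
    if state.get("cache_hit"):
        return "return_cached"
    if state.get("needs_clarification"):
        return "ask_clarification"
    if state.get("needs_calculation"):
        return "calculator"
    if state.get("source") == "news":
        return "news_search"
    # Default path: retrieve evidence from the reports.
    return "retrieve_reports"
```

A routing test can then assert one branch per case, for example that a state with `cache_hit` set never reaches retrieval.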

evals

routing.jsonl

Checks whether the router makes the right high-level decision for user intent: report retrieval, news, clarification, or direct answer.
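
As a rough illustration of how a JSONL routing eval could be scored, the sketch below computes accuracy over a set of cases. The record fields `question` and `expected_route` are assumptions, not the actual schema of routing.jsonl, and `router` stands in for whatever callable makes the routing decision.

```python
import json
from typing import Callable


def routing_accuracy(jsonl_text: str, router: Callable[[str], str]) -> float:
    """Fraction of cases where the router's decision matches the expected route.

    Assumes one JSON object per line with hypothetical fields
    "question" and "expected_route".
    """
    cases = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if not cases:
        return 0.0
    correct = sum(router(case["question"]) == case["expected_route"] for case in cases)
    return correct / len(cases)
```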

tests

test_merge.py

Verifies that duplicate chunks from parallel retrieval branches are removed by a stable document key.
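
A minimal sketch of key-based deduplication follows. It assumes each chunk is a dict and that the stable document key is the tuple `(source, page, chunk_id)`; both the chunk shape and the key fields are hypothetical, since the real key is not described here.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def document_key(chunk: Dict) -> Tuple[Hashable, ...]:
    # Hypothetical key: the real merge step may use different fields.
    return (chunk["source"], chunk["page"], chunk["chunk_id"])


def merge_chunks(*branches: Iterable[Dict]) -> List[Dict]:
    """Merge chunks from several retrieval branches, keeping the first
    occurrence of each document key and preserving arrival order."""
    seen = set()
    merged = []
    for branch in branches:
        for chunk in branch:
            key = document_key(chunk)
            if key not in seen:
                seen.add(key)
                merged.append(chunk)
    return merged
```

Keeping the first occurrence and preserving order makes the output deterministic, which is what lets a test compare against a fixed expected list.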

tests

test_retrieve_decision.py

Checks how the decision node handles retrieval, direct answers, ambiguity, and out-of-scope input.

evals

retrieval.jsonl

Evaluates whether the RAG path can find relevant annual-report evidence for financial questions.

tests

test_calculator.py

Protects deterministic financial math tools: generic calculate, growth rate, ratio, and average.
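
The three named helpers can be sketched as below. The function names, signatures, and error handling are assumptions rather than the repository's actual tool definitions, and the generic calculate tool is left out because its input format is not described here.

```python
from typing import Sequence


def growth_rate(old: float, new: float) -> float:
    """Percentage change from old to new."""
    if old == 0:
        raise ValueError("growth rate is undefined when the base value is zero")
    return (new - old) * 100 / old


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio of two values."""
    if denominator == 0:
        raise ValueError("ratio is undefined when the denominator is zero")
    return numerator / denominator


def average(values: Sequence[float]) -> float:
    """Arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("average of an empty sequence is undefined")
    return sum(values) / len(values)
```

Raising on zero or empty input, rather than returning a placeholder, keeps the tools deterministic and gives tests a clear failure mode to assert.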

evals

calculation.jsonl

Checks whether questions that require arithmetic are routed and answered with calculated values.

tests

test_guardrails.py

Ensures that empty or too-short context fails the guardrail check before answer generation.
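
A context-length guardrail of this kind might look like the sketch below. The 50-character threshold and the function name are assumptions made for illustration; the actual minimum length is not stated here.

```python
from typing import Optional

MIN_CONTEXT_CHARS = 50  # assumed threshold, not taken from the repository


def context_passes(context: Optional[str], min_chars: int = MIN_CONTEXT_CHARS) -> bool:
    """Return True only if there is enough non-whitespace-padded context
    to justify generating an answer."""
    if context is None:
        return False
    return len(context.strip()) >= min_chars
```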

tests

test_fallback.py

Verifies that failed guardrails produce a helpful fallback instead of a broken answer.

tests

test_citations.py

Protects citation filtering, renumbering, source previews, and superscript conversion.
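
The filtering, renumbering, and superscript steps can be illustrated with the sketch below. It assumes citations appear in the answer as bracketed numbers such as `[3]` and that sources are held in a dict keyed by those numbers; both are assumptions, and source previews are omitted.

```python
import re
from typing import Dict, List, Tuple

_SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def renumber_citations(answer: str, sources: Dict[int, str]) -> Tuple[str, List[str]]:
    """Renumber [n] markers in order of first appearance, render them as
    superscripts, and return only the sources that are actually cited."""
    mapping: Dict[int, int] = {}  # old number -> new number

    def replace(match: "re.Match[str]") -> str:
        old = int(match.group(1))
        if old not in sources:
            return ""  # drop markers that point to no known source
        new = mapping.setdefault(old, len(mapping) + 1)
        return str(new).translate(_SUPERSCRIPTS)

    text = re.sub(r"\[(\d+)\]", replace, answer)
    # Dicts preserve insertion order, so this list follows the new numbering.
    kept = [sources[old] for old in mapping]
    return text, kept
```

For example, an answer citing `[3]` and then `[1]` would come back citing ¹ and ², with the returned source list reordered to match.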

evals

boundary.jsonl

Checks boundary behavior, such as handling unrelated questions and refusing gracefully.