# Metrics Overview
Evret metrics score ranked retrieval results at a top-k cutoff.
Each metric answers a different question. Use the metric that matches your product goal.
## Quick Pick
| Goal | Metric |
|---|---|
| Check if at least one relevant result appears | Hit Rate |
| Check how much relevant content is recovered | Recall |
| Check how clean the top results are | Precision |
| Check how early the first relevant hit appears | MRR |
| Check ranking quality across positions | nDCG |
| Check rank-aware precision across all relevant hits | Average Precision |
## Shared Setup
For one query, define:
- \(R_i^{(k)}\): the first \(k\) retrieved document ids for query \(i\)
- \(G_i\): the ground truth relevant ids for query \(i\)
- \(Q\): set of evaluation queries
Most metric pages in this section use this small example:
- `k = 5`
- `retrieved@5 = [d1, d4, d2, d9, d8]`
- `relevant = {d2, d8, d10}`
In that case, the relevant hits in the top 5 are `d2` and `d8`.
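To make the numbers concrete, here is a minimal plain-Python walk-through of that example. This is not the Evret API; the variable and value names are taken from the example above and the rest is illustrative.

```python
# Shared example from this page, computed by hand.
retrieved_at_5 = ["d1", "d4", "d2", "d9", "d8"]
relevant = {"d2", "d8", "d10"}
k = 5

# Relevant documents that actually appear in the top k.
hits = [doc_id for doc_id in retrieved_at_5 if doc_id in relevant]

hit_rate = 1.0 if hits else 0.0      # at least one relevant result -> 1.0
recall = len(hits) / len(relevant)   # 2 / 3 ≈ 0.667
precision = len(hits) / k            # 2 / 5 = 0.4

print(hits)                          # ['d2', 'd8']
print(hit_rate, recall, precision)   # 1.0 0.666... 0.4
```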
## Batch Score
Evret computes each metric for every query, then returns the average.
For \(|Q|\) queries:
\[
\operatorname{score} =
\frac{1}{|Q|}
\sum_{i=1}^{|Q|}
\operatorname{score}_i
\]
This is why each metric class has:
- `score_query(...)` for one query
- `score(...)` for a list of queries
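The sketch below mirrors this behaviour in plain Python rather than calling Evret directly; `recall_at_k` is a hypothetical stand-in for any metric's per-query score, and the final mean corresponds to the batch average defined above.

```python
from statistics import mean

# Hypothetical stand-in for a single-query metric (recall at cutoff k).
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    if not relevant:
        return 0.0  # empty relevant sets score 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)

queries = [
    (["d1", "d4", "d2", "d9", "d8"], {"d2", "d8", "d10"}),  # the shared example
    (["d3", "d7"], {"d7"}),
]

# One per-query score, as score_query(...) would produce for each query ...
per_query = [recall_at_k(retrieved, relevant, k=5) for retrieved, relevant in queries]

# ... and their mean, which is what the batch score(...) returns.
batch_score = mean(per_query)
print(per_query, batch_score)  # [0.666..., 1.0] 0.833...
```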
## Implementation Notes
- Hit Rate, Recall, Precision, MRR, nDCG, and Average Precision all return scores from `0.0` to `1.0`.
- nDCG currently uses binary relevance, where relevant ids have relevance `1.0`.
- Precision divides by the configured `k`, not by the number of returned results.
- Average Precision divides by the total number of relevant ids for the query.
- Empty relevant sets score `0.0`.
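The sketch below shows how Precision and Average Precision could follow these notes. It is an illustrative reimplementation under the assumptions stated in the list, not the Evret source, and the function names are hypothetical.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    if not relevant:
        return 0.0  # empty relevant sets score 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k  # divide by the configured k, not by len(retrieved)

def average_precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    if not relevant:
        return 0.0  # empty relevant sets score 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at each relevant hit
    return precision_sum / len(relevant)  # divide by the total number of relevant ids

# Shared example: hits at ranks 3 and 5, so AP = (1/3 + 2/5) / 3 ≈ 0.244
print(precision_at_k(["d1", "d4", "d2", "d9", "d8"], {"d2", "d8", "d10"}, 5))          # 0.4
print(average_precision_at_k(["d1", "d4", "d2", "d9", "d8"], {"d2", "d8", "d10"}, 5))  # 0.244...
```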