Dataset Format¶
Evret supports JSON and CSV for evaluation datasets.
JSON Format¶
The top-level object has:
- queries: required list
- documents: optional list
Query Item Fields¶
- query_id or id: string (required)
- query_text or query: string (required)
- relevant_doc_ids: list of strings (optional)
  - Use when you have pre-labeled document IDs as ground truth
  - For classic IR evaluation with exact doc ID matching
- expected_answers: list of strings (optional)
  - Use when you want a judge to determine relevance
  - Store gold supporting text snippets that judges match against retrieved docs
- relevant_docs: list of strings (deprecated, backward compatible)
  - Legacy field name, automatically mapped to relevant_doc_ids
Note: Provide either relevant_doc_ids or expected_answers, not both.
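The field rules above can be sketched as a small normalization step. This is only an illustration under stated assumptions: `normalize_query_item` is a hypothetical helper, not part of Evret, and Evret's own loader may resolve aliases or report errors differently.

```python
from typing import Any


def normalize_query_item(item: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: resolve field aliases and the legacy name for one query item."""
    query_id = item.get("query_id", item.get("id"))
    query_text = item.get("query_text", item.get("query"))
    if query_id is None or query_text is None:
        raise ValueError("query item needs query_id/id and query_text/query")

    # The legacy relevant_docs field is treated as relevant_doc_ids.
    doc_ids = item.get("relevant_doc_ids", item.get("relevant_docs"))
    answers = item.get("expected_answers")
    if doc_ids and answers:
        raise ValueError("provide relevant_doc_ids or expected_answers, not both")

    normalized: dict[str, Any] = {
        "query_id": str(query_id),
        "query_text": str(query_text),
    }
    if doc_ids:
        normalized["relevant_doc_ids"] = list(doc_ids)
    if answers:
        normalized["expected_answers"] = list(answers)
    return normalized


# A legacy-style item using the short aliases and the deprecated field name.
legacy = {"id": "q1", "query": "hotel cap", "relevant_docs": ["travel_policy_3"]}
print(normalize_query_item(legacy))
```

Running a check like this before building a dataset file surfaces items that mix both relevance fields, which the note above disallows.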
Document Item Fields¶
- doc_id: string
- text: string
- metadata: object, optional
JSON Example with Expected Answers (Judge-Based Evaluation)¶
{
  "queries": [
    {
      "query_id": "q1",
      "query_text": "does a flight above 500 dollars need manager approval",
      "expected_answers": [
        "Flights above 500 dollars require manager approval before booking business travel."
      ]
    },
    {
      "query_id": "q2",
      "query_text": "what hotel reimbursement limit applies to business travel",
      "expected_answers": [
        "Hotel reimbursement is capped at 180 dollars per night unless finance approves an exception."
      ]
    }
  ],
  "documents": [
    {
      "doc_id": "travel_policy_2",
      "text": "Flights above 500 dollars require manager approval before booking business travel.",
      "metadata": {
        "source": "travel_policy.md",
        "section": "flight_approval"
      }
    },
    {
      "doc_id": "travel_policy_3",
      "text": "Hotel reimbursement is capped at 180 dollars per night unless finance approves an exception.",
      "metadata": {
        "source": "travel_policy.md",
        "section": "hotel_cap"
      }
    }
  ]
}
JSON Example with Document IDs (Classic IR Evaluation)¶
{
  "queries": [
    {
      "query_id": "q1",
      "query_text": "does a flight above 500 dollars need manager approval",
      "relevant_doc_ids": ["travel_policy_2"]
    },
    {
      "query_id": "q2",
      "query_text": "what hotel reimbursement limit applies to business travel",
      "relevant_doc_ids": ["travel_policy_3"]
    }
  ],
  "documents": [
    {
      "doc_id": "travel_policy_2",
      "text": "Flights above 500 dollars require manager approval before booking business travel.",
      "metadata": {
        "source": "travel_policy.md",
        "section": "flight_approval"
      }
    },
    {
      "doc_id": "travel_policy_3",
      "text": "Hotel reimbursement is capped at 180 dollars per night unless finance approves an exception.",
      "metadata": {
        "source": "travel_policy.md",
        "section": "hotel_cap"
      }
    }
  ]
}
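Because classic IR evaluation relies on exact doc ID matching, a typo in relevant_doc_ids silently turns a relevant document into a miss. The sketch below shows one way to cross-check a file before evaluation. `find_dangling_doc_ids` is a hypothetical helper that uses only the standard library, not an Evret function, and it assumes the query_id spelling of the ID field.

```python
import json


def find_dangling_doc_ids(dataset: dict) -> dict:
    """Hypothetical check: map each query_id to relevant_doc_ids missing from documents."""
    documents = dataset.get("documents")
    if documents is None:
        # documents is optional, so there is nothing to cross-check against.
        return {}
    known_ids = {doc["doc_id"] for doc in documents}
    dangling = {}
    for query in dataset["queries"]:
        missing = [d for d in query.get("relevant_doc_ids", []) if d not in known_ids]
        if missing:
            dangling[query["query_id"]] = missing
    return dangling


# Shortened sample data; "travel_policy_9" is a deliberate mismatch.
raw = """
{
  "queries": [
    {"query_id": "q1", "query_text": "flight approval", "relevant_doc_ids": ["travel_policy_2"]},
    {"query_id": "q2", "query_text": "hotel cap", "relevant_doc_ids": ["travel_policy_9"]}
  ],
  "documents": [
    {"doc_id": "travel_policy_2", "text": "Flights above 500 dollars require manager approval."}
  ]
}
"""
dangling = find_dangling_doc_ids(json.loads(raw))
print(dangling)
```

An empty result means every labeled ID resolves to a document in the same file.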
CSV Format¶
Required columns:
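- query_text or query
- relevant_docs (legacy name; the examples below use expected_answers or relevant_doc_ids as the relevance column instead)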
Optional columns:
- query_id or id
- relevant_doc_ids (for classic IR evaluation)
- expected_answers (for judge-based evaluation)
Note: The old relevant_docs column is still supported for backward compatibility.
The relevance field (relevant_doc_ids, expected_answers, or legacy relevant_docs) can be:
- JSON list string like "[\"Flights above 500 dollars require manager approval before booking business travel.\"]"
- Comma separated values when the labels are short and unambiguous
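The two accepted encodings can be told apart with a simple rule: try the cell as a JSON list first, then fall back to splitting on commas. The following is a minimal sketch of that idea. `parse_relevance_cell` is a hypothetical name, and Evret's actual parsing may differ in its details.

```python
import json


def parse_relevance_cell(cell: str) -> list[str]:
    """Hypothetical parser: accept a JSON list string or comma separated values."""
    text = cell.strip()
    if not text:
        return []
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(value) for value in parsed]
    # Fallback: plain comma separated labels, with surrounding whitespace removed.
    return [part.strip() for part in text.split(",") if part.strip()]


json_style = parse_relevance_cell('["travel_policy_2", "travel_policy_3"]')
comma_style = parse_relevance_cell("travel_policy_2, travel_policy_3")
# A long answer containing a comma is split in two by the fallback path.
split_answer = parse_relevance_cell("Capped at 180 dollars, unless finance approves")
print(json_style, comma_style, split_answer)
```

The last call shows why comma separated values suit only short, unambiguous labels: any comma inside an expected answer breaks it into separate entries, so the JSON list form is the safer choice for full sentences.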
CSV Example with Expected Answers¶
query_id,query_text,expected_answers
q1,does a flight above 500 dollars need manager approval,"[""Flights above 500 dollars require manager approval before booking business travel.""]"
q2,what hotel reimbursement limit applies to business travel,"[""Hotel reimbursement is capped at 180 dollars per night unless finance approves an exception.""]"
CSV Example with Document IDs¶
query_id,query_text,relevant_doc_ids
q1,does a flight above 500 dollars need manager approval,"[""travel_policy_2""]"
q2,what hotel reimbursement limit applies to business travel,"[""travel_policy_3""]"
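The doubled quotes in the examples above are standard CSV escaping for a JSON list stored inside a quoted field, and they are easy to get wrong by hand. As a sketch, Python's standard csv and json modules produce this escaping automatically. The row data is taken from the example, and the in-memory buffer stands in for a real file.

```python
import csv
import io
import json

rows = [
    ("q1", "does a flight above 500 dollars need manager approval", ["travel_policy_2"]),
    ("q2", "what hotel reimbursement limit applies to business travel", ["travel_policy_3"]),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["query_id", "query_text", "relevant_doc_ids"])
for query_id, query_text, doc_ids in rows:
    # json.dumps yields the list string; csv.writer quotes it and doubles inner quotes.
    writer.writerow([query_id, query_text, json.dumps(doc_ids)])

csv_text = buffer.getvalue()
print(csv_text)
```

To write a file instead, open it with `newline=""` and pass the file object to `csv.writer` in place of the buffer.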
Loader Methods¶
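from evret import EvaluationDataset

# Load an evaluation dataset from a JSON file in the format described above
dataset_json = EvaluationDataset.from_json("eval_data.json")
# Load an evaluation dataset from a CSV file in the format described above
dataset_csv = EvaluationDataset.from_csv("eval_data.csv")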