Generation API

evret.generation.dataset

LLM-assisted evaluation dataset generation.

ChunkingConfig dataclass

Settings for structure-aware text chunking.

CompletionProvider

Bases: Protocol

Minimal LLM interface used by the dataset generator.

complete(prompt)

Return a completion for the prompt.
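Because CompletionProvider is a Protocol, any object with a matching complete method should satisfy it structurally, without subclassing. The sketch below assumes complete takes a string and returns a string, which this reference does not state. CompletionProviderSketch and CannedProvider are hypothetical stand-ins defined locally, not Evret classes.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class CompletionProviderSketch(Protocol):
    """Local stand-in mirroring the documented one-method interface."""

    def complete(self, prompt: str) -> str:  # assumed types, not confirmed
        ...


class CannedProvider:
    """Hypothetical test double: returns a fixed reply and records prompts."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: List[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


provider = CannedProvider("[]")
# Structural typing: CannedProvider never inherits from the protocol.
assert isinstance(provider, CompletionProviderSketch)
```

A double like this can make generator tests deterministic, since no network call or API key is involved.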

DatasetGenerator

Generate diverse retrieval-evaluation examples from documents.

from_provider(provider, *, model=None, api_key=None, temperature=0.2, max_retries=3, chunking_config=None, examples_per_chunk=6, show_progress=True) classmethod

Create a generator using Evret's configured LLM providers.

generate(documents)

Chunk documents and generate a rich evaluation dataset.
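The flow described here (prompt each chunk, then collect the results) might look roughly like the sketch below. It is not Evret's implementation: generate_examples_sketch, the prompt wording, the JSON reply format, and the output keys are all assumptions, and max_retries is interpreted here as "retry when the reply cannot be parsed", which may differ from what from_provider means by it.

```python
import json
from typing import Callable, Dict, List


def generate_examples_sketch(
    chunks: List[str],
    complete: Callable[[str], str],
    *,
    examples_per_chunk: int = 6,
    max_retries: int = 3,
) -> List[Dict[str, object]]:
    """Ask the model for JSON queries per chunk, retrying on unparseable replies."""
    examples: List[Dict[str, object]] = []
    for index, chunk in enumerate(chunks):
        # Hypothetical prompt; the real one comes from build_generation_prompt.
        prompt = (
            f"Write {examples_per_chunk} diverse search queries answered by the "
            f"passage below. Reply with a JSON list of strings.\n\n{chunk}"
        )
        for _attempt in range(max_retries):
            try:
                queries = json.loads(complete(prompt))
            except json.JSONDecodeError:
                continue  # malformed reply: ask again
            if isinstance(queries, list):
                examples.extend(
                    {"query": str(q), "chunk_index": index} for q in queries
                )
                break  # this chunk is done
    return examples
```

A chunk whose replies never parse is skipped silently in this sketch; a real generator would more likely log or raise.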

GeneratedChunk dataclass

Chunk emitted by the dataset-generation chunker.

to_document_example()

Convert to Evret's evaluation document shape.

GeneratedDataset dataclass

Generated dataset with rich examples and Evret-compatible export.

to_dict()

Return a rich JSON-serializable dataset.

to_evaluation_dataset()

Convert to Evret's existing evaluation dataset model.

GeneratedExample dataclass

Rich generated query example before conversion to EvaluationDataset.

to_dict()

Return the rich JSON-serializable example.

to_query_example()

Convert to Evret's evaluation query shape.
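The two conversions differ in how much they keep: to_dict preserves the full rich record, while to_query_example narrows it to what an evaluator needs. The sketch below illustrates that split with a hypothetical dataclass. ExampleSketch and all of its field names are invented for illustration, since the real GeneratedExample fields are not listed in this reference.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ExampleSketch:
    """Hypothetical stand-in for a rich generated example."""

    query: str
    relevant_chunk_ids: List[str]
    # Extra generation metadata that the slimmer evaluation shape drops.
    metadata: Dict[str, str] = field(default_factory=dict)

    def to_dict(self) -> dict:
        # asdict recursively copies every field into plain dicts and lists.
        return asdict(self)

    def to_query_example(self) -> dict:
        # Keep only the query and the ids a retrieval metric scores against.
        return {"query": self.query, "relevant_ids": list(self.relevant_chunk_ids)}
```

Keeping the rich form separate lets the metadata be stored for inspection without changing the evaluation dataset's shape.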

SourceDocument dataclass

Input document for dataset generation.

build_generation_prompt(chunk, *, num_examples=6)

Build a single prompt that asks for num_examples diverse examples from one chunk.

chunk_documents(documents, *, config=None)

Split documents into structure-aware chunks.
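"Structure-aware" is not defined in this reference. One common reading is that splits follow document structure (headings, paragraphs) rather than fixed character offsets. The sketch below shows that idea for Markdown-like text. chunk_text_sketch, ChunkingSketchConfig, and max_chars are hypothetical names, and the behavior is an assumption, not Evret's chunker.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChunkingSketchConfig:
    """Hypothetical settings; the real ChunkingConfig fields are not listed here."""

    max_chars: int = 800


def chunk_text_sketch(
    text: str, config: Optional[ChunkingSketchConfig] = None
) -> List[str]:
    config = config or ChunkingSketchConfig()
    # Split just before each Markdown heading so a section is never merged
    # with its neighbor.
    sections = [s.strip() for s in re.split(r"(?m)^(?=#{1,6} )", text) if s.strip()]
    chunks: List[str] = []
    for section in sections:
        current = ""
        # Within a section, pack whole paragraphs up to the size limit.
        # A single paragraph longer than max_chars is kept whole.
        for paragraph in re.split(r"\n\s*\n", section):
            candidate = f"{current}\n\n{paragraph}" if current else paragraph
            if current and len(candidate) > config.max_chars:
                chunks.append(current)
                current = paragraph
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

Splitting on boundaries like these keeps each chunk readable on its own, which matters when an LLM must write queries answerable from that chunk alone.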