Evaluations API
This document outlines the API endpoints for managing evaluations in PySpur.
List Available Evaluations
Description: Lists all available evaluations by scanning the tasks directory for YAML files. Returns metadata about each evaluation including name, description, type, and number of samples.
URL: /evals/
Method: GET
Response Schema:
Each dictionary in the list contains the evaluation's name, description, type, and number of samples.
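As an illustrative sketch, a response body might look like the following. The field names mirror the metadata listed above; the concrete values and the `type` vocabulary are assumptions, not part of the API contract.

```python
import json

# Hypothetical response body from GET /evals/ (values are illustrative).
sample_response = """
[
  {
    "name": "gsm8k",
    "description": "Grade-school math word problems",
    "type": "numeric",
    "num_samples": 100
  }
]
"""

evals = json.loads(sample_response)
for ev in evals:
    print(f"{ev['name']}: {ev['num_samples']} samples")
```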
Launch Evaluation
Description: Launches an evaluation job by triggering the evaluator with the specified evaluation configuration. The evaluation is run asynchronously in the background.
URL: /evals/launch/
Method: POST
Request Payload:
Response Schema:
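A minimal sketch of building a launch request follows. The payload field names (`eval_name`, `num_samples`) and the response fields (`run_id`, `status`) are assumptions for illustration only; consult the request and response schemas for the actual contract.

```python
import json

# Hypothetical launch payload for POST /evals/launch/.
# Field names here are assumptions, not confirmed by this document.
payload = {
    "eval_name": "gsm8k",  # which evaluation YAML to run (assumed field)
    "num_samples": 50,     # how many samples to evaluate (assumed field)
}
body = json.dumps(payload)

# A hypothetical acknowledgement for the background job:
response = json.loads('{"run_id": "eval_run_123", "status": "PENDING"}')
print(response["run_id"], response["status"])
```

Because the evaluation runs asynchronously, a client should expect an immediate acknowledgement like the one above and then poll the run-status endpoint for results.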
Get Evaluation Run Status
Description: Gets the status of a specific evaluation run, including results if the evaluation has completed.
URL: /evals/runs/{eval_run_id}
Method: GET
Parameters: eval_run_id (path) — the ID of the evaluation run to look up.
Response Schema:
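Since the evaluation runs in the background, a typical client polls this endpoint until the run finishes. The sketch below stubs out the HTTP call with canned responses; the status values and the `results` field name are assumptions for illustration.

```python
import json

# Canned responses standing in for successive GET /evals/runs/{eval_run_id}
# calls. Status values (RUNNING/COMPLETED) are assumed for illustration.
_responses = iter([
    '{"run_id": "eval_run_123", "status": "RUNNING", "results": null}',
    '{"run_id": "eval_run_123", "status": "COMPLETED",'
    ' "results": {"accuracy": 0.82}}',
])

def fetch_run_status(eval_run_id: str) -> dict:
    """Stand-in for an HTTP GET to /evals/runs/{eval_run_id}."""
    return json.loads(next(_responses))

# Poll until the run reports completion, then read its results.
run = fetch_run_status("eval_run_123")
while run["status"] != "COMPLETED":
    run = fetch_run_status("eval_run_123")

print(run["results"]["accuracy"])
```

A real client would add a sleep between polls and a timeout or retry limit rather than looping unconditionally.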
List Evaluation Runs
Description: Lists all evaluation runs, ordered by start time descending.
URL: /evals/runs/
Method: GET
Response Schema:
Where EvalRunResponse contains:
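The descending start-time ordering can be sketched as follows. The fields shown inside each EvalRunResponse entry (`run_id`, `status`, `start_time`) are assumptions for illustration; only the ordering guarantee comes from the description above.

```python
import json

# Hypothetical response body from GET /evals/runs/ (field names assumed).
runs = json.loads("""
[
  {"run_id": "eval_run_2", "status": "RUNNING",
   "start_time": "2024-05-02T10:00:00Z"},
  {"run_id": "eval_run_1", "status": "COMPLETED",
   "start_time": "2024-05-01T09:00:00Z"}
]
""")

# Runs are ordered by start time descending, so the first entry is newest.
assert runs[0]["start_time"] > runs[1]["start_time"]
print(runs[0]["run_id"])
```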