List BenchmarkRuns.
List all BenchmarkRuns matching filter.
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env['RUNLOOP_API_KEY'], // This is the default and can be omitted
});

async function main() {
  // Automatically fetches more pages as needed.
  for await (const benchmarkRunView of client.benchmarks.runs.list()) {
    console.log(benchmarkRunView.id);
  }
}

main();
{
  "runs": [
    {
      "id": "<string>",
      "benchmark_id": "<string>",
      "name": "<string>",
      "start_time_ms": 123,
      "duration_ms": 123,
      "state": "running",
      "score": 123,
      "pending_scenarios": [
        "<string>"
      ],
      "scenario_runs": [
        {
          "scenario_id": "<string>",
          "scenarioRunId": "<string>",
          "scoringResult": {
            "score": 123,
            "scoring_function_results": [
              {
                "score": 123,
                "scoring_function_name": "<string>",
                "output": "<string>",
                "state": "unknown"
              }
            ]
          }
        }
      ],
      "metadata": {}
    }
  ],
  "has_more": true,
  "total_count": 123,
  "remaining_count": 123
}
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Query Parameters
The maximum number of items to return. Default is 20.
Load the next page of data starting after the item with the given ID.
The Benchmark ID to filter by.
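The SDK iterator in the request example fetches pages automatically. For readers calling the endpoint directly, the cursor behaviour described here can be sketched roughly as follows. This is a minimal illustration, not the client library's implementation: the parameter names `limit`, `starting_after`, and `benchmark_id` are assumptions inferred from the descriptions above, `listAllRuns` and `makeFakeEndpoint` are hypothetical helpers, and the in-memory endpoint stands in for the real API so the sketch runs without network access. Only `runs` and `has_more` are taken from the example response.

```typescript
// One page of results, using field names from the example response.
interface RunsPage<T extends { id: string }> {
  runs: T[];
  has_more: boolean;
}

// Hypothetical query parameters; these names are assumptions for illustration.
interface ListParams {
  limit?: number;
  starting_after?: string;
  benchmark_id?: string;
}

type FetchPage<T extends { id: string }> = (params: ListParams) => Promise<RunsPage<T>>;

// Collects every run by passing the last seen ID as the cursor for the next page.
async function listAllRuns<T extends { id: string }>(
  fetchPage: FetchPage<T>,
  params: ListParams = {},
): Promise<T[]> {
  const all: T[] = [];
  let cursor = params.starting_after;
  while (true) {
    const page = await fetchPage({ ...params, starting_after: cursor });
    all.push(...page.runs);
    // Stop when the server reports no more data (or returns an empty page).
    if (!page.has_more || page.runs.length === 0) break;
    cursor = page.runs[page.runs.length - 1].id;
  }
  return all;
}

// In-memory stand-in for the endpoint (hypothetical), paging over a fixed list of IDs.
function makeFakeEndpoint(ids: string[]): FetchPage<{ id: string }> {
  return async ({ limit = 20, starting_after }) => {
    const start = starting_after ? ids.indexOf(starting_after) + 1 : 0;
    const slice = ids.slice(start, start + limit);
    return { runs: slice.map((id) => ({ id })), has_more: start + limit < ids.length };
  };
}
```

Passing a real request function in place of `makeFakeEndpoint(...)` would keep the loop unchanged; the loop only relies on `has_more` and the ID of the last item on each page.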
Response
List of BenchmarkRuns matching filter.
A BenchmarkRunView represents a run of a complete set of Scenarios, organized under a Benchmark.
The ID of the BenchmarkRun.
The ID of the Benchmark.
The time the benchmark run execution started (Unix timestamp milliseconds).
The state of the BenchmarkRun. One of: running, completed.
List of Scenarios that need to be completed before the benchmark run can be completed.
List of Scenarios that have been completed.
A BenchmarkRunScenariosListView represents a run of a complete set of Scenarios run, organized under a Benchmark.
ID of the Scenario that has been run.
The scoring result of the ScenarioRun.
Total score for all scoring contracts. This will be a value between 0 and 1.
List of all individual scoring function results.
A ScoringFunctionResultView represents the result of running a single scoring function on a given input context.
Final score for the given scoring function.
Scoring function name that ran.
Log output of the scoring function.
The state of the scoring function application. One of: unknown, complete, error.
ID of the scenario run.
User defined metadata to attach to the benchmark run for organization.
The name of the BenchmarkRun.
The time the BenchmarkRun took to complete, in milliseconds.
The final score across the BenchmarkRun, present once completed. Calculated as the sum of scenario scores divided by the number of scenario runs.
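As a rough illustration of the score calculation described above, the following sketch types the `scenario_runs` entries using the field names from the example response and averages their scores. The helper names `benchmarkScore` and `erroredScoringFunctions` are hypothetical, and two behaviours are assumptions rather than documented facts: a scenario run without a `scoringResult` is counted as 0, and an empty list yields `undefined`.

```typescript
// Field names follow the example response; the union of states follows the listed options.
interface ScoringFunctionResult {
  score: number;
  scoring_function_name: string;
  output: string;
  state: 'unknown' | 'complete' | 'error';
}

interface ScenarioRunEntry {
  scenario_id: string;
  scenarioRunId: string;
  // Treated as optional here (an assumption) so unscored runs can be represented.
  scoringResult?: {
    score: number;
    scoring_function_results: ScoringFunctionResult[];
  };
}

// Sum of scenario scores divided by the number of scenario runs.
// Assumption: a run with no scoring result contributes 0 to the sum.
function benchmarkScore(scenarioRuns: ScenarioRunEntry[]): number | undefined {
  if (scenarioRuns.length === 0) return undefined;
  let total = 0;
  for (const run of scenarioRuns) {
    total += run.scoringResult?.score ?? 0;
  }
  return total / scenarioRuns.length;
}

// Names of scoring functions whose state is "error", across all scenario runs.
function erroredScoringFunctions(scenarioRuns: ScenarioRunEntry[]): string[] {
  const names: string[] = [];
  for (const run of scenarioRuns) {
    for (const result of run.scoringResult?.scoring_function_results ?? []) {
      if (result.state === 'error') names.push(result.scoring_function_name);
    }
  }
  return names;
}
```

Checking for errored scoring functions alongside the average can help distinguish a genuinely low score from one caused by a scorer that failed to run.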