Class: Experiment
An experiment is a collection of logged events, such as model inputs and outputs, which represent a snapshot of your application at a particular point in time. An experiment is meant to capture more than just the model you use, and includes the data you use to test, pre- and post- processing code, comparison metrics (scores), and any other metadata you want to include.
Experiments are associated with a project, and two experiments are meant to be easily comparable via their inputs. You can change the attributes of the experiments in a project (e.g. scoring functions) over time, simply by changing what you log.
You should not create Experiment objects directly. Instead, use the braintrust.init() method.
Constructors
constructor
• new Experiment(project, id, name, dataset?)
Parameters
Name | Type |
---|---|
project | RegisteredProject |
id | string |
name | string |
dataset? | Dataset |
Methods
close
▸ close(): Promise<string>
Finish the experiment and return its id. After calling close, you may not invoke any further methods on the experiment object.
Will be invoked automatically if the experiment is wrapped in a callback passed to braintrust.withExperiment.
Returns
Promise<string>
The experiment id.
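The close contract described above can be illustrated with a small stand-in. This is only a sketch, not the Braintrust implementation: SketchExperiment and withSketchExperiment are hypothetical names, the error message and placeholder return value of log are assumptions, and the real library may enforce the rule differently.

```typescript
// Hypothetical stand-in for illustration only; not the real Experiment class.
class SketchExperiment {
  readonly id: string;
  private closed = false;

  constructor(id: string) {
    this.id = id;
  }

  // Every method checks this first, so nothing works after close().
  private assertOpen(): void {
    if (this.closed) {
      throw new Error("experiment is closed");
    }
  }

  log(_event: { input: unknown; output: unknown }): string {
    this.assertOpen();
    return `${this.id}-event`; // placeholder return value (assumption)
  }

  // Marks the experiment as finished and resolves to its id.
  async close(): Promise<string> {
    this.assertOpen();
    this.closed = true;
    return this.id;
  }
}

// Mirrors the idea of a callback wrapper that closes the experiment for you.
// Assumes the callback does not call close() itself.
async function withSketchExperiment<R>(
  id: string,
  callback: (experiment: SketchExperiment) => R | Promise<R>,
): Promise<R> {
  const experiment = new SketchExperiment(id);
  try {
    return await callback(experiment);
  } finally {
    await experiment.close();
  }
}
```

In this sketch, calling log after close throws, and code run through withSketchExperiment never needs to call close explicitly.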
log
▸ log(event): string
Log a single event to the experiment. The event will be batched and uploaded behind the scenes.
Parameters
Name | Type | Description |
---|---|---|
event | Readonly<ExperimentLogFullArgs> | The event to log. |
Returns
string
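To make "batched and uploaded behind the scenes" concrete, here is a minimal sketch of one way such batching could work. Everything in it is hypothetical: SketchEvent and SketchBatchLogger are invented names, the event fields (input, output, scores, metadata) are inferred from the class description rather than from the ExperimentLogFullArgs type, and the returned string is assumed to be an identifier for the logged row, which this page does not state.

```typescript
// Hypothetical event shape; the real ExperimentLogFullArgs type may differ.
interface SketchEvent {
  input: unknown;
  output: unknown;
  scores?: Record<string, number>;
  metadata?: Record<string, unknown>;
}

type StoredEvent = SketchEvent & { id: string };

// Stand-in that queues events and "uploads" them in batches.
class SketchBatchLogger {
  readonly uploaded: StoredEvent[][] = [];
  private pending: StoredEvent[] = [];
  private nextId = 0;
  private readonly batchSize: number;

  constructor(batchSize: number) {
    this.batchSize = batchSize;
  }

  // Returns immediately with an id; the upload happens once a batch fills.
  log(event: Readonly<SketchEvent>): string {
    const id = `row-${this.nextId++}`;
    this.pending.push({ ...event, id });
    if (this.pending.length >= this.batchSize) {
      this.flush();
    }
    return id;
  }

  // Stands in for the network upload: moves queued events into one batch.
  flush(): void {
    if (this.pending.length === 0) return;
    this.uploaded.push(this.pending);
    this.pending = [];
  }
}

const logger = new SketchBatchLogger(2);
const firstId = logger.log({ input: "2+2", output: "4", scores: { exact: 1 } });
logger.log({ input: "3+3", output: "7", scores: { exact: 0 } });
```

The point of the design is that log stays synchronous and cheap for the caller, while the cost of uploading is paid once per batch rather than once per event.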
startSpan
▸ startSpan(args?): Span
Create a new top-level span. The name parameter is optional and defaults to "root".
See Span.startSpan for full details.
Parameters
Name | Type |
---|---|
args? | StartSpanOptionalNameArgs |
Returns
Span
summarize
▸ summarize(options?): Promise<ExperimentSummary>
Summarize the experiment, including the scores (compared to the closest reference experiment) and metadata.
Parameters
Name | Type | Description |
---|---|---|
options | Object | Options for summarizing the experiment. |
options.comparisonExperimentId? | string | The experiment to compare against. If omitted, the most recent experiment on the origin's main branch will be used. |
options.summarizeScores? | boolean | Whether to summarize the scores. If false, only the metadata will be returned. |
Returns
Promise<ExperimentSummary>
A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
traced
▸ traced<R>(callback, args?): R
Wrapper over Experiment.startSpan, which passes the initialized Span to the given callback and ends it afterwards. See Span.traced for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
callback | (span: Span) => R |
args? | StartSpanArgs & { name?: string } & SetCurrentArg |
Returns
R
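The relationship between startSpan and traced can be sketched with stand-in types. This is an illustration under assumptions, not the library's code: SketchSpan and SketchTracer are hypothetical, the end method name is assumed from the phrase "ends it afterwards", and only the optional name argument is modelled.

```typescript
// Hypothetical stand-in for a span; the real Span type has more to it.
class SketchSpan {
  readonly name: string;
  ended = false;

  constructor(name: string) {
    this.name = name;
  }

  end(): void {
    this.ended = true;
  }
}

class SketchTracer {
  readonly spans: SketchSpan[] = [];

  // The name is optional and falls back to "root", as described for startSpan.
  startSpan(args?: { name?: string }): SketchSpan {
    const span = new SketchSpan(args?.name ?? "root");
    this.spans.push(span);
    return span;
  }

  // Start a span, hand it to the callback, and end it even if the callback throws.
  traced<R>(callback: (span: SketchSpan) => R, args?: { name?: string }): R {
    const span = this.startSpan(args);
    try {
      return callback(span);
    } finally {
      span.end();
    }
  }
}

const tracer = new SketchTracer();
const answer = tracer.traced((span) => `ran in ${span.name}`);
```

Using try/finally means the span is ended on both the success and the error path. Note that this sketch ends the span as soon as the callback returns, so a callback that returns a Promise would have its span ended before the Promise settles; how the real traced treats asynchronous callbacks is covered by Span.traced, not by this sketch.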
Properties
dataset
• Optional Readonly dataset: Dataset
id
• Readonly id: string
kind
• kind: "experiment"
name
• Readonly name: string
project
• Readonly project: RegisteredProject