braintrust
An isomorphic JS library for logging data to Braintrust. braintrust is distributed as a library on NPM.
Quickstart
Install the library with npm (or yarn).
npm install braintrust
Then, run a simple experiment with the following code (replace YOUR_API_KEY with your Braintrust API key):
import * as braintrust from "braintrust";

const experiment = await braintrust.init("NodeTest", {
  apiKey: "YOUR_API_KEY",
});
experiment.log({
  inputs: { test: 1 },
  output: "foo",
  expected: "bar",
  scores: {
    n: 0.5,
  },
  metadata: {
    id: 1,
  },
});
console.log(await experiment.summarize());
Classes
Interfaces
- DataSummary
- DatasetRecord
- DatasetSummary
- EvalMetadata
- Evaluator
- ExperimentSummary
- LogOptions
- ScoreSummary
- Span
Functions
Eval
▸ Eval<Input, Output, Expected>(name, evaluator): Promise<void | ExperimentSummary>
Type parameters
Name |
---|
Input |
Output |
Expected |
Parameters
Name | Type |
---|---|
name | string |
evaluator | Evaluator<Input, Output, Expected> |
Returns
Promise<void | ExperimentSummary>
_internalGetGlobalState
▸ _internalGetGlobalState(): BraintrustState
Returns
BraintrustState
_internalSetInitialState
▸ _internalSetInitialState(): void
Returns
void
currentExperiment
▸ currentExperiment(): Experiment | undefined
Returns the currently-active experiment (set by braintrust.withExperiment or braintrust.withCurrent). Returns undefined if no current experiment has been set.
Returns
Experiment | undefined
currentLogger
▸ currentLogger(): Logger | undefined
Returns the currently-active logger (set by braintrust.withLogger or braintrust.withCurrent). Returns undefined if no current logger has been set.
Returns
Logger | undefined
currentSpan
▸ currentSpan(): Span
Returns the currently-active span for logging (set by traced or braintrust.withCurrent). If there is no active span, returns a no-op span object, which supports the same interface as spans but does no logging.
See Span for full details.
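The no-op fallback described above can be illustrated with a small, self-contained sketch. The `MiniSpan`, `RecordingSpan`, `noopMiniSpan`, and `currentSpanSketch` names are hypothetical stand-ins invented for this example and are not part of the braintrust API; the point is only that callers can log unconditionally, without checking for undefined.

```typescript
// Hypothetical stand-in for the library's Span interface (not the real type).
interface MiniSpan {
  log(event: Record<string, unknown>): void;
  end(): void;
}

// A span that records what it is given, standing in for a real span.
class RecordingSpan implements MiniSpan {
  readonly events: Record<string, unknown>[] = [];
  ended = false;
  log(event: Record<string, unknown>): void {
    this.events.push(event);
  }
  end(): void {
    this.ended = true;
  }
}

// A no-op span: same methods, but every call is silently ignored.
const noopMiniSpan: MiniSpan = {
  log: () => {},
  end: () => {},
};

// Illustrative lookup: return the active span if there is one, else the no-op.
function currentSpanSketch(active: MiniSpan | undefined): MiniSpan {
  return active ?? noopMiniSpan;
}

// Callers never need an undefined check before logging.
const activeSpan = new RecordingSpan();
currentSpanSketch(activeSpan).log({ output: "foo" });
currentSpanSketch(undefined).log({ output: "dropped" });
```

Only the first call is recorded; the second goes to the no-op object and has no effect.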
Returns
Span
init
▸ init(project, options?): Promise<Experiment>
Log in, and then initialize a new experiment in a specified project. If the project does not exist, it will be created.
Remember to close your experiment when it is finished by calling Experiment.close. We recommend initializing the experiment within a callback (using braintrust.withExperiment) to automatically mark it as current and ensure it is terminated.
Parameters
Name | Type | Description |
---|---|---|
project | string | The name of the project to create the experiment in. |
options | Readonly<InitOptions> | Additional options for configuring init(). |
Returns
Promise<Experiment>
The newly created Experiment.
initDataset
▸ initDataset(project, options?): Promise<Dataset>
Create a new dataset in a specified project. If the project does not exist, it will be created.
Remember to close your dataset when it is finished by calling Dataset.close. We recommend initializing the dataset within a callback (using braintrust.withDataset) to ensure it is terminated.
Parameters
Name | Type | Description |
---|---|---|
project | string | The name of the project to create the dataset in. |
options | Readonly<InitDatasetOptions> | Additional options for configuring initDataset(). |
Returns
Promise<Dataset>
The newly created Dataset.
initLogger
▸ initLogger(options?): Logger
Create a new logger in a specified project. If the project does not exist, it will be created.
Parameters
Name | Type | Description |
---|---|---|
options | Readonly<InitLoggerOptions> | Additional options for configuring initLogger(). |
Returns
Logger
The newly created Logger.
log
▸ log(event): string
Log a single event to the current experiment. The event will be batched and uploaded behind the scenes.
Parameters
Name | Type | Description |
---|---|---|
event | ExperimentLogFullArgs | The event to log. See Experiment.log for full details. |
Returns
string
The id of the logged event.
login
▸ login(options?): Promise<void>
Log into Braintrust. This will prompt you for your API token, which you can find at https://www.braintrustdata.com/app/token. This method is called automatically by init().
Parameters
Name | Type | Description |
---|---|---|
options | Object | Options for configuring login(). |
options.apiKey? | string | The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API key is specified, will prompt the user to login. |
options.apiUrl? | string | The URL of the Braintrust API. Defaults to https://www.braintrustdata.com. |
options.disableCache? | boolean | Do not use cached login information. |
options.forceLogin? | boolean | Log in again, even if you have already logged in (by default, this function exits quickly if you have already logged in). |
options.orgName? | string | (Optional) The name of a specific organization to connect to. This is useful if you belong to multiple organizations. |
Returns
Promise<void>
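The lookup order described for options.apiKey (explicit option, then the BRAINTRUST_API_KEY environment variable, then an interactive prompt) can be sketched as a small pure function. `resolveApiKeySource` and `LoginOptionsSketch` are hypothetical names used only for illustration, and the sketch assumes that an unset value means undefined; how the library treats an empty string is not stated here.

```typescript
// Hypothetical helper illustrating the lookup order; not part of the library.
type KeySource = "option" | "environment" | "prompt";

interface LoginOptionsSketch {
  apiKey?: string;
}

function resolveApiKeySource(
  options: LoginOptionsSketch,
  env: Record<string, string | undefined>,
): { source: KeySource; apiKey?: string } {
  // 1. An explicitly passed apiKey wins.
  if (options.apiKey !== undefined) {
    return { source: "option", apiKey: options.apiKey };
  }
  // 2. Otherwise fall back to the BRAINTRUST_API_KEY environment variable.
  const fromEnv = env["BRAINTRUST_API_KEY"];
  if (fromEnv !== undefined) {
    return { source: "environment", apiKey: fromEnv };
  }
  // 3. With no key available, the user would be prompted to log in.
  return { source: "prompt" };
}

const fromOption = resolveApiKeySource({ apiKey: "key-a" }, { BRAINTRUST_API_KEY: "key-b" });
const fromEnvironment = resolveApiKeySource({}, { BRAINTRUST_API_KEY: "key-b" });
const needsPrompt = resolveApiKeySource({}, {});
```

Passing the environment as a plain object keeps the sketch independent of any particular runtime.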
startSpan
▸ startSpan(args?): Span
Top-level function for starting a span. It checks the following (in precedence order):
- Currently-active span
- Currently-active experiment
- Currently-active logger
and creates a span in the first one that is active. If none of these are active, it returns a no-op span object.
Unless a name is explicitly provided, the name of the span will be the name of the calling function, or "root" if no meaningful name can be determined.
We recommend running spans within a callback (using traced) to automatically mark them as current and ensure they are terminated. If you wish to start a span outside a callback, be sure to terminate it with span.end().
See Span.startSpan for full details.
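The precedence order and the name fallback described above can be sketched as two small functions. All names here (`SpanParent`, `SpanContextSketch`, `pickSpanParent`, `pickSpanName`) are hypothetical and exist only to illustrate the selection logic; they do not reflect how the library is implemented.

```typescript
// Hypothetical stand-ins: anything that can start a child span.
interface SpanParent {
  kind: "span" | "experiment" | "logger";
}

interface SpanContextSketch {
  span?: SpanParent;
  experiment?: SpanParent;
  logger?: SpanParent;
}

// Returns the kind of parent the new span would be created in, or "noop".
function pickSpanParent(ctx: SpanContextSketch): SpanParent["kind"] | "noop" {
  // ?? walks the candidates in precedence order: span, experiment, logger.
  const parent = ctx.span ?? ctx.experiment ?? ctx.logger;
  return parent ? parent.kind : "noop";
}

// Name fallback: explicit name, else the caller's function name, else "root".
function pickSpanName(explicit: string | undefined, callerName: string | undefined): string {
  return explicit ?? (callerName ? callerName : "root");
}

const chosen = pickSpanParent({
  experiment: { kind: "experiment" },
  logger: { kind: "logger" },
});
const fallbackName = pickSpanName(undefined, "");
```

With an experiment and a logger both active, the experiment is chosen because it comes earlier in the precedence list.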
Parameters
Name | Type |
---|---|
args? | StartSpanOptionalNameArgs |
Returns
Span
summarize
▸ summarize(options?): Promise<ExperimentSummary>
Summarize the current experiment, including the scores (compared to the closest reference experiment) and metadata.
Parameters
Name | Type | Description |
---|---|---|
options | Object | Options for summarizing the experiment. |
options.comparisonExperimentId? | string | The experiment to compare against. If not specified, the most recent experiment on the origin's main branch will be used. |
options.summarizeScores? | boolean | Whether to summarize the scores. If false, only the metadata will be returned. |
Returns
Promise<ExperimentSummary>
A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
traced
▸ traced<R>(callback, args?): R
Wrapper over braintrust.startSpan, which passes the initialized Span to the given callback and ends it afterwards. See Span.traced for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
callback | (span: Span) => R |
args? | StartSpanArgs & { name?: string } & SetCurrentArg |
Returns
R
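The start, call, end pattern that traced describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `TimedSpan`, `startTimedSpan`, and `tracedSketch` are hypothetical names, and the sketch assumes that a promise-returning callback should keep the span open until the promise settles.

```typescript
// Hypothetical stand-in for a span that only needs to be ended.
interface TimedSpan {
  end(): void;
}

// Records start and end markers in the given log so the ordering is visible.
function startTimedSpan(log: string[], name: string): TimedSpan {
  log.push(`start:${name}`);
  return {
    end: () => {
      log.push(`end:${name}`);
    },
  };
}

function tracedSketch<R>(log: string[], callback: (span: TimedSpan) => R, name = "root"): R {
  const span = startTimedSpan(log, name);
  let endNow = true;
  try {
    const result = callback(span);
    if (result instanceof Promise) {
      // Asynchronous callback: end the span once the promise settles.
      endNow = false;
      return result.finally(() => span.end()) as unknown as R;
    }
    return result;
  } finally {
    // Synchronous callback (or a synchronous throw): end the span immediately.
    if (endNow) span.end();
  }
}

const traceLog: string[] = [];
const answer = tracedSketch(
  traceLog,
  () => {
    traceLog.push("work");
    return 42;
  },
  "step",
);
```

The finally block is what guarantees the span is ended even when the callback throws.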
withCurrent
▸ withCurrent<R>(object, callback): R
Set the given experiment, logger, or span as current within the given callback and any asynchronous operations created within the callback. The current experiment can be accessed with braintrust.currentExperiment, the current logger with braintrust.currentLogger, and the current span with braintrust.currentSpan.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
object | Span \| Experiment \| Logger |
callback | () => R |
Returns
R
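One way to make a value "current" for a callback and the asynchronous work it starts is Node.js's AsyncLocalStorage. The sketch below assumes a Node.js runtime and is only an illustration of the idea; this reference does not state how braintrust tracks the current object, and `Trackable`, `withCurrentSketch`, and `currentSketch` are hypothetical names.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for an Experiment, Logger, or Span.
interface Trackable {
  name: string;
}

const currentStore = new AsyncLocalStorage<Trackable>();

// Runs the callback with `object` visible as the current value.
function withCurrentSketch<R>(object: Trackable, callback: () => R): R {
  return currentStore.run(object, callback);
}

// Returns the current value, or undefined outside any withCurrentSketch call.
function currentSketch(): Trackable | undefined {
  return currentStore.getStore();
}

// Visible synchronously inside the callback...
const seenInside = withCurrentSketch({ name: "exp-1" }, () => currentSketch()?.name);

// ...and still visible after an await inside the callback.
const seenAfterAwait = withCurrentSketch({ name: "exp-2" }, async () => {
  await Promise.resolve();
  return currentSketch()?.name;
});

// Outside the callback, nothing is current.
const seenOutside = currentSketch();
```

Because the store follows the asynchronous call chain, nested or concurrent callbacks each see their own current value.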
withDataset
▸ withDataset<R>(project, callback, options?): Promise<R>
Wrapper over braintrust.initDataset, which passes the initialized Dataset to the given callback and closes it afterwards. See braintrust.initDataset for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
project | string |
callback | (dataset: Dataset) => R |
options | Readonly<InitDatasetOptions> |
Returns
Promise<R>
withExperiment
▸ withExperiment<R>(project, callback, options?): Promise<R>
Wrapper over braintrust.init, which passes the initialized Experiment to the given callback and closes it afterwards. See braintrust.init for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
project | string |
callback | (experiment: Experiment) => R |
options | Readonly<InitOptions & SetCurrentArg> |
Returns
Promise<R>
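The initialize, call, close pattern shared by withExperiment, withDataset, and withLogger can be sketched generically. This is an illustration only: `Closable` and `withResourceSketch` are hypothetical names, and the assumption that close() returns a promise is made for the sketch rather than taken from this reference.

```typescript
// Hypothetical stand-in for an Experiment, Dataset, or Logger; only close() matters here.
interface Closable {
  close(): Promise<void>;
}

async function withResourceSketch<T extends Closable, R>(
  init: () => Promise<T>,
  callback: (resource: T) => R | Promise<R>,
): Promise<R> {
  const resource = await init();
  try {
    // Await so that asynchronous callbacks finish before the resource is closed.
    return await callback(resource);
  } finally {
    // Runs whether the callback returned or threw.
    await resource.close();
  }
}

const steps: string[] = [];
const demoResult = withResourceSketch(
  async () => ({
    close: async () => {
      steps.push("close");
    },
  }),
  () => {
    steps.push("use");
    return "done";
  },
);
```

The wrapper returns a promise even for a synchronous callback, which matches the Promise<R> return type listed above.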
withLogger
▸ withLogger<R>(callback, options?): Promise<R>
Wrapper over braintrust.initLogger, which passes the initialized Logger to the given callback and closes it afterwards. See braintrust.initLogger for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
callback | (logger: Logger) => R |
options | Readonly<InitLoggerOptions & SetCurrentArg> |
Returns
Promise<R>
wrapOpenAI
▸ wrapOpenAI<T>(openai): T
Wrap an OpenAI object (created with new OpenAI(...)) to add tracing. If Braintrust is not configured, this is a no-op. Currently, this only supports the v4 API.
Type parameters
Name | Type |
---|---|
T | extends object |
Parameters
Name | Type |
---|---|
openai | T |
Returns
T
The wrapped OpenAI object.
wrapOpenAIv4
▸ wrapOpenAIv4<T>(openai): T
Type parameters
Name | Type |
---|---|
T | extends OpenAILike |
Parameters
Name | Type |
---|---|
openai | T |
Returns
T
Type Aliases
EndSpanArgs
Ƭ EndSpanArgs: Object
Type declaration
Name | Type |
---|---|
endTime? | number |
EvalScorerArgs
Ƭ EvalScorerArgs<Input, Output, Expected>: EvalCase<Input, Expected> & { output: Output }
Type parameters
Name |
---|
Input |
Output |
Expected |
EvalTask
Ƭ EvalTask<Input, Output>: ((input: Input, hooks: EvalHooks) => Promise<Output>) | ((input: Input, hooks: EvalHooks) => Output)
Type parameters
Name |
---|
Input |
Output |
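Because an EvalTask may return either a plain value or a promise, a caller can treat both variants uniformly by awaiting the result. The sketch below uses a local copy of the type; EvalHooks is not described in this section, so `EvalHooksSketch` is an assumed empty placeholder, and `EvalTaskSketch` and `runTask` are hypothetical names.

```typescript
// EvalHooks is not described in this section; an open placeholder is assumed.
type EvalHooksSketch = Record<string, unknown>;

// Local copy of the EvalTask shape documented above.
type EvalTaskSketch<Input, Output> =
  | ((input: Input, hooks: EvalHooksSketch) => Promise<Output>)
  | ((input: Input, hooks: EvalHooksSketch) => Output);

// A synchronous task and an asynchronous task with the same input and output types.
const syncTask: EvalTaskSketch<string, number> = (input: string) => input.length;
const asyncTask: EvalTaskSketch<string, number> = async (input: string) => input.length * 2;

// Awaiting the result handles both variants uniformly.
async function runTask<Input, Output>(
  task: EvalTaskSketch<Input, Output>,
  input: Input,
): Promise<Output> {
  return await task(input, {});
}

const taskResults = Promise.all([runTask(syncTask, "abc"), runTask(asyncTask, "abc")]);
```

Awaiting a non-promise value simply yields that value, so no branching on the task's kind is needed.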
ExperimentLogFullArgs
Ƭ ExperimentLogFullArgs: Partial<Omit<OtherExperimentLogFields, "scores">> & Required<Pick<OtherExperimentLogFields, "scores">> & Partial<InputField | InputsField> & Partial<IdField>
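Read together, the utility types in ExperimentLogFullArgs make scores the only required field, with every other field optional. The sketch below reproduces the aliases locally from the type declarations in this section so that it stands alone; it does not import the library, and the event values are made up for illustration.

```typescript
// Local copies of the type aliases documented in this section.
type IdField = { id: string };
type InputField = { input: unknown };
type InputsField = { inputs: unknown };
type OtherExperimentLogFields = {
  datasetRecordId: string;
  expected: unknown;
  metadata: Record<string, unknown>;
  metrics: Record<string, unknown>;
  output: unknown;
  scores: Record<string, number>;
};

type ExperimentLogFullArgs = Partial<Omit<OtherExperimentLogFields, "scores">> &
  Required<Pick<OtherExperimentLogFields, "scores">> &
  Partial<InputField | InputsField> &
  Partial<IdField>;

// Only scores is mandatory; everything else may be omitted.
const minimalEvent: ExperimentLogFullArgs = { scores: { accuracy: 1 } };

// A fuller event using the optional fields.
const fullEvent: ExperimentLogFullArgs = {
  input: { question: "2 + 2" },
  output: "4",
  expected: "4",
  scores: { accuracy: 1 },
  metadata: { id: 1 },
};

// Omitting scores is a compile-time error:
// const invalid: ExperimentLogFullArgs = { output: "4" };
```

ExperimentLogPartialArgs drops the Required wrapper, so there scores is optional as well.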
ExperimentLogPartialArgs
Ƭ ExperimentLogPartialArgs: Partial<OtherExperimentLogFields> & Partial<InputField | InputsField>
IdField
Ƭ IdField: Object
Type declaration
Name | Type |
---|---|
id | string |
InitOptions
Ƭ InitOptions: Object
Type declaration
Name | Type |
---|---|
apiKey? | string |
apiUrl? | string |
baseExperiment? | string |
dataset? | Dataset |
description? | string |
disableCache? | boolean |
experiment? | string |
isPublic? | boolean |
orgName? | string |
update? | boolean |
InputField
Ƭ InputField: Object
Type declaration
Name | Type |
---|---|
input | unknown |
InputsField
Ƭ InputsField: Object
Type declaration
Name | Type |
---|---|
inputs | unknown |
OtherExperimentLogFields
Ƭ OtherExperimentLogFields: Object
Type declaration
Name | Type |
---|---|
datasetRecordId | string |
expected | unknown |
metadata | Record<string, unknown> |
metrics | Record<string, unknown> |
output | unknown |
scores | Record<string, number> |
SetCurrentArg
Ƭ SetCurrentArg: Object
Type declaration
Name | Type |
---|---|
setCurrent? | boolean |
StartSpanArgs
Ƭ StartSpanArgs: Object
Type declaration
Name | Type |
---|---|
event? | StartSpanEventArgs |
spanAttributes? | Record<any, any> |
startTime? | number |
StartSpanOptionalNameArgs
Ƭ StartSpanOptionalNameArgs: StartSpanArgs & { name?: string }
Variables
noopSpan
• Const noopSpan: NoopSpan