Class: Dataset

A dataset is a collection of records, such as model inputs and outputs, which represent data you can use to evaluate and fine-tune models. You can log production data to datasets, curate them with interesting examples, edit/delete records, and run evaluations against them.

You should not create Dataset objects directly. Instead, use the braintrust.initDataset() method.

Constructors

constructor

• new Dataset(project, id, name, pinnedVersion?)

Parameters

Name	Type
`project`	`RegisteredProject`
`id`	`string`
`name`	`string`
`pinnedVersion?`	`string`

Methods

[asyncIterator]

▸ [asyncIterator](): AsyncGenerator<DatasetRecord, any, unknown>

Fetch all records in the dataset.

Returns

AsyncGenerator<DatasetRecord, any, unknown>

Example

// Use an async iterator to fetch all records in the dataset.
for await (const record of dataset) {
  console.log(record);
}

clearCache

▸ clearCache(): void

Returns

void

close

▸ close(): Promise<string>

Terminate connection to the dataset and return its id. After calling close, you may not invoke any further methods on the dataset object.

Will be invoked automatically if the dataset is bound as a context manager.

Returns

Promise<string>

The dataset id.
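
Since no further methods may be called after close, one way to guarantee that close runs exactly once, even when the work in between throws, is a try/finally wrapper. The sketch below is self-contained and assumes nothing about Braintrust beyond the documented close(): Promise<string> signature. `Closable`, `withClosable`, and `stub` are hypothetical names introduced for illustration, not part of the SDK.

```typescript
// Hypothetical minimal interface: only the documented close() signature.
interface Closable {
  close(): Promise<string>;
}

// Run `work` against a resource and always close it afterwards,
// whether `work` resolves or throws.
async function withClosable<R extends Closable, T>(
  resource: R,
  work: (resource: R) => Promise<T>,
): Promise<T> {
  try {
    return await work(resource);
  } finally {
    await resource.close();
  }
}

// Stand-in object used so the snippet runs without a real dataset.
const stub = {
  closed: false,
  async close(): Promise<string> {
    this.closed = true;
    return "dataset-id";
  },
};

withClosable(stub, async () => 42).then((value) => {
  console.log(value, stub.closed);
});
```

With a real dataset, the object returned by braintrust.initDataset() would take the place of `stub`.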

delete

▸ delete(id): string

Parameters

Name	Type
`id`	`string`

Returns

string

fetch

▸ fetch(): AsyncGenerator<DatasetRecord, any, unknown>

Fetch all records in the dataset.

Returns

AsyncGenerator<DatasetRecord, any, unknown>

An iterator over the dataset's records.

Example

// Use an async iterator to fetch all records in the dataset.
for await (const record of dataset.fetch()) {
  console.log(record);
}

// You can also iterate over the dataset directly.
for await (const record of dataset) {
  console.log(record);
}
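
Because fetch() returns an async generator, records arrive one at a time, and gathering them into an array takes an explicit loop. The sketch below shows one way to do that, with an optional early stop. It is self-contained: `fakeFetch` and `ExampleRecord` are hypothetical stand-ins for dataset.fetch() and DatasetRecord, and `collect` is an illustrative helper, not an SDK function.

```typescript
// Hypothetical record shape for illustration; the real DatasetRecord type
// is defined by the braintrust package and may contain more fields.
interface ExampleRecord {
  id: string;
  input: unknown;
  output: unknown;
}

// Stand-in for dataset.fetch(): an async generator that yields records.
async function* fakeFetch(): AsyncGenerator<ExampleRecord, void, unknown> {
  yield { id: "a", input: "2+2", output: "4" };
  yield { id: "b", input: "3+3", output: "6" };
}

// Drain any async iterable into an array, optionally stopping early.
async function collect<T>(
  source: AsyncIterable<T>,
  limit: number = Infinity,
): Promise<T[]> {
  const rows: T[] = [];
  if (limit <= 0) return rows;
  for await (const row of source) {
    rows.push(row);
    // Breaking out of for await also closes the underlying generator.
    if (rows.length >= limit) break;
  }
  return rows;
}

collect(fakeFetch()).then((rows) => console.log(rows.length));
```

Since a dataset is itself async-iterable, `collect(dataset)` and `collect(dataset.fetch())` would behave the same way under this sketch.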

fetchedData

▸ fetchedData(): Promise<any[]>

Returns

Promise<any[]>

insert

▸ insert(event): string

Insert a single record into the dataset. The record will be batched and uploaded behind the scenes. If you pass in an id, and a record with that id already exists, it will be overwritten (upsert).

Parameters

Name	Type	Description
`event`	`Object`	The event to log.
`event.id?`	`string`	(Optional) a unique identifier for the event. If you don't provide one, Braintrust will generate one for you.
`event.input?`	`unknown`	The arguments that uniquely define an input case (an arbitrary, JSON-serializable object).
`event.metadata?`	`Record`<`string`, `unknown`>	(Optional) a dictionary with additional data about the test example, model outputs, or just about anything else that's relevant, that you can use to help find and analyze examples later. For example, you could log the `prompt`, example's `id`, or anything else that would be useful to slice/dice later. The values in `metadata` can be any JSON-serializable type, but its keys must be strings.
`event.output`	`unknown`	The output of your application, including post-processing (an arbitrary, JSON-serializable object).

Returns

string

The id of the logged record.
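
The upsert behavior described above can be modeled with a small in-memory stand-in. This is a sketch under stated assumptions, not the Braintrust implementation: `InMemoryDataset`, `ExampleEvent`, and the `generated-N` id format are hypothetical, and the real insert() also batches and uploads records, which this model ignores.

```typescript
// Hypothetical event shape mirroring the documented `event` parameter.
interface ExampleEvent {
  id?: string;
  input?: unknown;
  output: unknown;
  metadata?: Record<string, unknown>;
}

// In-memory stand-in, NOT the Braintrust implementation: it only models
// the documented upsert-by-id behavior of insert().
class InMemoryDataset {
  private records = new Map<string, ExampleEvent>();
  private nextId = 0;

  insert(event: ExampleEvent): string {
    // Generate an id when the caller does not supply one.
    const id = event.id ?? `generated-${this.nextId++}`;
    // Map.set overwrites any existing entry with the same id (upsert).
    this.records.set(id, { ...event, id });
    return id;
  }

  get(id: string): ExampleEvent | undefined {
    return this.records.get(id);
  }

  get size(): number {
    return this.records.size;
  }
}

const ds = new InMemoryDataset();
const first = ds.insert({ id: "case-1", input: "2+2", output: "5" });
// Re-inserting with the same id replaces the earlier record.
const second = ds.insert({ id: "case-1", input: "2+2", output: "4" });
// Omitting the id yields a generated one and adds a new record.
const generated = ds.insert({ input: "3+3", output: "6" });
console.log(first === second, ds.size, generated);
```

Supplying your own stable ids makes repeated inserts idempotent, which is useful when the same example may be logged more than once.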

summarize

▸ summarize(options?): Promise<DatasetSummary>

Summarize the dataset, including high-level metrics about its size and other metadata.

Parameters

Name	Type
`options`	`Object`
`options.summarizeData?`	`boolean`

Returns

Promise<DatasetSummary>

A summary of the dataset.

version

▸ version(): Promise<any>

Returns

Promise<any>

Properties

id

• Readonly id: string

name

• Readonly name: string

project

• Readonly project: RegisteredProject
