Blog
Docs
Pricing
Careers
Contact
Discord
Sign in

Alpaca Evals

In collaboration with the Alpaca team, we've loaded several submissions from the Alpaca leaderboard into Braintrust, where you can not only see the aggregated performance but also dig into individual models to better understand their strengths and weaknesses.

Check out the Alpaca Evals project on Braintrust to dig in further; no login is required.

Alpaca Example

