What JSONL format does it produce?

It follows OpenAI's chat fine-tuning schema: one JSON object per line, each with a messages array containing an optional system turn followed by user and assistant turns. You can paste it into a .jsonl file and upload it to the OpenAI fine-tuning API.

Where does my API key go?

It stays in your browser tab and is sent directly to OpenAI or Anthropic with the request you trigger. It is never stored, logged, or routed through any Gera server, and refreshing the tab clears it.

Can I trust synthetic training data?

Not blindly. Model-generated data is a fast starting point but needs human review for correctness, bias, and diversity. Always curate the pairs and remove anything wrong before fine-tuning a production model.

How many pairs should I generate?

This tool produces small batches (5-30) so you can inspect quality cheaply. Real fine-tuning usually wants dozens to hundreds of examples — generate in batches, curate, and combine them into a larger dataset.

Who pays for the API calls?

You do, on your own provider account. Each batch is one real API call billed at your usage rate, and larger batches cost more tokens. The tool itself is free.

Fine-Tune Training Data Builder (BYO Key)

Fine-tuning lives or dies on its dataset, and assembling one by hand is slow. This tool bootstraps a dataset — describe the behaviour you want to teach and it generates diverse prompt/completion pairs formatted as OpenAI chat fine-tuning JSONL, using your own OpenAI or Anthropic key, entirely in your browser.

How it works

Choose a provider and model, paste your API key, describe the task, and optionally add a reference example, a system prompt for the eventual fine-tuned model, and the number of pairs you want. The tool asks the model to produce diverse, realistic examples (varying phrasing, length, and edge cases) and to return strict JSON. The response is parsed and shape-checked in the browser, then assembled into JSONL where each line is {"messages":[{system},{user},{assistant}]}. Each batch is a single direct request to the provider.
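
The parse, shape-check, and assembly steps described above can be sketched roughly as follows. This is a minimal Python illustration, not the tool's actual browser code. The function name build_jsonl and the intermediate response shape (a JSON array of objects with "user" and "assistant" string fields) are assumptions made for the example.

```python
import json


def build_jsonl(raw_response: str, system_prompt: str = "") -> str:
    """Turn a model's JSON reply into chat fine-tuning JSONL.

    Assumption: the model was asked to return a JSON array of objects,
    each with string fields "user" and "assistant". The real tool's
    intermediate shape may differ.
    """
    pairs = json.loads(raw_response)  # raises ValueError on invalid JSON
    if not isinstance(pairs, list):
        raise ValueError("expected a JSON array of pairs")

    lines = []
    for pair in pairs:
        # Shape check: skip anything that is not a well-formed pair.
        if not isinstance(pair, dict):
            continue
        user, assistant = pair.get("user"), pair.get("assistant")
        if not (isinstance(user, str) and user.strip()):
            continue
        if not (isinstance(assistant, str) and assistant.strip()):
            continue

        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": assistant})
        # One compact JSON object per line, as JSONL requires.
        lines.append(json.dumps({"messages": messages}, ensure_ascii=False))
    return "\n".join(lines)
```

In this sketch malformed pairs are silently dropped. A stricter variant could raise instead, so that a bad batch is noticed rather than quietly shrunk.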

For Anthropic, the request includes the official direct-browser-access header (anthropic-dangerous-direct-browser-access: true) so it works straight from the page.

The JSONL format explained

OpenAI’s chat fine-tuning format expects one JSON object per line. Each object has a single key, messages, containing an array of message turns. A typical training pair looks like this (wrapped across several lines here for readability; in the actual file each object sits on a single line):

{"messages": [
  {"role": "system", "content": "You are a helpful customer support agent for..."},
  {"role": "user", "content": "How do I reset my password?"},
  {"role": "assistant", "content": "To reset your password, go to..."}
]}

The system turn is optional but strongly recommended: it teaches the model the persona and constraints it should apply at inference time. If you will always call the fine-tuned model with a specific system prompt, include that exact text here so the training data matches the production call.
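
A quick way to catch malformed lines before uploading is to check each one against the shape described above. The sketch below is a hypothetical helper (check_line is not part of the tool or of any OpenAI library) and it only covers the three roles this page uses; the provider's full format accepts more than this.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def check_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        obj = json.loads(line)
    except ValueError:
        return ["not valid JSON"]
    messages = obj.get("messages") if isinstance(obj, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' array"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"turn {i}: not an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"turn {i}: unexpected role {role!r}")
        if role == "system" and i != 0:
            problems.append(f"turn {i}: system turn should come first")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"turn {i}: empty or non-string content")
    # A training example should end with the reply the model is meant to learn.
    last = messages[-1]
    if not (isinstance(last, dict) and last.get("role") == "assistant"):
        problems.append("last turn should be the assistant reply")
    return problems
```

Running it over every line of a file and printing any non-empty result gives a quick pre-upload sanity pass.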

What makes a good synthetic dataset

Diversity of phrasing is the most important property. If every “user” turn asks the same question with the same words, the fine-tuned model learns to respond to that exact phrasing and fails on paraphrases. The tool instructs the model to vary sentence structure, vocabulary, and question format across examples.
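
One rough way to spot low diversity is to compare the user turns against each other and flag near-identical ones. The sketch below uses Python's standard difflib for this. The helper name near_duplicate_prompts and the 0.9 threshold are arbitrary choices for illustration, and character-level similarity only catches surface repeats, not paraphrases that ask the same thing in different words.

```python
import json
from difflib import SequenceMatcher


def near_duplicate_prompts(jsonl_text: str, threshold: float = 0.9):
    """Return (i, j, ratio) for examples whose user turns look alike.

    Indices are 0-based and count non-blank lines only.
    """
    prompts = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        messages = json.loads(line)["messages"]
        # Use the first user turn; normalise case and whitespace.
        user = next(m["content"] for m in messages if m["role"] == "user")
        prompts.append(" ".join(user.lower().split()))

    flagged = []
    for i in range(len(prompts)):
        for j in range(i + 1, len(prompts)):
            ratio = SequenceMatcher(None, prompts[i], prompts[j]).ratio()
            if ratio >= threshold:
                flagged.append((i, j, round(ratio, 2)))
    return flagged
```

The pairwise comparison is quadratic, which is fine for batches of a few dozen pairs but would need a different approach for very large datasets.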

Difficulty spread. Include easy, direct questions alongside ambiguous or edge-case ones. A dataset with only clean examples produces a model that works well on clean inputs and fails on anything slightly unusual.

Coverage of your error cases. If you know your base model makes a specific mistake — over-hedging, wrong format, missing a key piece of information — include pairs that show the correct behaviour in exactly those situations.

Building a real dataset

Reference examples anchor the style: even one good pair tends to raise quality noticeably.
The system prompt is baked into every JSONL line, so training matches how you will actually call the model.
Batch and curate: generate small batches, delete the weak pairs, and stack the good ones.
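
The batch-and-curate loop can be approximated with a small merge step once you have reviewed each batch. The sketch below is an illustration under assumptions: combine_batches is a hypothetical helper, and it identifies a pair by the exact text of its user turn, which is only one of several reasonable choices.

```python
import json


def combine_batches(batches, rejected_prompts=()):
    """Merge several JSONL batches, dropping rejected and repeated prompts.

    batches: iterable of JSONL strings, one per generated batch.
    rejected_prompts: user-turn texts a reviewer marked as weak.
    """
    rejected = set(rejected_prompts)
    seen = set()
    kept = []
    for batch in batches:
        for line in batch.splitlines():
            if not line.strip():
                continue
            messages = json.loads(line)["messages"]
            user = next(m["content"] for m in messages if m["role"] == "user")
            # Keep only the first occurrence of each accepted prompt.
            if user in rejected or user in seen:
                continue
            seen.add(user)
            kept.append(line.strip())
    # End with a newline so the result can be written straight to a .jsonl file.
    return "\n".join(kept) + ("\n" if kept else "")
```

The returned string can be saved as a single .jsonl file; later batches simply go through the same function again together with the earlier ones.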

Tips

Always read every pair before training; synthetic data can introduce subtle errors that fine-tuning will faithfully memorise.
Mix in real, hand-written examples for the cases that matter most.
Keep the system prompt here identical to the one you will use at inference time, or the fine-tune will be mismatched.
Do not treat output count as a proxy for quality: twenty well-reviewed pairs can outperform two hundred unreviewed synthetic ones.
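
The system-prompt tip is easy to verify mechanically before training. A minimal sketch, assuming the dataset is a JSONL string and that you have the exact inference-time prompt at hand; mismatched_system_lines is a hypothetical helper name, not something the tool provides.

```python
import json


def mismatched_system_lines(jsonl_text: str, inference_prompt: str) -> list[int]:
    """Return 1-based line numbers whose system turn differs from inference_prompt."""
    bad = []
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        messages = json.loads(line)["messages"]
        system = [m["content"] for m in messages if m["role"] == "system"]
        # Exact comparison: a missing system turn, or even a stray
        # trailing space, counts as a mismatch.
        if system != [inference_prompt]:
            bad.append(number)
    return bad
```

An empty result means every example carries exactly the prompt you plan to send in production.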