Fine-tuning lives or dies on its dataset, and assembling one by hand is slow. This tool bootstraps a dataset — describe the behaviour you want to teach and it generates diverse prompt/completion pairs formatted as OpenAI chat fine-tuning JSONL, using your own OpenAI or Anthropic key, entirely in your browser.
How it works
Choose a provider and model, paste your API key, describe the task, and optionally add a reference example, a system prompt for the eventual fine-tuned model, and the number of pairs you want. The tool asks the model to produce diverse, realistic examples — varying phrasing, length, and edge cases — and to return strict JSON. The response is parsed and shape-checked in the browser, then assembled into JSONL where each line is {"messages":[{system},{user},{assistant}]}. It is one direct request to the provider.
For Anthropic, the request includes the official direct-browser-access header so it works straight from the page.
Building a real dataset
- Reference examples anchor the style — even one good pair raises quality sharply.
- System prompt is baked into every JSONL line so training matches how you will actually call the model.
- Batch and curate — generate small batches, delete the weak pairs, and stack the good ones.
Tips
- Always read every pair before training; synthetic data introduces subtle errors that fine-tuning will faithfully memorise.
- Mix in real, hand-written examples for the cases that matter most.
- Keep the system prompt here identical to the one you will use at inference time, or the fine-tune will be mismatched.