Build a clean fine-tuning dataset, line by line
Fine-tuning a model means feeding it well-formed training examples in JSONL
format. This builder lets you add prompt/response examples through a simple form,
validates each one as you type, and exports a clean .jsonl file with one valid
JSON object per line — ready to upload.
How the chat format works
OpenAI’s fine-tuning API expects each line to be a JSON object containing a
messages array, mirroring how you call the chat API at inference time:
{"messages":[{"role":"system","content":"You are a terse assistant."},{"role":"user","content":"Capital of France?"},{"role":"assistant","content":"Paris."}]}
- The optional system message sets behaviour and should match what you will use in production.
- The user message is the input, and the assistant message is the ideal output the model should learn to produce.
- Each line is a complete, independent example: the file has no enclosing array and no commas between lines.
The builder assembles this structure for every example and escapes the JSON correctly, so you never have to hand-edit fragile quoting.
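To make the assembly step concrete, here is a minimal Python sketch of how one example could be turned into a single JSONL line. It is not the builder's actual code; `build_example` and its parameters are hypothetical names chosen for illustration. The point it shows is that a standard JSON serializer handles quotes and newlines inside the text, so each example stays on one physical line.

```python
import json


def build_example(user, assistant, system=None):
    """Return one JSONL line (no trailing newline) for a single example.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    messages = []
    if system:  # the system message is optional
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user})
    messages.append({"role": "assistant", "content": assistant})
    # json.dumps escapes quotes and newlines inside the strings, so the
    # result is always a single line; compact separators keep it short.
    return json.dumps({"messages": messages}, ensure_ascii=False, separators=(",", ":"))


# A user message containing quotes and a line break still yields one line.
line = build_example(
    'He said "hi"\nthen left.',
    "Noted.",
    system="You are a terse assistant.",
)
print(line)
```

Writing a dataset is then a matter of joining such lines with newline characters, with no surrounding brackets and no commas between them.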
Tips for good training data
- Be consistent. Use the same system prompt across examples that share the same task so the model learns one behaviour, not many.
- Show, don’t tell. Demonstrate the exact style and length you want in the assistant responses rather than describing it.
- Cover the edges. Include the tricky and ambiguous inputs you expect in production, not just the easy ones.
- Keep it clean. Validate before uploading, because a single malformed line can cause an entire fine-tuning job to fail.
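The "keep it clean" tip can be sketched as a small pre-upload check. The Python below is one possible approach, not the builder's own validator and not the API's full rule set: the function names are hypothetical, and the specific checks (known roles, non-empty string content, at least one user message, assistant message last) are assumptions about what a minimal sanity pass might cover.

```python
import json

# Roles used in the chat format described above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_line(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]
    messages = obj.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' array"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not an object")
            continue
        role = msg.get("role")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"message {i} has an unknown role: {role!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i} has empty or non-string content")

    roles = [m.get("role") for m in messages if isinstance(m, dict)]
    if "user" not in roles:
        problems.append("no user message")
    if not roles or roles[-1] != "assistant":
        problems.append("last message should come from the assistant")
    return problems


def validate_jsonl(text):
    """Map 1-based line numbers to their problems; clean lines are omitted."""
    report = {}
    for number, line in enumerate(text.splitlines(), start=1):
        problems = validate_line(line)
        if problems:
            report[number] = problems
    return report


good = '{"messages":[{"role":"user","content":"Capital of France?"},{"role":"assistant","content":"Paris."}]}'
bad = '{"messages":[{"role":"user","content":"Capital of France?"}'  # truncated line
for number, problems in validate_jsonl(good + "\n" + bad).items():
    print(f"line {number}: {'; '.join(problems)}")
```

An empty report means every line parsed and passed these checks. Blank lines are reported as invalid JSON, which is deliberate here since each line is expected to hold exactly one object.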