Stable Cascade Configuration Guide

Optimize Stable Cascade Stage C and Stage B settings for quality and speed

Stable Cascade configuration

Stable Cascade (the Würstchen v3 architecture) splits generation across three stages working in a heavily compressed latent space. Stage C is the text-conditioned prior doing the creative work, Stage B decodes it into a larger latent, and Stage A is a tiny VAE producing pixels. Most of your tuning happens on Stage C and Stage B, and this guide picks sensible values for your quality target and VRAM.

How it works

Because Stage C operates on a latent compressed roughly 42x per side (a 1024x1024 image becomes about 24x24), it needs surprisingly few steps and a low guidance scale to produce strong results. Stage B mostly decodes, so it needs even fewer steps and almost no guidance. The tool maps a draft/balanced/maximum target to step counts for each stage, suggests a CFG pair, and recommends a resolution that fits your GPU. It also warns you when the full bf16 pipeline is tight on smaller cards.
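As a rough illustration of that mapping, here is a minimal Python sketch. The step counts, VRAM thresholds, and every name in it (`CascadePreset`, `PRESETS`, `recommend`) are hypothetical placeholders, not the tool's actual values. The latent-size arithmetic assumes a 1024 px side maps to a 24-unit latent side, which matches the roughly 42x figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CascadePreset:
    stage_c_steps: int
    stage_b_steps: int
    stage_c_cfg: float
    stage_b_cfg: float


# Illustrative values only (assumption): Stage B stays at or below ~10 steps
# with no guidance, and Stage C keeps a CFG of about 4.
PRESETS = {
    "draft": CascadePreset(10, 6, 4.0, 0.0),
    "balanced": CascadePreset(20, 10, 4.0, 0.0),
    "maximum": CascadePreset(30, 10, 4.0, 0.0),
}


def stage_c_latent_side(pixels: int) -> int:
    """Ceil of pixels * 24 / 1024, i.e. pixels divided by ~42.67."""
    return (pixels * 24 + 1023) // 1024


def recommend(target: str, vram_gb: float) -> dict:
    """Pick a preset and a square resolution for a quality target and GPU size."""
    preset = PRESETS[target]
    # Hypothetical thresholds; tune them against your own measurements.
    side = 1536 if vram_gb >= 24 else 1024
    latent = stage_c_latent_side(side)
    return {
        "preset": preset,
        "resolution": (side, side),
        "stage_c_latent": (latent, latent),
        "tight_on_vram": vram_gb < 16,
    }


if __name__ == "__main__":
    print(recommend("balanced", vram_gb=12))
```

Keeping the presets in a plain dictionary makes it easy to swap in values you have measured on your own hardware.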

Tips and notes

  • Don’t over-step Stage B. Beyond ~10 steps it adds time without real quality gains; spend your budget on Stage C instead.
  • Keep Stage C guidance low. A CFG of ~4 is the sweet spot; high CFG over-saturates and distorts in the compressed latent.
  • Use bf16. Stable Cascade ships with bfloat16 weights; running in fp32 roughly doubles the memory the weights need, with no quality benefit.
  • Stay near 1024x1024. Square resolutions are safest at that size because the prior was trained around 1024; push to 1536 only with ample VRAM.
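The tips above could be applied with the Hugging Face diffusers pipelines roughly as follows. This is a sketch, not a reference setup: the helper names `build_call_kwargs` and `generate` are hypothetical, the default step counts are assumptions, and loading the decoder in bfloat16 may require a recent PyTorch release. The heavy imports sit inside `generate` so the settings helper can be inspected without a GPU.

```python
def build_call_kwargs(prompt, stage_c_steps=20, stage_b_steps=10, side=1024):
    """Return (prior_kwargs, decoder_kwargs) that follow the tips above."""
    stage_b_steps = min(stage_b_steps, 10)  # don't over-step Stage B
    prior_kwargs = dict(
        prompt=prompt,
        height=side,
        width=side,
        guidance_scale=4.0,  # keep Stage C guidance low
        num_inference_steps=stage_c_steps,
    )
    decoder_kwargs = dict(
        prompt=prompt,
        guidance_scale=0.0,  # Stage B needs almost no guidance
        num_inference_steps=stage_b_steps,
        output_type="pil",
    )
    return prior_kwargs, decoder_kwargs


def generate(prompt):
    """Run Stage C, then Stages B and A, in bf16. Needs a GPU and model downloads."""
    import torch
    from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

    prior = StableCascadePriorPipeline.from_pretrained(
        "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16
    )
    decoder = StableCascadeDecoderPipeline.from_pretrained(
        "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.bfloat16
    )
    # Offloading idle submodules to the CPU helps on smaller cards.
    prior.enable_model_cpu_offload()
    decoder.enable_model_cpu_offload()

    prior_kwargs, decoder_kwargs = build_call_kwargs(prompt)
    prior_out = prior(**prior_kwargs)
    result = decoder(image_embeddings=prior_out.image_embeddings, **decoder_kwargs)
    return result.images[0]
```

Calling `generate("a lighthouse at dusk")` would return a PIL image. `build_call_kwargs` alone shows which settings each stage receives.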