How the distributions are compared
An exact within-batch repulsion, paired with a Nyström attraction to a reference frozen once over the full data.
- MMDEstimated right. The classical MMD, once dismissed as too weak, becomes a strong objective with an exact within-batch repulsion and a Nyström attraction toward a frozen full-data reference.
- BATCHLarge and fresh. The generated batch is the operative variable. Quality climbs to an optimum above 2048, an order past common practice, with gradient caching absorbing the memory.
- JOINTMatch the joint, not the marginal. On conditional tasks we match the joint image-text law, so prompt fidelity becomes part of the objective.











