We propose to exploit the observed sensitivity difference to improve membership inference in the caption-free setting. Specifically, as shown in Fig. 1(c), MoFit generates conditioning embeddings that amplify $\mathcal{L}_{\text{cond}} - \mathcal{L}_{\text{uncond}}$ for members while suppressing it for hold-outs, thereby restoring a reliable separability signal in the caption-free setting.
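To make the separability signal concrete, the following is a minimal sketch of how the gap $\mathcal{L}_{\text{cond}} - \mathcal{L}_{\text{uncond}}$ could be computed for a single sample and turned into a membership decision. It is not the MoFit implementation. It assumes that each loss is the mean squared error between predicted and true noise, and the function names (`denoising_loss`, `loss_gap`, `predict_member`), the use of the gap's magnitude, and the threshold are all hypothetical choices made for illustration. Synthetic arrays stand in for the model's noise predictions.

```python
import numpy as np


def denoising_loss(eps_pred, eps_true):
    """Mean squared error between predicted and true noise (assumed loss form)."""
    return float(np.mean((eps_pred - eps_true) ** 2))


def loss_gap(eps_true, eps_cond, eps_uncond):
    """Return L_cond - L_uncond for one sample."""
    l_cond = denoising_loss(eps_cond, eps_true)
    l_uncond = denoising_loss(eps_uncond, eps_true)
    return l_cond - l_uncond


def predict_member(gap, threshold):
    """Hypothetical decision rule: flag a sample as a member when the
    magnitude of the gap exceeds the threshold."""
    return abs(gap) > threshold


# Toy usage: synthetic arrays stand in for real model outputs.
rng = np.random.default_rng(0)
eps_true = rng.standard_normal((4, 8, 8))
# Assumed scenario: the conditional prediction is closer to the true noise.
eps_cond = eps_true + 0.1 * rng.standard_normal(eps_true.shape)
eps_uncond = eps_true + 0.5 * rng.standard_normal(eps_true.shape)

gap = loss_gap(eps_true, eps_cond, eps_uncond)
print(f"gap = {gap:.4f}, member = {predict_member(gap, threshold=0.1)}")
```

In practice the threshold would have to be calibrated on data with known membership, and the sign convention of the gap should follow whichever definition the attack actually uses.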
We evaluate on LDMs fine-tuned with Pokemon, MS-COCO, and Flickr datasets under a caption-free threat model.
Key results from Tab. 1: