Overview
This research develops techniques to remove the ability to reproduce specific identities from latent diffusion models while preserving their general image generation capabilities. The approach targets Arc2Face [1], a Stable Diffusion-based model that specializes in generating faces from state-of-the-art ArcFace [2] identity embeddings.
Methodology
Our approach focuses on the cross-attention layers of the diffusion model, specifically targeting the key–value representations that encode identity information. By selectively modifying these representations, we aim to prevent the model from reproducing specific identities.
Full details of the methodology will be disclosed after publication.
Figure 1: Cross-attention mechanism in diffusion models
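As general background on the mechanism named above, and not a description of the undisclosed method, the following is a minimal sketch of single-head cross-attention in plain Python. Queries come from the image latents, while keys and values are projected from the conditioning tokens, which is why the key and value projections are a natural place for identity information to enter. All names (`cross_attention`, `w_q`, `w_k`, `w_v`) and the toy values are hypothetical; they are not taken from Arc2Face or from any library.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(latents, cond, w_q, w_k, w_v):
    """Single-head cross-attention: latents attend to conditioning tokens."""
    q = matmul(latents, w_q)          # queries from image latents
    k = matmul(cond, w_k)             # keys from conditioning tokens
    v = matmul(cond, w_v)             # values from conditioning tokens
    scale = math.sqrt(len(q[0]))      # sqrt of the head dimension
    scores = matmul(q, transpose(k))  # one score per (latent, cond token) pair
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

# Toy, hypothetical inputs: 2 latent tokens, 3 conditioning tokens, dimension 2.
latents = [[1.0, 0.0], [0.0, 1.0]]
cond = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_v = [[0.0, 0.0], [0.0, 0.0]]

out, weights = cross_attention(latents, cond, identity, identity, identity)
# Changing only the value projection leaves the attention weights untouched
# but changes what the conditioning tokens contribute to the output.
out_edited, weights_edited = cross_attention(latents, cond, identity, identity, zero_v)
```

In this toy setup, zeroing the value projection removes the conditioning signal from the output without touching the queries or the latents. That is only an extreme illustration of how edits confined to the key and value paths can change what conditioning contributes.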
Experimental Results
The visualizations below show how unlearning progresses in each experiment. In each one, the model progressively "forgets" the target identity, its ability to generate other faces is maintained through an anchor pull, and the retain identities stay consistent.
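To make the three roles in these experiments (forget, anchor, retain) concrete, here is a hedged sketch of one way such an objective could be combined. It is an assumption for illustration only, since the actual loss has not been disclosed. The sketch assumes the forget-identity prediction is pulled toward a frozen model's anchor prediction, and that the retain-identity prediction is held close to the frozen model's own output, scaled by a weight in the role of `preservation_weight`. The function names and the toy vectors are hypothetical, and `neg_guidance` is not modeled here.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def unlearning_loss(pred_forget, pred_anchor, pred_retain, ref_retain,
                    preservation_weight):
    """Hypothetical objective: anchor pull plus weighted preservation term."""
    # Assumed anchor pull: move the forget-identity prediction toward the anchor.
    forget_term = mse(pred_forget, pred_anchor)
    # Assumed preservation: keep the retain-identity prediction near the frozen reference.
    retain_term = mse(pred_retain, ref_retain)
    return forget_term + preservation_weight * retain_term

# Toy, hypothetical predictions standing in for model outputs.
pred_forget = [1.0, 0.0]
pred_anchor = [0.0, 0.0]
pred_retain = [0.5, 0.5]
ref_retain = [0.5, 0.3]

loss = unlearning_loss(pred_forget, pred_anchor, pred_retain, ref_retain,
                       preservation_weight=15.0)
print(round(loss, 6))  # approximately 0.8 (0.5 forget term + 15 * 0.02 retain term)
```

Under this assumed form, a larger preservation weight penalizes drift on the retain identity more heavily relative to the anchor pull, which is one plausible reading of why the weight varies across the experiments below.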
Experiment: Identity 65
Parameters: lr=2e-05, preservation_weight=15.0, neg_guidance=1.0
Forget Progression
[Image panels: Baseline, Step 60, Step 180, Step 240, Step 300, Anchor, Anchor]
Retain Progression (Identity 14692)
[Image panels: Step 60, Step 180, Step 300]
Experiment: Identity 378
Parameters: lr=5e-05, preservation_weight=35.0, neg_guidance=1.0
Forget Progression
[Image panels: Baseline, Step 60, Step 180, Step 240, Step 300, Anchor, Anchor]
Retain Progression (Identity 14692)
[Image panels: Step 60, Step 180, Step 300]
Experiment: Identity 698
Parameters: lr=5e-05, preservation_weight=20.0, neg_guidance=1.0
Forget Progression
[Image panels: Baseline, Step 60, Step 180, Step 240, Step 300, Anchor, Anchor]
Retain Progression (Identity 11102)
[Image panels: Step 60, Step 180, Step 300]
References
[1] Papantoniou, F. P., Lattas, A., Moschoglou, S., Deng, J., Kainz, B., & Zafeiriou, S. (2024). Arc2Face: A Foundation Model for ID-Consistent Human Faces. European Conference on Computer Vision (ECCV), pp. 241-261. arXiv:2403.11641
[2] Deng, J., Guo, J., Yang, J., Xue, N., Kotsia, I., & Zafeiriou, S. (2022). ArcFace: Additive Angular Margin Loss for Deep Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10), 5962-5979. DOI:10.1109/TPAMI.2021.3087709
[3] Gandikota, R., Materzynska, J., Fiotto-Kaufman, J., & Bau, D. (2023). Erasing Concepts from Diffusion Models. arXiv:2303.07345
[4] Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., & Aberman, K. (2023). DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. arXiv:2208.12242
Ongoing Work
This page will be updated as the research progresses. Check back for new results and findings.