💠 Compositional Learning Journal Club
Join us this week for an in-depth discussion on unlearning in deep generative models. We will explore recent breakthroughs and challenges, focusing on how cutting-edge generative models handle unlearning tasks and where improvements can be made.
✅ This Week's Presentation:
🔹 Title: Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
🔸 Presenter: Aryan Komaei
🌀 Abstract:
Diffusion models can unintentionally generate harmful content when trained on unfiltered data. Previous methods tried to address this by adding loss or regularization terms to minimize changes in the model, but balancing content erasure and model stability remains difficult. This paper proposes a novel approach: identifying and preserving "adversarial concepts" — the concepts most affected by parameter changes — to ensure that content erasure has minimal impact on other elements. Their method outperforms current state-of-the-art techniques in maintaining content quality while removing unwanted information.
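The core idea — find the concepts most affected by an erasure update and protect them — can be illustrated with a toy sketch. Everything below is hypothetical (a linear stand-in for the model, made-up loss); the paper operates on a diffusion U-Net:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a linear "model" W, the concept to erase,
# and candidate concepts the method might choose to preserve.
W = rng.normal(size=(8, 8))
target = rng.normal(size=8)
candidates = rng.normal(size=(5, 8))

def concept_loss(W, c):
    # Proxy for how poorly the model still renders concept c
    return float(np.sum((W @ c - c) ** 2))

# One "erasure" step: gradient ascent on the target concept's loss
grad = 2 * np.outer(W @ target - target, target)
W_erased = W + 0.1 * grad

# The "adversarial" concept is the candidate whose quality degrades most
deltas = [concept_loss(W_erased, c) - concept_loss(W, c) for c in candidates]
adversarial_idx = int(np.argmax(deltas))
```

Preserving that worst-hit candidate during erasure is what keeps the rest of the model intact.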
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 4:45 - 5:45 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
Forwarded from Deep RL (Sp25)
🚀 Join Luis Serrano’s Talk at Sharif University of Technology
🎙 Title: The Role of Reinforcement Learning in Training and Fine-Tuning Large Language Models
👨🏫 Speaker: Luis Serrano (Founder and CEO of Serrano.Academy)
📅 Date: Thursday (June 5, 2025)
🕗 Time: 5:30 PM Iran Time
💡 Sign Up Here: https://forms.gle/Pswm8oLMBfGyN16E8
@DeepRLCourse
Forwarded from Deep RL (Sp25)
🚀 Join Marlos C. Machado’s Talk at Sharif University of Technology
🎙 Title: Representation-Driven Option Discovery in RL
👨🏫 Speaker: Marlos C. Machado (Assistant Professor at the University of Alberta, Alberta Machine Intelligence Institute Fellow, Canada CIFAR AI Chair through Amii, and a principal investigator in the Reinforcement Learning and Artificial Intelligence Group)
📅 Date: Friday (June 6, 2025)
🕗 Time: 6:30 PM Iran Time
💡 Sign Up Here: https://forms.gle/11E9YdyMYWEPw4hLA
@DeepRLCourse
Forwarded from Deep RL (Sp25)
🚀 Join Christopher Amato’s Talk at Sharif University of Technology
🎙 Title: A Short Introduction to Cooperative Multi-Agent Reinforcement Learning
👨🏫 Speaker: Christopher Amato (Associate Professor in the Khoury College of Computer Sciences at Northeastern University)
📅 Date: Friday (June 6, 2025)
🕗 Time: 4:30 PM Iran Time
💡 Sign Up Here: https://forms.gle/69VpNRoULDrjTPKz7
@DeepRLCourse
Forwarded from Sharif AI Society :: SAIC
Even if you didn't make it to the final stage of the hackathon, you can still receive financial support!
Forwarded from Mehrab Moradzadeh
Sharif AI Society :: SAIC
Hi everyone,
We held the hackathon and awarded our own 155-million prize.
Mobarakeh Steel and the other sponsors, however, have said that their support of up to 2 billion is not tied to our hackathon rankings; they will judge submissions themselves. The product doesn't have to be an LLM, either: anything related to AI will be considered.
In short, you can fill out this form even if you didn't take part in the hackathon.
Here is also a link to some real problems from companies; if you solve one, they will buy the solution or invest in it:
https://drive.google.com/drive/folders/1-C_jYAaLn4Ij4ZpeqIeROTRT1DE9qhuZ?usp=sharing
💠 Compositional Learning Journal Club
Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.
🌟 This Week's Presentation:
📌 Title:
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
🎙️ Presenter: Amir Kasaei
🧠 Abstract:
This work presents an in-depth analysis of the causal structure in the text encoder of text-to-image (T2I) diffusion models, highlighting its role in introducing information bias and loss. While prior research has mainly addressed these issues during the denoising stage, this study focuses on the underexplored contribution of text embeddings—particularly in multi-object generation scenarios. The authors investigate how text embeddings influence the final image output and why models often favor the first-mentioned object, leading to imbalanced representations. To mitigate this, they propose a training-free text embedding balance optimization method that improves information balance in Stable Diffusion by 125.42%. Additionally, a new automatic evaluation metric is introduced, offering a more accurate assessment of information loss with an 81% concordance rate with human evaluations. This metric better captures object presence and accuracy compared to existing measures like CLIP-based text-image similarity scores.
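The notion of an "information balance" score can be illustrated with a toy ratio of per-object presence scores. This is purely an illustration of the concept, not the paper's actual metric:

```python
def balance(score_a, score_b):
    # 1.0 when both objects are equally present in the generated image;
    # approaches 0 as one object dominates the other
    return min(score_a, score_b) / max(score_a, score_b)

# First-mentioned object detected strongly, second weakly -> low balance
skewed = balance(0.9, 0.3)
even = balance(0.5, 0.5)
```

The embedding-optimization step in the paper can then be read as maximizing such a balance without retraining the model.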
📄 Paper:
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
We have a few open RA positions on Generalization in Reinforcement Learning; the assistants will work directly with me during this summer. This topic deals with the generalization of an RL agent beyond its training environments (e.g., see this paper and also this one as two instances). I am looking for highly motivated researchers, including B.Sc. (or above) students and alumni, who meet these requirements:
1. Strong theoretical background in Probability and Statistics.
2. Proficiency in Deep Learning and Reinforcement Learning (must have taken/audited these two courses).
3. At least 3 months of prior research experience.
4. Being self-reliant, self-motivated, and a quick learner.
5. On-site presence in the lab, reporting directly to me on a weekly basis in the lab meeting.
If you are eligible, please fill out this form by Wednesday, June 11th, 2025, 9:00 AM Tehran time: Application Form.
💠 Compositional Learning Journal Club
Join us this week for an in-depth discussion on unlearning in deep generative models. We will explore recent breakthroughs and challenges, focusing on how cutting-edge generative models handle unlearning tasks and where improvements can be made.
✅ This Week's Presentation:
🔹 Title: The Illusion of Unlearning: The Unstable Nature of Machine Unlearning in Text-to-Image Diffusion Models
🔸 Presenter: Aryan Komaei
🌀 Abstract:
This paper tackles a critical issue in text-to-image diffusion models like Stable Diffusion, DALL·E, and Midjourney. These models are trained on massive datasets, often containing private or copyrighted content, which raises serious legal and ethical concerns. To address this, machine unlearning methods have emerged, aiming to remove specific information from the models. However, this paper reveals a major flaw: these unlearned concepts can come back when the model is fine-tuned. The authors introduce a new framework to analyze and evaluate the stability of current unlearning techniques and offer insights into why they often fail, paving the way for more robust future methods.
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
🩻 Medical Imaging Journal Club
Join us this week as we explore cutting-edge advances in anomaly detection with diffusion models. We'll dive into a notable paper that expands the role of generative models in identifying and localizing anomalies in complex visual data. We'll also discuss their potential in the field of medical imaging.
✅ This Week's Presentation:
🔹 Title: Anomaly Detection with Conditioned Denoising Diffusion Models (DDAD)
🔸 Presenter: Mobina Poulaei
🌀 Abstract:
This paper introduces a novel framework that integrates diffusion models into the anomaly detection pipeline. By uniquely conditioning the reconstruction process, the model aims to produce cleaner, anomaly-free outputs that can be systematically compared against the inputs to reveal deviations. The work also explores strategies for enhancing feature-level comparisons to improve anomaly localization.
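The comparison step can be sketched in a few lines, assuming the input, its conditioned reconstruction, and feature maps are already given as arrays. In DDAD the features come from a pretrained extractor; here they are placeholders:

```python
import numpy as np

def anomaly_map(x, x_hat, feats, feats_hat, w=0.5):
    # Pixel-level distance between input and conditioned reconstruction
    pixel = np.abs(x - x_hat).mean(axis=-1)
    # Feature-level distance, for better localization of subtle anomalies
    feat = np.abs(feats - feats_hat).mean(axis=-1)
    # Normalize each map and blend them into one anomaly score per location
    return w * pixel / (pixel.max() + 1e-8) + (1 - w) * feat / (feat.max() + 1e-8)

rng = np.random.default_rng(0)
x, x_hat = rng.uniform(size=(16, 16, 3)), rng.uniform(size=(16, 16, 3))
f, f_hat = rng.uniform(size=(16, 16, 8)), rng.uniform(size=(16, 16, 8))
amap = anomaly_map(x, x_hat, f, f_hat)
```

High values in the blended map mark regions where the reconstruction deviates from the input, i.e. candidate anomalies.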
Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at
vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
🩻 Medical Imaging Journal Club
Join us this week as we explore advances in anomaly detection using diffusion models, with a focus on their application to real-world medical imaging data. We’ll examine a novel paper that leverages weakly supervised learning and DDIMs (Denoising Diffusion Implicit Models) for generating detailed and reliable anomaly maps — without requiring pixel-level annotations.
✅ This Week’s Presentation:
🔹 Title: Diffusion Models for Medical Anomaly Detection
🔸 Presenter: Mobina Poulaei
🌀 Abstract:
This paper presents a novel, weakly supervised method for medical anomaly detection based on denoising diffusion implicit models. Unlike conventional GANs or autoencoders, the proposed framework preserves fine image details while performing image-to-image translation from pathological to healthy domains. It utilizes a deterministic noise-encoding scheme along with classifier guidance to reconstruct healthy-looking versions of medical scans. The resulting pixel-wise anomaly maps, derived from comparing original and reconstructed images, demonstrate precise localization of pathological regions.
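The two distinctive ingredients — steering the decoding toward the healthy class with classifier guidance, and a pixel-wise comparison afterwards — can be sketched as follows. Function names are hypothetical, and the real guidance update also folds in noise-schedule coefficients:

```python
import numpy as np

def guided_noise(eps, healthy_grad, scale=3.0):
    # Shift the predicted noise along the classifier gradient of the
    # "healthy" class, nudging the DDIM decoding toward a healthy image
    return eps - scale * healthy_grad

def anomaly_map(original, healthy_recon):
    # Pixel-wise difference between the scan and its healthy reconstruction;
    # large values localize the pathological region
    return np.abs(original - healthy_recon).sum(axis=-1)
```

Because the DDIM encoding is deterministic, healthy regions reconstruct almost exactly, so the difference map stays near zero outside the pathology.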
Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 11:00 AM – 12:00 PM
- 🌐 Location: Online at
vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club
Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.
🌟 This Week's Presentation:
📌 Title:
We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
🎙️ Presenter: Amir Kasaei
🧠 Abstract:
Text-to-image diffusion models, particularly Stable Diffusion, have significantly advanced computer vision by enabling high-quality image synthesis from textual prompts. However, their performance often degrades when handling complex prompts involving multiple attributes or objects. This work investigates the root causes of this limitation, focusing on the role of the CLIP text encoder. The study identifies a phenomenon of attribute bias in the text embedding space and reveals a contextual issue in the handling of padding embeddings, which leads to concept entanglement. To address these challenges, the authors propose Magnet, a novel, training-free method that enhances attribute disentanglement through the use of positive and negative binding vectors, supported by a neighbor-based strategy to improve accuracy. Experimental results demonstrate that Magnet significantly boosts both image synthesis quality and attribute binding precision, with minimal computational cost, and effectively supports the generation of unconventional or abstract visual concepts.
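The binding-vector idea can be sketched as a simple embedding update. The form below is a hypothetical simplification; Magnet estimates the vectors from neighboring concepts rather than taking them as given:

```python
import numpy as np

def bind(obj_emb, pos_vec, neg_vec, alpha=1.0, beta=1.0):
    # Pull the object embedding toward its own attribute (positive binding)
    # and push it away from the competing attribute (negative binding)
    return obj_emb + alpha * pos_vec - beta * neg_vec
```

Being a pure embedding-space edit, this kind of update needs no training and adds almost no inference cost.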
📄 Paper:
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
Call for Research Assistants in Large Language Model Projects
If you are familiar with Large Language Models (LLMs), you are invited to join our research projects as a research assistant. These projects focus on advanced topics in reasoning with large language models and are jointly supervised by Dr. Rohban and Dr. Jafari. The projects are conducted within the RIML and INL laboratories at Sharif University of Technology, and may also be considered as undergraduate thesis projects, if applicable. If you are interested, please complete the following form:
Registration Form
📢 Research Assistant Positions Available
The Robust and Interpretable Machine Learning (RIML) Lab and the Trustworthy and Secure Artificial Intelligence Lab (TSAIL) at the Computer Engineering Department of Sharif University of Technology are seeking highly motivated and talented research assistants to join our team. This collaborative project is jointly supervised by Dr. Rohban and Dr. Sadeghzadeh.
🔍 Position Overview
We are working on cutting-edge research in the field of generative models, with a focus on robustness, interpretability, and trustworthiness. As a research assistant, you will contribute to impactful projects at the intersection of theory and real-world applications.
🧠 Required Qualifications
- Solid background in machine learning, artificial intelligence, and generative models
- Hands-on experience with generative models and their practical applications
- Proficiency in Python and frameworks such as PyTorch
- Strong communication skills and the ability to work well in a collaborative research environment
📝 How to Apply
If you are interested in joining our team, please complete the application form and upload your CV using the following link:
👉 Application Form
📚 Suggested Background Reading
To better understand the context of our research, we recommend reviewing the following papers:
1. http://arxiv.org/abs/2410.15618
2. http://arxiv.org/abs/2305.10120
We look forward to your application!
💠 Compositional Learning Journal Club
Join us this week for an in-depth discussion on unlearning in deep generative models. We will explore recent breakthroughs and challenges, focusing on how cutting-edge generative models handle unlearning tasks and where improvements can be made.
✅ This Week's Presentation:
🔹 Title: Categorical Reparameterization with Gumbel-Softmax
🔸 Presenter: Aryan Komaei
🌀 Abstract:
This paper addresses the challenge of using categorical variables in stochastic neural networks, which traditionally struggle with backpropagation due to non-differentiable sampling. The authors propose the Gumbel-Softmax distribution as a solution — a differentiable approximation of categorical variables that allows for efficient gradient-based optimization. The key benefit is that it can be smoothly annealed to behave like a true categorical distribution. The method outperforms previous gradient estimators in tasks like structured prediction and generative modeling, and also enables significant speedups in semi-supervised classification.
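The trick itself is compact. A minimal NumPy sketch of drawing one Gumbel-Softmax sample from unnormalized logits:

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise via inverse transform sampling
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    # Temperature-scaled softmax: as tau -> 0 the sample approaches one-hot,
    # recovering a true categorical draw; larger tau keeps gradients smooth
    y = (logits + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()

sample = gumbel_softmax(np.array([1.0, 2.0, 0.5]), tau=0.5,
                        rng=np.random.default_rng(0))
```

Annealing `tau` during training is what lets the relaxation smoothly approach the categorical distribution it approximates.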
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 11:00 AM - 12:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
🔐 ML Security Journal Club
✅ This Week's Presentation:
🔹 Title: Jailbreaking Text-to-image Generative Models
🔸 Presenter: Aryan Komaei
🌀 Abstract:
This paper introduces SneakyPrompt, an automated attack framework designed to bypass safety filters in text-to-image generative models like Stable Diffusion and DALL·E 2. These models are often equipped with safety filters to prevent the generation of harmful or NSFW (Not-Safe-for-Work) images. SneakyPrompt exploits these systems by using reinforcement learning to perturb blocked prompts in a way that circumvents the filters.
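A toy version of the search loop conveys the attack's shape. The filter and scoring functions below are hypothetical stand-ins, and SneakyPrompt replaces the random substitution with a learned RL policy to make the search sample-efficient:

```python
import numpy as np

rng = np.random.default_rng(0)
BLOCKED = 0  # hypothetical token id rejected by the safety filter

def safety_filter(tokens):
    # Stand-in for the real filter: reject prompts with the blocked token
    return BLOCKED in tokens

def semantic_score(tokens, target):
    # Stand-in similarity: fraction of target tokens still present
    return len(set(tokens) & set(target)) / len(target)

target = [BLOCKED, 5, 9]
prompt = list(target)
vocab = list(range(1, 20))  # candidate replacements, excluding the blocked id

# Substitute blocked tokens until the filter passes, trying to keep
# the semantics of the original (blocked) prompt as intact as possible
while safety_filter(prompt):
    prompt[prompt.index(BLOCKED)] = int(rng.choice(vocab))

score = semantic_score(prompt, target)
```

The attack succeeds when the perturbed prompt passes the filter while still scoring high enough to make the model generate the blocked content.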
📄 Paper: SneakyPrompt: Jailbreaking Text-to-image Generative Models
Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club
Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.
🌟 This Week's Presentation:
📌 Title:
Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
🎙️ Presenter: Mobina Poulaei
🧠 Abstract:
This work tackles a key challenge in diffusion models: the misalignment between generated images and their text prompts. While Direct Preference Optimization (DPO) has been used to improve alignment, it struggles with visual inconsistency between training samples. To address this, the authors propose D-Fusion, a method that creates visually consistent, DPO-trainable image pairs using mask-guided self-attention fusion. D-Fusion also preserves denoising trajectories necessary for optimization. Experiments show that it effectively improves prompt-image alignment across multiple reinforcement learning settings.
📄 Paper:
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
Session Details:
- 📅 Date: Tuesday, August 5
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
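The core fusion step can be illustrated with a toy sketch: blend a visually consistent reference feature map into the target using a spatial mask, keeping the target's content outside the mask. This is a simplified stand-in for D-Fusion's mask-guided self-attention fusion, not the paper's implementation; the function name and shapes are assumptions for illustration.

```python
import numpy as np

def mask_guided_fusion(ref_feat: np.ndarray, tgt_feat: np.ndarray,
                       mask: np.ndarray) -> np.ndarray:
    """Blend reference and target feature maps with a spatial mask:
    inside the mask, keep the reference content; outside, keep the target.
    A toy stand-in for attention-level fusion in a diffusion model."""
    mask = mask[..., None]  # broadcast the (H, W) mask over channels
    return mask * ref_feat + (1.0 - mask) * tgt_feat

# 4x4 feature maps with 2 channels; the mask selects the top-left quadrant
ref = np.ones((4, 4, 2))
tgt = np.zeros((4, 4, 2))
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
fused = mask_guided_fusion(ref, tgt, mask)
```

In the paper, the same idea operates on self-attention features during denoising so that the fused pair remains on a valid denoising trajectory, which is what makes the result usable for DPO training.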
🔐 ML Security Journal Club
✅ This Week's Presentation:
🔹 Title: GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization
🔸 Presenter: Arian Komaei
🌀 Abstract:
This paper introduces GhostPrompt, an automated jailbreak framework targeting text-to-image (T2I) generation models to bypass integrated safety filters for not-safe-for-work (NSFW) content. Unlike previous token-level perturbation methods, GhostPrompt leverages large language models (LLMs) with multimodal feedback for semantic-level adversarial prompt generation. It combines Dynamic Optimization, an iterative feedback-driven process for generating aligned adversarial prompts, with Adaptive Safety Indicator Injection, which strategically embeds benign visual cues to evade image-level detection. The framework achieves a 99% bypass rate against ShieldLM-7B (up from 12.5% with SneakyPrompt), improves CLIP scores, reduces processing time, and generalizes to unseen models, including GPT-4.1 and DALL·E 3. The work reveals critical vulnerabilities in current multimodal safety systems and calls for further AI safety research under controlled-access protocols.
📄 Paper: GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization
Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 6:30 - 7:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
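The Dynamic Optimization loop can be sketched as an iterative rewrite-and-score cycle. Everything here is a mock: `mock_llm_rewrite` stands in for the LLM rewriter, `mock_filter` for the text safety filter, and `mock_alignment` for a CLIP-like semantic score; none of these names come from the paper.

```python
def mock_filter(prompt: str) -> bool:
    """Stand-in text safety filter: blocks prompts containing a flagged word."""
    return "forbidden" not in prompt

def mock_alignment(prompt: str, intent: str) -> float:
    """Toy CLIP-like score: fraction of intent words preserved in the prompt."""
    intent_words = set(intent.split())
    return len(intent_words & set(prompt.split())) / len(intent_words)

def mock_llm_rewrite(prompt: str, feedback: str) -> str:
    """Stand-in for the LLM rewriter: obfuscate the flagged word.
    `feedback` would carry the filter's response in the real loop."""
    return prompt.replace("forbidden", "f0rbidden")

def dynamic_optimization(intent: str, max_iters: int = 5):
    """Iterate: check the filter, feed the result back to the rewriter,
    stop once the prompt passes while staying aligned with the intent."""
    prompt = intent
    for _ in range(max_iters):
        if mock_filter(prompt):
            break
        prompt = mock_llm_rewrite(prompt, feedback="blocked by text filter")
    return prompt, mock_alignment(prompt, intent)

adv, score = dynamic_optimization("a forbidden castle at night")
```

The real framework closes the loop with multimodal feedback (image-level detection as well as text filtering), which is where the Adaptive Safety Indicator Injection step comes in.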
🪢 Compositional Learning Journal Club
Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.
🌟 This Week's Presentation:
📌 Title:
Fast Noise Initialization for Temporally Consistent Video Generation
🎙️ Presenter: Ali Aghayari
🧠 Abstract:
Video generation has advanced rapidly with diffusion models, but ensuring temporal consistency remains challenging. Existing methods like FreeInit address this by iteratively refining noise during inference, though at a significant computational cost. To overcome this, the authors introduce FastInit, a fast noise initialization method powered by a Video Noise Prediction Network (VNPNet). Given random noise and a text prompt, VNPNet produces refined noise in a single forward pass, eliminating the need for iteration. This approach greatly improves efficiency while maintaining high temporal consistency across frames. Trained on a large-scale dataset of text prompts and noise pairs, FastInit consistently enhances video quality in experiments with various text-to-video models. By offering both speed and stability, FastInit provides a practical solution for real-world video generation. The code and dataset will be released publicly.
📄 Paper:
FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
Session Details:
- 📅 Date: Tuesday, August 19
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban
We look forward to your participation! ✌️
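The contrast between iterative refinement and FastInit's single forward pass can be sketched with toy numpy functions: one smooths initial noise across frames repeatedly (FreeInit-style), the other produces the refined noise in one step, standing in for the VNPNet. Both functions and the consistency metric are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def iterative_refine(noise: np.ndarray, steps: int = 5) -> np.ndarray:
    """FreeInit-style stand-in: repeatedly smooth noise along the frame
    axis to improve temporal consistency, paying an iterative cost."""
    for _ in range(steps):
        noise = 0.5 * noise + 0.5 * np.roll(noise, 1, axis=0)
    return noise

def single_pass_refine(noise: np.ndarray) -> np.ndarray:
    """FastInit-style stand-in (toy 'VNPNet'): refine the noise in a
    single forward pass, here by mixing in the temporal mean."""
    return 0.5 * (noise + noise.mean(axis=0, keepdims=True))

def temporal_var(x: np.ndarray) -> float:
    """Variance of frame-to-frame differences: lower means more
    temporally consistent initialization."""
    return float(np.var(np.diff(x, axis=0)))

rng = np.random.default_rng(0)
noise = rng.standard_normal((8, 16))  # (frames, flattened latent dims)
refined = single_pass_refine(noise)
```

The point of the sketch is the cost model: `iterative_refine` must loop (in the real setting, through full denoising passes), while `single_pass_refine` amortizes that work into one learned mapping trained on prompt-noise pairs.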