NeurIPS 2025 Workshop on
Socially Responsible and Trustworthy Foundation Models (ResponsibleFM)

Hilton Mexico City Reforma · Mexico City · NeurIPS 2025

About ResponsibleFM

The ResponsibleFM Workshop is an interdisciplinary forum focused on advancing ethical, inclusive, and socially responsible research in foundation models (language and multimodal). As these models' societal impact grows, we address fairness, accountability, transparency, and safety throughout model development and deployment, proactively tackling ethical and social risks.

We bring together researchers, practitioners, ethicists, policy-makers, and affected communities to catalyze methods and best practices that ensure foundation model research serves the common good.

Where

Hilton Mexico City Reforma, Mexico City (Room: Don Alberto 1)

When

NeurIPS 2025 (Sun, 30 Nov, 1:00–8:00 p.m. CST)

Overview

Key themes and questions we will explore at ResponsibleFM.

Topics

  • Defining & Measuring Trustworthiness
    • Rigorous definitions across fairness, safety, truthfulness, privacy, explainability, robustness, and cultural awareness.
    • Standardized, reproducible evaluation protocols and best practices.
  • Techniques to Enhance Trustworthiness
    • Bias mitigation and fairness methods (pre-training & fine-tuning).
    • Knowledge editing, continual learning, and machine unlearning.
    • Watermarking and provenance tracking for accountability.
    • Defenses against adversarial attacks and jailbreaking; safety layers and red teaming.
  • Deployment & Social Good
    • Case studies in healthcare, education, public policy, social welfare, environment.
    • Managing risks in high-stakes applications and maximizing positive impact.
  • Datasets & Benchmarks
    • Diverse, inclusive, and ethically curated datasets; consent and representation.
    • Comprehensive benchmarks for fairness, robustness, privacy, and more.
    • Transparent documentation (data/model cards, datasheets).
  • Interdisciplinary Perspectives & Governance
    • Insights from social sciences, philosophy, law, and public policy.
    • Legal/ethical frameworks: compliance, auditability, regulation.
    • Participatory, community-engaged risk assessment and governance.

Call for Papers

Submission Site

OpenReview

Format & Policy

  • Format: Single PDF; up to 9 pages of main text (references and appendix excluded). The main text must be self-contained.
  • Style: Use the NeurIPS 2025 LaTeX style file. Include references and any supplementary material in the same PDF.
  • Interdisciplinary: Cross-disciplinary submissions (non-CS) are welcome if related to foundation models.
  • Dual-submission / Non-archival: Ongoing or unpublished work and manuscripts under review elsewhere are welcome, subject to the other venues' policies. The workshop is non-archival.
  • Visibility: Submissions and reviews are not public. Only accepted papers will be made public.
  • Double-blind: Anonymize all materials (including linked code/data). No acknowledgements at submission time.

Awards: We will select one Best Paper and one Outstanding Paper.

Template: Download NeurIPS 2025 Styles
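For reference, a minimal submission skeleton using the style bundle linked above might look like the sketch below (the package name neurips_2025 and its options follow the standard NeurIPS style files; check the downloaded bundle for the exact usage notes):

\documentclass{article}

% Omit options for the anonymized, double-blind submission version;
% [final] is typically reserved for the camera-ready copy.
\usepackage{neurips_2025}

\usepackage[utf8]{inputenc} % input encoding
\usepackage{hyperref}       % hyperlinks for references

\title{Your Paper Title}
% The author block is suppressed under double-blind review.

\begin{document}
\maketitle

\begin{abstract}
  Abstract text here.
\end{abstract}

\section{Introduction}
Main text (up to 9 pages), followed by references and any appendix
in the same PDF.

\end{document}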

Important Dates (AoE)

  • Submission: Nov 3
  • Notification: Nov 7
  • Camera-Ready: Nov 23
  • Workshop Day: Sun, 30 Nov, 1:00–8:00 p.m. CST

Accepted Papers

We are pleased to announce the accepted papers for the ResponsibleFM workshop. The full list is available on OpenReview, and the posters can be viewed here.

Invited Keynote Speakers

Yoshua Bengio

Université de Montréal, Mila
Safety and Social Impact of Frontier Foundation Models
Kush R. Varshney

IBM Research
Responsible and Trustworthy Foundation Models in Industry
Diyi Yang

Stanford University
Social-aware Foundation Models
Sanmi Koyejo

Stanford University
Principled Understanding of Trustworthy Foundation Models
Rada Mihalcea

University of Michigan
Joan Nwatu

University of Michigan (joint keynote with Rada Mihalcea)
Aylin Caliskan

University of Washington
Fairness of Foundation Models
Denghui Zhang

UIUC & Stevens Institute of Technology
Copyright Under Fire: The Ethical and Legal Perils of LLM Memorization

Event Schedule (CST)

Keynote — Diyi Yang

Keynote — Yoshua Bengio

Keynote — Sanmi Koyejo

Keynote — Rada Mihalcea & Joan Nwatu

Keynote — Denghui Zhang

Oral Presentations

  • SVIP: Towards Verifiable Inference of Open-source Large Language Models
    Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, Huan Zhang
  • MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering
    Yuexing Hao, Kumail Alhamoud, Haoran Zhang, Hyewon Jeong, Isha Puri, Grace Yan, Philip Torr, Mike Schaekermann, Ariel Dora Stern, Marzyeh Ghassemi
  • Benchmarking Large Language Models on Safety Risks in Scientific Labs
    Yujun Zhou, Jingdong Yang, Yue Huang, Kehan Guo, Zoe Emory, Bikram Ghosh, Amita Bedar, Sujay Shekar, Zhenwen Liang, Pin-Yu Chen, Tian Gao, Werner Geyer, Nitesh V Chawla, Xiangliang Zhang
  • Completion ≠ Collaboration: Scaling Collaborative Effort with Agents
    Shannon Zejiang Shen, Valerie Chen, Ken Gu, Alexis Ross, Zixian Ma, Jillian Ross, Alex Gu, Chenglei Si, Wayne Chi, Andi Peng, Jocelyn J Shen, Ameet Talwalkar, Tongshuang Wu, David Sontag
  • ARMs: Adaptive Red-Teaming Agent against Multimodal Models with Plug-and-Play Attacks
    Zhaorun Chen, Xun Liu, Mintong Kang, Jiawei Zhang, Minzhou Pan, Shuang Yang, Bo Li
  • Invisible Tokens, Visible Bills: The Urgent Need to Audit Hidden Operations in Opaque LLM Services
    Guoheng Sun, Ziyao Wang, Xuandong Zhao, Bowei Tian, Zheyu Shen, Yexiao He, Jinming Xing, Ang Li
  • LUMINA: Detecting Hallucinations in RAG System with Context–Knowledge Signals
    Samuel Yeh, Sharon Li, Tanwi Mallick
  • General Exploratory Bonus for Optimistic Exploration in RLHF
    Wendi Li, Changdae Oh, Sharon Li
  • Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
    Xu Wang, Yan Hu, Benyou Wang, Difan Zou

Break & Poster Session

Keynote — Aylin Caliskan

Keynote — Kush Varshney

Organizing Committee

Canyu Chen

Northwestern University
Yue Huang

University of Notre Dame
Zheyuan Liu

University of Notre Dame

Yilun Zhao

Yale University
Zhaorun Chen

University of Chicago
Haoyue Bai

University of Wisconsin–Madison
Xuandong Zhao

University of California, Berkeley
Yiyou Sun

University of California, Berkeley
Junyuan Hong

MGH · Harvard Medical School
Xuefeng (Sean) Du

Nanyang Technological University
Jindong Gu

Google DeepMind & Oxford
Arman Cohan

Yale University
Xiangliang Zhang

University of Notre Dame
Manling Li

Northwestern University

Advising Committee

Dawn Song

University of California, Berkeley
Yejin Choi

Stanford University & Nvidia
Mohit Bansal

UNC Chapel Hill
Robert Nowak

University of Wisconsin–Madison

Workshop Venue

Hilton Mexico City Reforma

Room: Don Alberto 1

Av. Juárez 70, Colonia Centro, 06010 Ciudad de México, CDMX, Mexico

Frequently Asked Questions

Are workshop papers archival?

No. ResponsibleFM is a non-archival venue; submissions may be concurrently or subsequently sent to other venues (subject to their policies).

Can I submit work under review elsewhere?

Yes, as long as you follow the other venue’s dual-submission and anonymity rules. See the CFP for details.

Where do I submit?

Submit via OpenReview.

Contact

Address

Hilton Mexico City Reforma

Join Slack

Invitation Link

Email Us

responsiblefm@googlegroups.com