I am currently a Researcher and Ph.D. Co-Supervisor at the Language Technology Group and the Hub of Computing & Data Science, University of Hamburg, working with Prof. Chris Biemann.

🎓 Ph.D. Openings: I am looking for motivated Ph.D. students at the University of Hamburg to work on large language models, vision-language models, and agentic systems. Feel free to reach out with your CV, and we can discuss possible funding options.

I received my Ph.D. summa cum laude from the University of Hamburg in May 2026, advised by Prof. Chris Biemann. My doctoral thesis is “Bridging Vision, Language, and Gaze for Trustworthy Foundation Models”. Previously, I obtained my M.Eng. (2019) and B.Eng. (2016) degrees from the School of Computer Science and Engineering, South China University of Technology.

My research focuses on large language models and agentic systems, with an emphasis on training and alignment, evaluation and interpretability, and multilingual and multimodal learning for real-world applications. I have published 20+ papers in top international AI venues such as ACL, EMNLP, NAACL, COLING, and ECAI.

💌 xintong.wang@uni-hamburg.de / m.e.xintong@gmail.com

🤝 Openings: I am looking for Ph.D. students, Master’s thesis students, and student assistants at the University of Hamburg, and we are also hiring for full-time and internship positions at Alibaba. Self-motivated students with experience in LLMs, LVLMs, and Agents are welcome to reach out. Please feel free to drop me an email. I am always open to collaborations.

🔥 News

2026.05: 🎓 I successfully defended my Ph.D. dissertation at the University of Hamburg (summa cum laude).
2026.03: 🎉 One paper accepted to ACL 2026 (Findings), and two papers accepted to ACL 2026 Workshops — both received awards 🏆!
2025.08: 🎉 One paper accepted to EMNLP 2025 (Main, Top 15%)!
2025.05: 🎉 Three papers accepted to ACL 2025!
2024.12: 😊 Serving as an Area Chair for ACL ARR / NAACL 2025!
2024.11: 🎉 One paper accepted to COLING 2025 (Oral)!
2024.05: 🎉 One paper accepted to ACL 2024 (Findings)!
2024.03: 🎉 One paper accepted to LREC-COLING NeusymBridge 2024!
2023.08: 🛠 Co-organizer of CLEF-2024 SemEval Task: Multilingual Text Detoxification.
2023.07: 🎉 One paper accepted to ECAI 2023!
2023.06: 🎤 Invited talk at the University of Wuppertal.
2022.12: ✈️ Visiting Researcher at the Institute of Psychology, Chinese Academy of Sciences.
2022.05: 🎉 One paper accepted to LREC 2022!
2021.04: 🎓 Started as a Ph.D. Candidate at the University of Hamburg.

📝 Publications

EMNLP 2025

Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites 🤗

Xintong Wang, Yixiao Liu, Jingheng Pan, Liang Ding, Longyue Wang, Chris Biemann

The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025, Main, Top 15%)

We present TOXIREWRITECN, the first Chinese detoxification dataset that explicitly preserves sentiment polarity, with 1,556 carefully annotated triplets covering five real-world scenarios.
Comprehensive evaluation across 17 commercial and open-source LLMs reveals key limitations in sentiment-aware detoxification.

ACL 2025

CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models

Xintong Wang, Jingheng Pan, Liang Ding, Longyue Wang, Longqin Jiang, Xingshan Li, Chris Biemann

Findings of the Association for Computational Linguistics (ACL 2025)

We leverage eye-movement measures to analyze the layer-wise behavior of LLMs and propose a heuristic strategy for selecting the optimal steering layer for semantic intervention.
Our framework requires only 1/N of LLM parameters and achieves gains of +1.85% in toxification and +13.45% in detoxification compared to last-layer intervention.

ACL 2024

Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding

Xintong Wang, Jingheng Pan, Liang Ding, Chris Biemann

Findings of the Association for Computational Linguistics (ACL 2024)

We propose Instruction Contrastive Decoding (ICD), a training-free method that contrasts standard and disturbance-instruction distributions to suppress hallucinations in LVLMs.
ICD significantly mitigates both object-level and attribute-level hallucinations on POPE, MME, and LLaVA-Bench, while improving general perception and recognition.

Selected Publications

(* denotes equal contribution; ^✉️ denotes corresponding author. For the complete list, please see my Google Scholar.)

Preprint 2026 What Matters for Aggressive Decoding-Time KV Eviction? Temporal Memory, Not Better Scoring, Bo Zeng, Yu Zhao, Yefeng Liu, Zhihong Lu, Xuanfan Ni, Xintong Wang^✉️.
Preprint 2026 CulturalMenuBench: Probing the Knowledge-Application Gap in Multimodal Culinary Reasoning, Bo Zeng, Linfeng Gao, Peiqin Lin, Yu Zhao, Mingyan Zeng, Yu Tong, Xintong Wang^✉️, Linlong Xu, Longyue Wang, Weihua Luo, Qinggang Zhang, Jinsong Su.
Preprint 2026 Beyond Safety Experts: Routing-Mediated Multilingual Safety in Mixture-of-Experts Models, Bo Zeng, Xinwei Wu, Heng Liu, Yu Zhao, Hao Wang, Yangyang Liu, Liangying Shao, Xiaohu Zhao, Xintong Wang^✉️, Jifang Wang, Linlong Xu, Longyue Wang, Weihua Luo.
Preprint 2026 When Routing Reveals the Question: Answer Format and Simpson’s Paradox in MoE VLMs, Bo Zeng, Yu Zhao, Heng Liu, Yefeng Liu, Yiyu Wang, Zhihong Lu, Mingyan Zeng, Xintong Wang^✉️, Liang Ding, Chris Biemann.
Preprint 2026 Beyond Semantic Bottlenecks: A Mechanistic Diagnosis of Multilingual Mathematical Reasoning Failure, Bo Zeng, Xintong Wang^✉️, Yu Zhao, Mingyan Zeng, Yefeng Liu, Yu Tong, Yichao Du, Zhihong Lu, Liang Ding, Chris Biemann.
Preprint 2026 Correct Answers Are Not Enough: Measuring Evidence Reliability in Deep Research Agents, Bo Zeng, Xintong Wang^✉️, Yu Zhao, Mingyan Zeng, Yefeng Liu, Yu Tong, Zhihong Lu, Liang Ding, Chris Biemann.
Preprint 2026 Outcome ≠ Faithfulness: Exposing Systematic False Positives in Multi-Hop Knowledge Editing Evaluation, Yuchen Wu, Liang Ding, Xintong Wang, Li Shen, Dacheng Tao.
Preprint 2026 Grounded Scaling: Why Agentic AI Needs Deterministic Environments, Liang Ding, Xintong Wang.
Preprint 2026 IndustryBench-MIPU: Benchmarking Multi-Image Attribute Value Extraction for Industrial Products 🤗, Haonan Qi, Jin Cao, Yongqi Zhang, Xintong Wang^✉️, Weidong Tang, Bin Chen, Chengfu Huo, Haojun Pan, Hengyu You, Jing Li, Yingde Wang, Liang Ding.
Preprint 2026 ARBOR: Online Process Rewards via a Reusable Rubric Buffer for Search Agents, Zheng Liu, Longxiang Zhang, Xintong Wang, Zhiang Xu, Shaoxiong Zhan, Xin Shan, Wen Huang, Tao Dai, Shu-Tao Xia, Chengfu Huo, Liang Ding.
Preprint 2026 IndustryBench: Probing the Industrial Knowledge Boundaries of LLMs 🤗, Songlin Bai, Xintong Wang^✉️, Linlin Yu, Bin Chen, Zhiang Xu, Yuyang Sheng, Changtong Zan, Xiaofeng Zhu, Yizhe Zhang, Jiru Li, Mingze Guo, Ling Zou, Yalong Li, Chengfu Huo, Liang Ding.
Preprint 2026 A Multimodal Dataset for Visually Grounded Ambiguity in Machine Translation 🤗, Jingheng Pan, Xintong Wang^✉️, Longyue Wang, Liang Ding, Weihua Luo, Chris Biemann.
Preprint 2025 Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models, Yunxin Li, Zhenyu Liu, Zitao Li, Xuanyu Zhang, Zhenran Xu, Xinyu Chen, Haoyuan Shi, Shenyuan Jiang, Xintong Wang, Jifang Wang, Shouzheng Huang, Xinping Zhao, Borui Jiang, Lanqing Hong, Longyue Wang, Zhuotao Tian, Baoxing Huai, Wenhan Luo, Zheng Zhang, Baotian Hu, Min Zhang.
Preprint 2025 Rethinking Multilingual Vision-Language Translation: Dataset, Evaluation, and Adaptation, Xintong Wang, Jingheng Pan, Yixiao Liu, Xiaohu Zhao, Chenyang Lyu, Minghao Wu, Chris Biemann, Longyue Wang, Linlong Xu, Weihua Luo, Kaifu Zhang.
Preprint 2025 The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks, Minghao Wu, Weixuan Wang, Sinuo Liu, Huifeng Yin, Xintong Wang, Yu Zhao, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang.
ACL 2026 POLAR: A Benchmark for Multilingual, Multicultural, and Multi-Event Online Polarization, Usman Naseem, Robert Geislinger, Juan Ren, …, Xintong Wang, …, Chris Biemann, Shamsuddeen Hassan Muhammad, Seid Muhie Yimam. Findings.
EMNLP 2025 Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites 🤗, Xintong Wang, Yixiao Liu, Jingheng Pan, Liang Ding, Longyue Wang, Chris Biemann. Main Conference, Top 15%.
ACL 2025 CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models, Xintong Wang, Jingheng Pan, Liang Ding, Longyue Wang, Longqin Jiang, Xingshan Li, Chris Biemann. Findings.
ACL 2025 Metagent-P: A Neuro-Symbolic Planning Agent with Metacognition for Open Worlds, Yanfang Zhou, Yuntao Liu, Xiaodong Li, Yongqiang Zhao, Xintong Wang, Qingyu Wu, Jinlong Tian, Zhenyu Li, Xinhai Xu. Findings.
ACL 2025 M2PA: A Multi-Memory Planning Agent for Open Worlds Inspired by Cognitive Theory, Yanfang Zhou, Xiaodong Li, Yuntao Liu, Yongqiang Zhao, Xintong Wang, Zhenyu Li, Jinlong Tian, Xinhai Xu. Findings.
COLING 2025 Multilingual and Explainable Text Detoxification with Parallel Data, Daryna Dementieva, Daniil Moskovskiy, Nikolai Babakov, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Dmitry Ustalov, Elisei Stakovskii, Alisa Smirnova, Ashraf Elnagar, Animesh Mukherjee, Alexander Panchenko. Oral.
ACL 2024 Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding, Xintong Wang, Jingheng Pan, Liang Ding, Chris Biemann. Findings.
ECAI 2023 Using Self-Supervised Dual Constraint Contrastive Learning for Cross-modal Retrieval, Xintong Wang, Xiaoyu Li, Liang Ding, Sanyuan Zhao, Chris Biemann.
LREC 2022 MOTIF: Contextualized Images for Complex Words to Improve Human Reading 🤗, Xintong Wang*, Florian Schneider*, Özge Alaçam, Prateek Chaudhury, Chris Biemann.
Information Sciences 2020 Plausibility-promoting Generative Adversarial Network for Abstractive Text Summarization with Multi-task Constraint, Min Yang, Xintong Wang, Yao Lu, Jianming Lv, Ying Shen, Chengming Li. JCR-Q1.

🧑‍🏫 Teaching

Winter 2027, Lecturer, Exercises Introduction to Natural Language Processing and Text Mining (Bachelor), University of Hamburg.
Winter 2027, Lecturer, Seminar Recent Advances of Foundation Models (Bachelor), University of Hamburg.
Winter 2025 & 2026, Lecturer, Exercises Natural Language Processing and the Web (Master), University of Hamburg.
Summer 2025 & 2026, Lecturer, Exercises Statistical Methods of Language Technology (Master), University of Hamburg.
Winter 2024, Teaching Assistant, Introduction to Python for Research (Bachelor / Master / PhD), Max Planck Institute.
Winter 2024, Co-Instructor, Deep Learning for Natural Language Processing (Bachelor), University of Hamburg.
Summer 2022 & 2023, Project Mentor, Web Interfaces for Language Processing Systems (Master Project), University of Hamburg.
Winter 2020, Co-Lecturer, Natural Language Processing and the Web (Master), University of Hamburg.

🎓 Supervision

Co-Supervised Doctoral Students

Yixiao Liu (Ph.D. Candidate, Universität Hamburg, 2026 – present) — LLM Post-Training, Interpretability, Cognitive Analysis.
Jingheng Pan (Ph.D. Candidate, Universität Hamburg, 2025 – present) — LVLM Post-Training, Reasoning.

Supervised Master Students

Ayko Schwedler (M.Sc. Student, Universität Hamburg, 2026 – present) — Agentic Memory and Application.
Longqin Jiang (M.Sc. Student, Universität Hamburg, 2025 – present) — In-Image Text Understanding, Vision-Language Models.

Interns

Duo Li (Ph.D. Candidate, Nanyang Technological University, 2026 Summer) — Multimodal Agents.
Qingyu Lu (Ph.D. Candidate, Southeast University, 2026 Summer) — LLM Post-Training, RL.
Yuchen Wu (Ph.D. Candidate, Shanghai Jiao Tong University, 2026 Summer) — Agent Memory.
Haonan Qi (M.Sc. Student, Shanghai Jiao Tong University, 2026 Summer) — Vision-Language Reasoning.
Yongqi Zhang (M.Sc. Student, Southeast University, 2026 Summer) — Multimodal LLMs.
Yan Shi (M.Sc. Student, Harbin Institute of Technology, 2026 Summer) — Multimodal Embeddings.

Previous Students

Jingfan Xin (M.Sc., Universität Hamburg, 2025) — Large Reasoning Models.
Xiaoyu Li (M.Sc., TU Berlin & Beijing Institute of Technology, 2024) — Foundation Models, Cross-Modal Representation Learning.
Fabian Meyer (M.Sc., Universität Hamburg, 2023) — Out-of-Distribution Detection, Robustness.
Anton Orell Wiehe (M.Sc., Universität Hamburg, 2022) — Domain Adaptation, Multi-Modal Foundation Models.
Matthew Ng Cher-Wai (M.Sc., Universität Hamburg, 2022) — Multi-Modal Generation, Transformers.
Ankit Srivastava (M.Sc., Universität Hamburg, 2022) — Lexical Simplification, Educational NLP.
Florian Schneider (M.Sc., Universität Hamburg, 2021) — Self-Supervised Multi-Modal Retrieval. GSCL Best Master’s Thesis Award 2023.
Prateek Chaudhury (B.Sc., IIT Delhi, 2021) — Educational NLP, Reading Comprehension.

🎖 Honors and Awards

2026, Best Paper Award, 9th Workshop on Event Extraction and Understanding: Challenges and Applications.
2026, Runner-Up Award, SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multi-event Online Polarization.
2025, Alibaba Group AliStar Program — Offer declined.
2025, Alibaba International AliStar Program — Offer declined.
2025, Huawei Genius Youth Program — Offer declined.
2025, Ant Group AntStar Plan A Program — Offer declined.
2023, ECAI 2023 Travel Grant Award, 26th European Conference on Artificial Intelligence.

💬 Invited Talks

2025.03, Multimodal Representation Learning: Understanding and Leveraging. Host: Prof. Jianming Lv, South China University of Technology.
2024.08, Towards Truthfulness, Safeness, and Explainable Advanced Foundation Models. Host: Prof. Xingshan Li, Institute of Psychology, Chinese Academy of Sciences.
2024.07, Mitigating Hallucinations in Large Vision-Language Models (LVLMs). Online Talk.
2024.05, Probing Large Language Models from a Human Behavioral Perspective. Host: Dr. Tiansi Dong, NeusymBridge @ LREC-COLING 2024.
2023.06, Probing Large Language Models (LLMs) for Predicting Human Behavioral Data. Hosts: Prof. Markus Hofmann & Prof. Ralph Radach, Universität Wuppertal.

🛠 Academic Services

Funding Reviewer

Vienna Science and Technology Fund — Vienna Research Groups for Young Investigators (2025).

Workshop & Shared Task Organization

POLAR @ SemEval 2026 — Attitude Polarization Detection in Multilingual Text.
CLEF 2025 SemEval Task — Multilingual Text Detoxification.
CLEF 2024 SemEval Task — Multilingual Text Detoxification.

Area Chair

ACL Rolling Review (ARR) — Multimodal Learning, 2024 – 2025.

Session Chair

ECAI 2023 — Speech and Natural Language Processing.

Conference Reviewer

ACL, EMNLP, NAACL, EACL, COLING, LREC-COLING, IJCNLP-AACL, AAAI, IJCAI, NeurIPS, AISTATS, CVPR, ECCV, ACM MM, ECAI — 2019 – present.
IJCAI 2021 — Senior Program Committee.

Journal Reviewer

ACM TALLIP (2020 – 2021); IEEE Access (2020 – 2021).

🐈 Misc.

I have a cat called Minus — his name is the German word for subtraction, with my hope that he can live life on easy mode.