About me

Hi, I’m Min Cai (蔡旻). I’m an incoming PhD student at the University of Alberta, supervised by Dr. Xi Ye. Previously, I received my M.S. from Shenzhen University, where I was supervised by Prof. Haodi Zhang. Before that, I obtained my B.A. in Translation from Beijing Language and Culture University. Currently, I’m interning at Zhipu AI, mentored by Dr. Dan Zhang. I also work closely with Dr. Ziniu Hu, Dr. Shichang Zhang, and Dr. Difan Zou.

I have broad interests in ML and NLP, particularly in understanding the mechanisms behind neural language models (LMs), developing LLM agents capable of solving complex problems, and enhancing LLM reasoning abilities. Currently, my primary focus is on inference-time algorithms for alignment and reasoning in LLMs.

Specifically, my current research focuses on:

Interpreting and controlling LLM behaviors for better alignment with human values (e.g., SelfControl)
Developing LLM agents capable of solving complex tasks, such as multi-agent social deduction games (e.g., AvalonBench)
Improving LLM reasoning capabilities, particularly through advanced inference-time algorithms such as Monte Carlo tree search (e.g., Strategist), controlled text generation, and representation engineering

Selected Publications

How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence

Hongzhe Du*, Weikai Li*, Min Cai, Karim Saraipour, Zimin Zhang, Himabindu Lakkaraju, Yizhou Sun, Shichang Zhang (*equal contribution)

Outstanding Paper Award at the New England NLP Workshop

In this paper, we study how post-training reshapes LLMs in terms of knowledge storage, truthfulness, refusal, and confidence, using tools such as causal tracing, linear probing, and the analysis of entropy neurons.

Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu

International Conference on Learning Representations (ICLR 2025)
Covered by State of AI Report 2024, published by Air Street Capital.

Strategist is an advanced game agent that utilizes LLMs to acquire new skills for playing multi-agent games through a self-improvement process.

Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller

Min Cai, Yuchen Zhang, Shichang Zhang, Fan Yin, Dan Zhang, Difan Zou, Yisong Yue, Ziniu Hu

Workshop on Foundation Models in the Wild & Mechanistic Interpretability Workshop, ICML 2024

SelfControl is an inference-time LLM control method that leverages LLM self-evaluation to control model behaviors through representation engineering.

AvalonBench: Evaluating LLMs Playing the Game of Avalon

Jonathan Light*, Min Cai*, Sheng Shen, Ziniu Hu (*equal contribution)

Foundation Models for Decision Making Workshop, NeurIPS 2023

AvalonBench is a benchmark that explores the potential of large language model (LLM) agents in playing the strategic social deduction game Resistance Avalon.

Beyond Academics

In my spare time, I enjoy playing and listening to music (jazz, classical, R&B, etc.). I also like playing games such as the Pokémon Trading Card Game (PTCG), and I love trying all kinds of food.