I am a postdoctoral fellow at the Department of Computer Science and Engineering (CSE), Hong Kong University of Science and Technology (HKUST), advised by Prof. Long Chen. Prior to this, I obtained my PhD degree in Computer Science and Technology from the DCD Lab, Zhejiang University (ZJU), under the supervision of Prof. Jun Xiao.
💬 Research Interests
- Computer Vision, Machine Learning
- Vision and Language, Multimodal Generation and Editing
📝 Publications
(Selected works are shown; see Google Scholar for the full publication list.)

Event-Customized Image Generation
Zhen Wang, Yilei Jiang, Dong Zheng, Jun Xiao, Long Chen
- Customize your event (actions, poses, relations, and interactions) with only a single image!
- Training-free, plug-and-play, and effective.

IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu Chen, Zhen Wang, Runshi Li, Bowei Zhu, Long Chen
- A sample-efficient and broadly applicable LoRA merging method based on an iterative inference-solving framework with enhanced performance.

CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
Ziqi Jiang, Zhen Wang, Long Chen
- Incorporating text signals into drag-based methods for precise and flexible image editing.

Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang, Jun Xiao, Yueting Zhuang, Fei Gao, Jian Shao, Long Chen
- A lightweight prompt-based framework for controllable image captioning.
- Effective and efficient for both single and combined controls.

DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang, Xinyun Jiang, Jun Xiao, Tao Chen, Long Chen
- A diffusion-based explicit caption editing framework with strong generalization ability across various editing and generation scenarios.

FreeTuner: Any Subject in Any Style with Training-free Diffusion
Youcan Xu, Zhen Wang, Jun Xiao, Wei Liu, Long Chen
- A flexible and training-free method for compositional personalization.

Explicit Image Caption Editing
Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao
- A new task, explicit caption editing (ECE), with new benchmarks.
- An effective and efficient ECE model.