I am a PhD student at the University of California, Merced, working on generation, and I am fortunate to be advised by Prof. Ming-Hsuan Yang. Previously, I worked on visual-language learning with Prof. Winston Hsu at National Taiwan University. I’m interested in developing intelligence that can interact with humans. My research area is computer vision, partly at its intersection with natural language processing.
Selected Publications
Exploiting Diffusion Prior for Generalizable Dense Prediction
A universal transferring framework.
Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling
A double-stream video-language encoder.
Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston Hsu
BMVC 2022 (spotlight)