🔥 News

2023.02: 🎉 One paper has been accepted to ICASSP 2023.

📝 Publications

Image Captioning

ICASSP 2023

End-to-End Non-Autoregressive Image Captioning

Hong Yu, Yuanqiu Liu, Baokun Qi, Zhaolong Hu, Han Liu

PDF | Code

Most existing image captioning models generate captions autoregressively, which leads to high inference latency. Non-autoregressive decoding generates words in parallel, which greatly improves inference speed. However, non-autoregressive decoding usually degrades performance because the decoder loses its word input. In this paper, we propose a semantic retrieval module that uses image features to retrieve semantic information as input to the non-autoregressive decoder, narrowing the performance gap between the non-autoregressive and autoregressive models. Furthermore, we adopt Swin-Transformer instead of Faster R-CNN to extract image features, thus building an end-to-end image captioning model. Experiments conducted on the MSCOCO dataset show that our model achieves a new state-of-the-art CIDEr score of 122.6% on the ‘Karpathy’ offline test split with a 37× inference speedup.


🎖 Honors and Awards

2022.11 Ascend AI Innovation Competition, National Finals, Bronze Award
2022.09 Ascend AI Innovation Competition, Dalian Division, Gold Award

📖 Education

2020.09 - present, Ph.D., School of Software, Dalian University of Technology.
2016.09 - 2020.06, Undergraduate, School of Software, Dalian University of Technology.

💬 Invited Talks

2023.06, xxxxxxxxx. | [video]

💻 Internships

2016.09 - 2023.06, Dalian University of Technology, China.