
Bio

Qin Jin is a professor in the School of Information at Renmin University of China, where she leads the Multi-level Multi-aspect Multimedia Analysis (AI.M3) research group. She received her Ph.D. degree in Language and Information Technologies from Carnegie Mellon University in 2007, and her B.Sc. and M.S. degrees in Computer Science and Technology from Tsinghua University, Beijing, China in 1996 and 1999, respectively. Her research interests include multimedia computing and human-computer interaction.

Multi-Level · Multi-Aspect · Multi-Modal

We live in a multi-modal world: we learn, think, and express ourselves through multiple modalities. AI systems should therefore be able to understand this multi-modal world. Our research focuses on building AI systems that understand the world at multiple levels, from multiple aspects, and across multiple modalities.

  • Vision & Language

    image/video captioning, multimodal translation, VQA, etc.

  • Affective Computing

    multimodal emotion recognition, memorability prediction, etc.

  • Natural Dialog

    emotion-eliciting dialogs, etc.

Awards

  • The End-of-End-to-End: A Video Understanding Pentathlon @ CVPR 2020 (Rank 2nd)
  • Outstanding Method Award in VATEX Video Captioning Challenge @ ICCV 2019
  • 2019 Zhijiang Cup (之江杯) Global Artificial Intelligence Competition, Video Content Description Generation (Rank 1st)
  • CVPR 2019 ActivityNet Large Scale Activity Recognition Challenge (ANET) Temporal Captioning Task (Winner)
  • 2019 TRECVID (Video to Text Description) Grand Challenge (Rank 1st)
  • 2019 Audio-Visual Emotion Challenge @ ACM Multimedia 2019 (Winner)
  • CVPR 2018 ActivityNet Large Scale Activity Recognition Challenge (ANET) Temporal Captioning Task (Winner)
  • 2018 TRECVID (Video to Text Description) Grand Challenge (Rank 1st)
  • 2017 TRECVID (Video to Text Description) Grand Challenge (Rank 1st)
  • Best Grand Challenge Paper Award at ACM Multimedia 2017
  • 2017 ACM Multimedia (Video to Language) Grand Challenge (Rank 1st)
  • 2017 Audio-Visual Emotion Challenge (Rank 1st)
  • 2016 ACM Multimedia (Video to Language) Grand Challenge (Rank 1st)
  • 2016 Audio-Visual Emotion Challenge (AVEC) (Rank 2nd)
  • 2016 MediaEval Movie Emotion Impact Challenge (Rank 1st)
  • 2016 Chinese Multimodal Emotion Challenge (MEC) (Rank 2nd)
  • 2016 NLPCC Chinese Weibo Stance Detection (Rank 1st)
  • "Spoken English Assistant" system won the second place price in IBM Bluemix computing contest.
  • 2015 ImageCLEF (Image Sentence Generation) Evaluation (Rank 1st)