Journal and Conference Publications

Context-aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training
Jiatong Shi, Nan Huo, Qin Jin
Interspeech, 2020.
VideoIC: A Video Interactive Comments Dataset and Multimodal Multitask Learning for Comments Generation
Weiying Wang, Jieting Chen, Qin Jin
ACM Multimedia, 2020.
ICECAP: Information Concentrated Entity-aware Image Captioning
Anwen Hu, Shizhe Chen, Qin Jin
ACM Multimedia, 2020.
Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching
Jingjun Liang, Ruichen Li, Qin Jin
ACM Multimedia, 2020.
Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs
Shizhe Chen, Qin Jin, Peng Wang, Qi Wu
CVPR, 2020.
Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning
Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu
CVPR, 2020.
Better Captioning With Sequence-Level Exploration
Jia Chen, Qin Jin
CVPR, 2020.
Skeleton-based Interactive Graph Network for Human Object Interaction Detection
Sipeng Zheng, Shizhe Chen, Qin Jin
ICME, 2020.
Unsupervised Bilingual Lexicon Induction from Mono-lingual Multimodal Data
Shizhe Chen, Qin Jin, Alexander Hauptmann
AAAI, 2019.
Cross-culture Multimodal Emotion Recognition with Adversarial Learning
Jingjun Liang, Shizhe Chen, Jinming Zhao, Qin Jin, Haibo Liu, Li Lu
ICASSP, 2019.
ActivityNet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Video
Shizhe Chen, Yuqing Song, Yida Zhao, Qin Jin, Zhaoyang Zeng, Bei Liu, Jianlong Fu, Alexander Hauptmann
CVPR 2019, ActivityNet Large Scale Activity Recognition Challenge.
From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots
Shizhe Chen, Qin Jin, Jianlong Fu
IJCAI, 2019.
Generating Video Descriptions With Latent Topic Guidance
Shizhe Chen, Qin Jin, Jia Chen, Alexander G. Hauptmann
IEEE Transactions on Multimedia (TMM), 2019.
Speech Emotion Recognition in Dyadic Dialogues
Jinming Zhao, Shizhe Chen, Jingjun Liang, Qin Jin
Interspeech, 2019.
Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards
Yuqing Song, Shizhe Chen, Qin Jin
ACM Multimedia, 2019.
Visual Relation Detection with Multi-Level Attention
Sipeng Zheng, Shizhe Chen, Qin Jin
ACM Multimedia, 2019.
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
Shizhe Chen, Bei Liu, Jianlong Fu, Ruihua Song, Qin Jin, Pingping Lin, Xiaoyu Qi, Chunting Wang, Jin Zhou
ACM Multimedia, 2019.
Relation Understanding in Videos
Sipeng Zheng, Xiangyu Chen, Shizhe Chen, Qin Jin
ACM Multimedia, Grand Challenge: Relation Understanding in Videos, 2019.
Adversarial Domain Adaption for Multi-Cultural Dimensional Emotion Recognition in Dyadic Interactions
Jinming Zhao, Ruichen Li, Jingjun Liang, Qin Jin
AVEC, 2019.
Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019
Shizhe Chen, Yida Zhao, Yuqing Song, Qin Jin, Qi Wu
ICCV, VATEX Video Captioning Challenge 2019.
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension
Weiying Wang, Yongcheng Wang, Shizhe Chen, Qin Jin
EMNLP, 2019.
RUC_AIM3 at TRECVID 2019: Video to Text
Yuqing Song, Yida Zhao, Shizhe Chen, Qin Jin
NIST TRECVID, 2019.
Semi-supervised Multimodal Emotion Recognition With Improved Wasserstein GANs
Jingjun Liang, Shizhe Chen, Qin Jin
APSIPA ASC, 2019.
RUC+CMU: System Report for Dense Captioning Events in Videos
Shizhe Chen, Yuqing Song, Yida Zhao, Qin Jin, Alexander Hauptmann
CVPR ActivityNet Large Scale Activity Recognition Challenge, 2018.
Class-aware Self-Attention for Audio Event Recognition
Shizhe Chen, Jia Chen, Qin Jin, Alexander Hauptmann
ICMR, 2018. (Best Paper Runner-up)
Multimodal Dimensional and Continuous Emotion Recognition in Dyadic Video Interactions
Jinming Zhao, Shizhe Chen, Qin Jin
Pacific-Rim Conference on Multimedia (PCM), 2018.
iMakeup: Makeup Instructional Video Dataset for Fine-grained Dense Video Captioning
Xiaozhu Lin, Qin Jin, Shizhe Chen, Yuqing Song, Yida Zhao
Pacific-Rim Conference on Multimedia (PCM), 2018.
Multi-modal Multi-cultural Dimensional Continuous Emotion Recognition in Dyadic Interactions
Jinming Zhao, Ruichen Li, Shizhe Chen, Qin Jin
ACM Multimedia Audio-Visual Emotion Challenge (AVEC) Workshop, 2018.
Video Captioning with Guidance of Multimodal Latent Topics
Shizhe Chen, Jia Chen, Qin Jin, Alexander Hauptmann
ACM Multimedia, 2017.
Knowing Yourself: Improving Video Caption via In-depth Recap
Qin Jin, Shizhe Chen, Jia Chen, Alexander Hauptmann
ACM Multimedia, 2017. (Best Grand Challenge Paper)
Multimodal Multi-task Learning for Dimensional and Continuous Emotion Recognition
Shizhe Chen, Qin Jin, Jinming Zhao, Shuai Wang
ACM Multimedia Audio-Visual Emotion Challenge (AVEC) Workshop, 2017.
Generating Video Descriptions with Topic Guidance
Shizhe Chen, Jia Chen, Qin Jin
ICMR, 2017.
Emotion Recognition with Multimodal Features and Temporal Models
Shuai Wang, Wenxuan Wang, Jinming Zhao, Shizhe Chen, Qin Jin, Shilei Zhang, Yong Qin
ICMI, 2017.
Facial Action Units Detection with Multi-Features and AUs Fusion
Xinrui Li, Shizhe Chen, Qin Jin
IEEE Automatic Face & Gesture Recognition (FG), 2017.
Boosting Recommendation in Unexplored Categories by User Price Preference
Jia Chen, Qin Jin, Shiwan Zhao, Shenghua Bao, Li Zhang, Zhong Su, Yong Yu
ACM Transactions on Information Systems (TOIS), 2016.
Video Emotion Recognition in the Wild Based on Fusion of Multimodal Features
Shizhe Chen, Xinrui Li, Qin Jin, Shilei Zhang, Yong Qin
ICMI, 2016.
Describing Videos using Multi-modal Fusion
Qin Jin, Jia Chen, Shizhe Chen, Yifan Xiong
ACM Multimedia, 2016.
Semantic Image Profiling for Historic Events: Linking Images to Phrases
Jia Chen, Qin Jin, Yifan Xiong
ACM Multimedia, 2016.
Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction
Shizhe Chen, Qin Jin
ACM Multimedia, 2016.
History Rhyme: Searching Historic Events by Multimedia Knowledge
Yifan Xiong, Jia Chen, Qin Jin, Chao Zhang
ACM Multimedia, 2016.
Detecting Violence in Video using Subclasses
Xirong Li, Yujia Huo, Qin Jin, Jieping Xu
ACM Multimedia, 2016.
Generating Natural Video Descriptions via Multimodal Processing
Qin Jin, Junwei Liang, Xiaozhu Lin
Interspeech, 2016.
Improving Image Captioning by Concept-based Sentence Reranking
Xirong Li, Qin Jin
Pacific-Rim Conference on Multimedia (PCM), 2016. (Best Paper Runner-up)
Video Description Generation using Audio and Visual Cues
Qin Jin, Junwei Liang
ICMR, 2016.
Exploitation and Exploration Balanced Hierarchical Summary for Landmark Images
Jia Chen, Qin Jin, Shenghua Bao, Junfeng Ye, Zhong Su, Shimin Chen, Yong Yu
IEEE Transactions on Multimedia (TMM), 2015.
Lead Curve Detection in Drawings with Complex Cross-Points
Jia Chen, Min Li, Qin Jin, Yongzhe Zhang, Shenghua Bao, Zhong Su, Yong Yu
Neurocomputing, 2015, 168: 35-46.
Image Profiling for History Events on the Fly
Jia Chen, Qin Jin, Yong Yu, Alexander G. Hauptmann
ACM Multimedia, 2015.
Persistent B+-Trees in Non-Volatile Main Memory
Shimin Chen and Qin Jin
VLDB, 2015.
Semantic Concept Annotation for User Generated Videos Using Soundtracks
Qin Jin, Junwei Liang, Xixi He, Gang Yang, Jieping Xu, Xirong Li
ICMR, 2015.
Speech Emotion Recognition With Acoustic and Lexical Features
Qin Jin, Chengxin Li, Shizhe Chen, Huimin Wu
ICASSP, 2015.
Detecting Semantic Concepts In Consumer Videos Using Audio
Junwei Liang, Qin Jin, Xixi He, Gang Yang, Jieping Xu, Xirong Li
ICASSP, 2015.
Does Product Recommendation Meet its Waterloo in Unexplored Categories? No, Price Comes to Help
Jia Chen, Qin Jin, Shiwan Zhao, Shenghua Bao, Li Zhang, Zhong Su, Yong Yu
SIGIR, 2014.
Semantic Concept Annotation of Consumer Videos at Frame-level Using Audio
Junwei Liang, Qin Jin, Xixi He, Xirong Li, Gang Yang, Jieping Xu
Pacific-Rim Conference on Multimedia (PCM), 2014.
Speech Emotion Classification using Acoustic Features
Shizhe Chen, Qin Jin, Xirong Li, Gang Yang, Jieping Xu
ISCSLP, 2014.