Yansong Tang

I am currently a Sponsored Postdoctoral Researcher in the Department of Engineering Science at the University of Oxford, working with Prof. Philip H. S. Torr. My research interests lie in computer vision. Currently, I am working in the fields of video understanding and 3D reconstruction.

Prior to that, I received my Ph.D. degree from Tsinghua University, advised by Prof. Jie Zhou and Prof. Jiwen Lu, and my B.S. degree in Automation, also from Tsinghua University. I have also spent time at the Visual Computing Group of Microsoft Research Asia (MSRA) and at Prof. Song-Chun Zhu’s VCLA lab at the University of California, Los Angeles (UCLA).

Our group at the University of Oxford is looking for self-motivated interns and visitors to work on research projects related to 3D reconstruction and video understanding. If you are interested in joining my group at Oxford, please do not hesitate to drop me an email with your resume. Remote collaboration is also welcome.

Email  /  CV  /  Google Scholar  /  GitHub  /  Research Statement

News

  • 2021-06: One paper on self-supervised learning has been accepted by ACM Transactions on Multimedia Computing Communications and Applications (TOMM).
  • 2020-08: I was awarded Excellent PhD Graduate of Beijing.
  • 2020-07: My Ph.D. dissertation was awarded Excellent Doctoral Dissertation of Tsinghua University.
Selected Publications

    * indicates equal contribution

    Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning
    Yansong Tang*, Zhenyu Jiang*, Zhenda Xie*, Yue Cao, Zheng Zhang, Philip H. S. Torr, Han Hu
    Technical Report, 2021
    [arxiv] [code] (to come)

    We observe a collapse phenomenon when directly applying a fully convolutional cycle-consistency method to video correspondence learning, study the underlying reason behind it, and propose a spatial transformation approach to address this issue.

    Hierarchical Interaction Network for Video Object Segmentation from Referring Expressions
    Zhao Yang*, Yansong Tang*, Luca Bertinetto, Hengshuang Zhao, Philip H. S. Torr
    Technical Report, 2021
    [arxiv] [code] (to come)

    We present an end-to-end hierarchical interaction network for video object segmentation from referring expressions, which leverages the feature pyramid produced by the visual encoder to generate multiple levels of multi-modal features.

    Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation
    Yansong Tang, Jiwen Lu, and Jie Zhou
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
    [arXiv] [Project Page]

    Journal version of the COIN dataset.

    Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition
    Yansong Tang*, Xingyu Liu*, Xumin Yu, Danyang Zhang, Jiwen Lu, and Jie Zhou
    ACM Transactions on Multimedia Computing Communications and Applications (TOMM), 2021
    [arxiv] [code] (to come)

    We devise a temporal-spatial Cubism strategy, which guides the network to be aware of the permutation of the segments in the temporal domain and of the body parts in the spatial domain separately, thus improving the generalization ability of the model for cross-dataset action recognition.

    Uncertainty-aware Score Distribution Learning for Action Quality Assessment
    Yansong Tang*, Zanlin Ni*, Jiahuan Zhou, Danyang Zhang, Jiwen Lu, Ying Wu, and Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
    Oral Presentation
    [arxiv] [Code]

    We propose an uncertainty-aware score distribution learning method and extend it to a multi-path model for action quality assessment.

    Graph Interaction Networks for Relation Transfer in Human Activity Videos
    Yansong Tang, Yi Wei, Xumin Yu, Jiwen Lu, and Jie Zhou
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2020
    [PDF] [code] (to come)

    We propose a graph interaction networks (GINs) model for transferring relation knowledge across two graphs in two different scenarios for video analysis: a newly proposed setting for unsupervised skeleton-based action recognition across different datasets, and supervised group activity recognition with multi-modal inputs.

    Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition
    Yansong Tang, Jiwen Lu, Zian Wang, Ming Yang, and Jie Zhou
    IEEE Transactions on Image Processing (TIP), 2019
    [PDF] [Supp]

    We extend our Semantics-Preserving Attention model with a graph convolutional module for group activity recognition.

    COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis
    Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
    [arXiv] [Project Page] [Annotation Tool]

    COIN is one of the largest and most comprehensive instructional video analysis datasets with rich annotations.

    Multi-stream Deep Neural Networks for RGB-D Egocentric Action Recognition
    Yansong Tang, Zian Wang, Jiwen Lu, Jianjiang Feng, and Jie Zhou
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2019
    [PDF] [Project Page] [Code]

    We propose a multi-stream deep neural network and the THU-READ dataset for RGB-D egocentric action recognition.

    Mining Semantics-Preserving Attention for Group Activity Recognition
    Yansong Tang, Zian Wang, Peiyang Li, Jiwen Lu, Ming Yang, and Jie Zhou
    ACM Multimedia (MM), 2018
    Oral Presentation
    [PDF]

    We present a simple yet effective semantics-preserving attention module for group activity recognition.

    Deep Progressive Reinforcement Learning for Skeleton-based Action Recognition
    Yansong Tang*, Yi Tian*, Jiwen Lu, Peiyang Li, and Jie Zhou
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018
    [PDF]

    We propose a simple yet effective method to select key frames for skeleton-based action recognition using the REINFORCE algorithm.

    Honors and Awards

  • Excellent PhD Graduate of Beijing, 2020.
  • Excellent Doctoral Dissertation of Tsinghua University, 2020.
  • Zijing Scholar Fellowship for Prospective Researcher, Tsinghua University, 2020.
  • National Scholarship, Tsinghua University, 2018.
  • Outstanding Student Cadres, Tsinghua University, 2015.
  • EMC Integrated Merit Scholarship, Tsinghua University, 2014.
  • Zhang-Mingwei Scholarship, Tsinghua University, 2013.
  • MC Integrated Merit Scholarship, Tsinghua University, 2012.
Academic Services

  • Conference Reviewer: CVPR 2019/2020, ICCV 2019, ECCV 2020, AAAI 2020/2021, ICME 2019/2020/2021, ICIP 2018/2019
  • Journal Reviewer: TPAMI, TIP, TMM, TCSVT
