Xumin Yu

I am a fifth-year Ph.D. student in the Department of Automation at Tsinghua University, advised by Prof. Jiwen Lu. In 2020, I obtained my B.Eng. from the Department of Electronic Engineering, Tsinghua University.

I am broadly interested in computer vision and deep learning. My current research focuses on 3D vision and video analysis.

Email  /  Google Scholar  /  GitHub

News

  • 2022-09: One paper (P2P) is accepted to NeurIPS 2022.
  • 2022-03: 3 papers on 3D vision and video understanding are accepted to CVPR 2022.
  • 2021-10: Our solution based on PoinTr won 1st place in the MVP Completion Challenge (ICCV 2021 Workshop).
  • 2021-07: 2 papers (including 1 oral) on 3D vision and video understanding are accepted to ICCV 2021.
Publications

    * indicates equal contribution

    P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
    Ziyi Wang*, Xumin Yu*, Yongming Rao*, Jie Zhou, Jiwen Lu
    Conference on Neural Information Processing Systems (NeurIPS), 2022
    Spotlight
    [arXiv] [Code] [Project Page] [中文解读]

    P2P is a framework to leverage large-scale pre-trained image models for 3D point cloud analysis.

    Point-BERT: Pre-Training 3D Point Cloud Transformers with Masked Point Modeling
    Xumin Yu*, Lulu Tang*, Yongming Rao*, Tiejun Huang, Jie Zhou, Jiwen Lu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
    [arXiv] [Code] [Project Page] [中文解读]

    Point-BERT is a new paradigm for learning Transformers in an unsupervised manner by generalizing the concept of BERT onto 3D point cloud data.

    PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers
    Xumin Yu*, Yongming Rao*, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou
    IEEE International Conference on Computer Vision (ICCV), 2021
    Oral Presentation
    [arXiv] [supp] [Code] [中文解读 (by CVer)]

    PoinTr is a transformer-based framework that reformulates point cloud completion as a set-to-set translation problem.

    Group-aware Contrastive Regression for Action Quality Assessment
    Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou
    IEEE International Conference on Computer Vision (ICCV), 2021
    [arXiv] [Code]

    We propose a new contrastive regression (CoRe) framework to learn the relative scores by pair-wise comparison, which highlights the differences between videos and guides the models to learn the key hints for assessment.

    Graph Interaction Networks for Relation Transfer in Human Activity Videos
    Yansong Tang, Yi Wei, Xumin Yu, Jiwen Lu, Jie Zhou
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2020
    [Paper]

    We propose a graph interaction networks (GINs) model for transferring relation knowledge across two graphs, evaluated in two different video analysis scenarios: a newly proposed setting for unsupervised skeleton-based action recognition across different datasets, and supervised group activity recognition with multi-modal inputs.

    Learning fine-grained estimation of physiological states from coarse-grained labels by distribution restoration
    Zengyi Qin, Jiansheng Chen, Zhenyu Jiang, Xumin Yu, Chunhua Hu, Yu Ma, Suhua Miao and Rongsong Zhou
    Scientific Reports, 2020
    [Paper] [Code]

    Our method allows machine learning algorithms to perform fine-grained estimation of physiological states (e.g., sleep depth) even if the training labels are coarse-grained.

Honors and Awards

  • Excellent Undergraduate in Tsinghua University, 2020
  • The First Prize of Microsoft Imagine Cup, China Finals, 2018



    © Yu Xumin | Last updated: August 3, 2021