Title: Skeleton-based Human Action Understanding via Point Cloud Representation
Date: 2024/04/26 14:20-15:30
Location: CSIE R103
Speaker: Dr. Ryo Hachiuma, Research Scientist, NVIDIA Taiwan
Host: Prof. Shang-Tse Chen
Abstract:
In this presentation, I will introduce two papers on human action understanding that were accepted at CVPR 2023. Unlike appearance-based approaches, skeleton-based human action understanding is robust to changes in background texture and illumination. Whereas conventional skeleton-based approaches represent the human skeleton as a graph or heatmap, we propose representing a sequence of human skeletons as a 3D point cloud in spatio-temporal space. This new representation enables accurate, fast, and robust action recognition. Moreover, by combining the recent Vision & Language training paradigm with the anomalous action detection task, our approach allows users to specify the type of abnormal action they want to detect using natural language expressions.
Biography:
Ryo Hachiuma is a research scientist at NVIDIA Taiwan, specializing in computer vision and deep learning. He received his Ph.D. in Computer Science from Keio University in 2021, supervised by Prof. Hideo Saito. Prior to joining NVIDIA, he worked as a computer vision engineer at Konica Minolta in Japan and as a Project Assistant Professor at Keio University. His research interests include video and multi-modal understanding. Ryo joined NVIDIA in July 2023 and has since been involved in audio-visual understanding and Large Video Understanding Models.