Towards Fully Mobile 3D Face, Body, and Environment Capture Using Only Head-worn Cameras
- Authors
- Cha, Young-Woon; Price, True; Wei, Zhen; Lu, Xinran; Rewkowski, Nicholas; Chabra, Rohan; Qin, Zihe; Kim, Hyounghun; Su, Zhaoqi; Liu, Yebin; Ilie, Adrian; State, Andrei; Xu, Zhenlin; Frahm, Jan-Michael; Fuchs, Henry
- Issue Date
- Nov-2018
- Publisher
- IEEE COMPUTER SOC
- Keywords
- Telepresence; Ego-centric Vision; Motion Capture; Convolutional Neural Networks
- Citation
- IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, v.24, no.11, pp.2993 - 3004
- Journal Title
- IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
- Volume
- 24
- Number
- 11
- Start Page
- 2993
- End Page
- 3004
- URI
- https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/83238
- DOI
- 10.1109/TVCG.2018.2868527
- ISSN
- 1077-2626
- Abstract
- We propose a new approach for 3D reconstruction of dynamic indoor and outdoor scenes in everyday environments, leveraging only cameras worn by a user. This approach allows 3D reconstruction of experiences at any location and virtual tours from anywhere. The key innovation of the proposed ego-centric reconstruction system is to capture the wearer's body pose and facial expression from near-body views, e.g., cameras on the user's glasses, and to capture the surrounding environment using outward-facing views. The main challenge of ego-centric reconstruction, however, is the poor coverage of the near-body views - that is, the user's body and face are observed from vantage points that are convenient for wear but inconvenient for capture. To overcome this challenge, we propose a parametric-model-based approach to user motion estimation. This approach utilizes convolutional neural networks (CNNs) for near-view body pose estimation, and we introduce a CNN-based approach for facial expression estimation that combines audio and video. For each time-point during capture, the intermediate model-based reconstructions from these systems are used to re-target a high-fidelity pre-scanned model of the user. We demonstrate that the proposed self-sufficient, head-worn capture system is capable of reconstructing the wearer's movements and their surrounding environment in both indoor and outdoor situations without any additional views. As a proof of concept, we show how the resulting 3D-plus-time reconstruction can be immersively experienced within a virtual reality system (e.g., the HTC Vive). We expect that the size of the proposed egocentric capture-and-reconstruction system will eventually be reduced to fit within future AR glasses, and will be widely useful for immersive 3D telepresence, virtual tours, and general use-anywhere 3D content creation.
- Files in This Item
- There are no files associated with this item.
- Appears in Collections
- ETC > 1. Journal Articles