US 11,816,795 B2
Photo-video based spatial-temporal volumetric capture system for dynamic 4D human face and body digitization
Kenji Tashiro, San Jose, CA (US); Chuen-Chien Lee, Pleasanton, CA (US); and Qing Zhang, San Jose, CA (US)
Assigned to SONY GROUP CORPORATION, Tokyo (JP)
Appl. No. 17/414,540
Filed by Sony Group Corporation, Tokyo (JP)
PCT Filed Dec. 20, 2019; PCT No. PCT/US2019/068151; § 371(c)(1), (2) Date Jun. 16, 2021; PCT Pub. No. WO2020/132631; PCT Pub. Date Jun. 25, 2020.
Claims priority of provisional application 62/782,862, filed on Dec. 20, 2018.
Claims priority of application No. PCT/US2019/068151 (WO), filed on Dec. 20, 2019.
Prior Publication US 2022/0044478 A1, Feb. 10, 2022
Int. Cl. G06T 17/20 (2006.01); G06V 40/16 (2022.01); G06T 15/04 (2011.01); H04N 23/611 (2023.01); H04N 23/951 (2023.01)

CPC G06T 17/20 (2013.01) [G06T 15/04 (2013.01); G06V 40/165 (2022.01); G06V 40/171 (2022.01); H04N 23/611 (2023.01); H04N 23/951 (2023.01)]

21 Claims

1. A method comprising:

capturing content using one or more photography cameras and one or more video cameras, wherein capturing the content includes capturing dynamic facial expressions and dynamic body actions with the one or more photography cameras and one or more video cameras;

triggering, with a device, the one or more photography cameras and the one or more video cameras to acquire one or more keyframes; and

generating, with the device, one or more models based on the captured content and the one or more keyframes, wherein the one or more models are used to implement: mesh-tracking based temporal shape super-resolution on a first resolution but a first frame-rate video-based 4D scanned volumetric sequence refined by using 3D scanned second resolution, higher than the first resolution, templates at multiple keyframes, captured by both the one or more photography cameras and the one or more video cameras, for recovering second resolution, higher than the first resolution surface dynamics in an action sequence.
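
As an illustration only, the refinement step recited in claim 1 can be sketched as transferring surface detail from keyframe templates onto a tracked video sequence. The sketch below is not the patented implementation. It assumes the video-based sequence is already mesh-tracked (the same vertex order in every frame) and that each second-resolution keyframe template has been registered to that vertex order, so detail reduces to a per-vertex offset. The names `detail_offsets` and `refine_sequence` and the linear blending between neighboring keyframes are hypothetical choices made for this example.

```python
from bisect import bisect_right

Vertex = tuple[float, float, float]
Mesh = list[Vertex]


def detail_offsets(template: Mesh, tracked: Mesh) -> Mesh:
    """Per-vertex difference between a keyframe template and the tracked mesh."""
    return [tuple(t - c for t, c in zip(tv, cv)) for tv, cv in zip(template, tracked)]


def refine_sequence(tracked_frames: list[Mesh], keyframe_templates: dict[int, Mesh]) -> list[Mesh]:
    """Add keyframe detail to every tracked frame (illustrative assumption).

    tracked_frames: one mesh per video frame, consistent vertex order.
    keyframe_templates: frame index -> registered template mesh at that keyframe.
    """
    key_ids = sorted(keyframe_templates)
    if not key_ids:
        raise ValueError("at least one keyframe template is required")

    # Detail captured at each keyframe, expressed relative to the tracked mesh.
    offsets = {k: detail_offsets(keyframe_templates[k], tracked_frames[k]) for k in key_ids}

    refined = []
    for f, coarse in enumerate(tracked_frames):
        # Locate the keyframes on either side of frame f (clamped at the ends).
        pos = bisect_right(key_ids, f)
        prev_k = key_ids[max(pos - 1, 0)]
        next_k = key_ids[min(pos, len(key_ids) - 1)]
        w = 0.0 if next_k == prev_k else (f - prev_k) / (next_k - prev_k)

        # Linearly blend the two keyframes' offsets and add them to the coarse mesh.
        frame = []
        for v, (a, b) in zip(coarse, zip(offsets[prev_k], offsets[next_k])):
            frame.append(tuple(c + (1.0 - w) * oa + w * ob for c, oa, ob in zip(v, a, b)))
        refined.append(frame)
    return refined


# Toy data: a single vertex moving along x over five frames, keyframes at 0 and 4.
tracked = [[(float(i), 0.0, 0.0)] for i in range(5)]
templates = {0: [(0.0, 0.0, 1.0)], 4: [(4.0, 0.0, 3.0)]}
refined = refine_sequence(tracked, templates)
```

In this toy case the refined sequence keeps the per-frame motion from the tracked video while the detail offset varies smoothly between the two keyframes. A real system would additionally need registration between meshes of different resolution, which the sketch deliberately leaves out.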