M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, 2d human pose estimation: New benchmark and state of the art analysis, CVPR, 2014.

A. Benzine, B. Luvison, Q. C. Pham, and C. Achard, Deep, robust and single shot 3d multiperson human pose estimation from monocular images, ICIP, vol.7, 2019.

A. Benzine, B. Luvison, Q. C. Pham, and C. Achard, Deep, robust and single shot 3d multiperson human pose estimation in complex images, vol.7, 2019.

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime multi-person 2d pose estimation using part affinity fields, vol.2, p.3, 2017.

F. Chabot, M. Chaouch, and Q. Pham, Lapnet: Automatic balanced loss and optimal assignment for real-time dense object detection, vol.6, p.7, 2019.

Y. Chen, Z. Wang, Y. Peng, Z. Zhang, G. Yu et al., Cascaded pyramid network for multi-person pose estimation, CVPR, issue.3, 2018.

M. Everingham, S. M. Eslami, L. Van Gool, C. K. Williams, J. Winn et al., The pascal visual object classes challenge: A retrospective, 2015.

M. Fabbri, F. Lanzi, S. Calderara, A. Palazzi, R. Vezzani et al., Learning to detect and track visible and occluded body joints in a virtual world, ECCV, 2018.

H. Fang, Y. Xu, W. Wang, X. Liu, and S. Zhu, Learning pose grammar to encode human body configuration for 3d pose estimation, AAAI Conference on Artificial Intelligence, issue.3, 2018.

T. Golda, T. Kalb, A. Schumann, and J. Beyerer, Human Pose Estimation for Real-World Crowded Scenarios, AVSS, 2019.

K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask r-cnn, ICCV, 2017.

C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu, Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments, vol.36, 2014.

S. Johnson and M. Everingham, Clustered pose and nonlinear appearance models for human pose estimation, BMVC, p.1, 2010.

H. Joo, T. Simon, X. Li, H. Liu, L. Tan et al., Panoptic studio: A massively multiview system for social interaction capture, TPAMI, p.7, 2017.

A. Kendall, Y. Gal, and R. Cipolla, Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, CVPR, 2018.

M. Kocabas, S. Karagoz, and E. Akbas, Multiposenet: Fast multi-person pose estimation using pose residual network, ECCV, 2018.

S. Kreiss, L. Bertoni, and A. Alahi, Pifpaf: Composite fields for human pose estimation, CVPR, 2019.

J. Li, C. Wang, H. Zhu, Y. Mao, H. Fang et al., Crowdpose: Efficient crowded scenes pose estimation and a new benchmark, 2018.

J. Li, C. Wang, H. Zhu, Y. Mao, H. Fang et al., Crowdpose: Efficient crowded scenes pose estimation and a new benchmark, CVPR, 2019.

S. Li and A. B. Chan, 3d human pose estimation from monocular images with deep convolutional neural network, ACCV, 2014.

T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, Feature pyramid networks for object detection, CVPR, vol.4, p.5, 2017.

T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., Microsoft coco: Common objects in context, ECCV, vol.1, p.7, 2014.

W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed et al., Ssd: Single shot multibox detector, ECCV, 2016.

J. Martinez, R. Hossain, J. Romero, and J. J. Little, A simple yet effective baseline for 3d human pose estimation, ICCV, 2017.

D. Mehta, H. Rhodin, D. Casas, P. Fua, O. Sotnychenko et al., Monocular 3d human pose estimation in the wild using improved cnn supervision, 3D Vision, 2017.

D. Mehta, H. Rhodin, D. Casas, O. Sotnychenko, W. Xu et al., Monocular 3d human pose estimation using transfer learning and improved cnn supervision, 2016.

D. Mehta, O. Sotnychenko, F. Mueller, W. Xu, M. Elgharib et al., Xnect: Real-time multi-person 3d human pose estimation with a single rgb camera, vol.3, 2019.

D. Mehta, O. Sotnychenko, F. Mueller, W. Xu, S. Sridhar et al., Single-shot multi-person 3d body pose estimation from monocular rgb input, vol.3, 2018.

G. Moon, Y. Chang, and K. Lee, Camera distance-aware top-down approach for 3d multiperson pose estimation from a single rgb image, ICCV, vol.3, issue.8, 2019.

G. Moon, Y. Chang, and K. Lee, Multi-scale aggregation r-cnn for 2d multi-person pose estimation, CVPR, 2019.

A. Newell, Z. Huang, and J. Deng, Associative embedding: End-to-end learning for joint detection and grouping, NIPS, vol.2, p.3, 2017.

G. Papandreou, T. Zhu, N. Kanazawa, A. Toshev, J. Tompson, C. Bregler, and K. Murphy, Towards accurate multi-person pose estimation in the wild, CVPR, 2017.

S. Park, J. Hwang, and N. Kwak, 3d human pose estimation using convolutional neural networks with 2d pose information, ECCV, 2016.

G. Pavlakos, X. Zhou, K. G. Derpanis, and K. Daniilidis, Coarse-to-fine volumetric prediction for single-image 3d human pose, CVPR, 2017.

J. Redmon and A. Farhadi, Yolo9000: better, faster, stronger, CVPR, 2017.

J. Redmon and A. Farhadi, Yolov3: An incremental improvement, 2018.

S. Ren, K. He, R. Girshick, and J. Sun, Faster r-cnn: Towards real-time object detection with region proposal networks, NIPS, vol.4, 2015.

H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid et al., Generalized intersection over union: A metric and a loss for bounding box regression, CVPR, 2019.

G. Rogez, P. Weinzaepfel, and C. Schmid, Lcr-net: Localization-classification-regression for human pose, CVPR, vol.3, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01505085

G. Rogez, P. Weinzaepfel, and C. Schmid, Lcr-net++: Multi-person 2d and 3d pose detection in natural images, TPAMI, vol.3, issue.8, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01961189

W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken et al., Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network, CVPR, 2016.

X. Sun, J. Shang, S. Liang, and Y. Wei, Compositional human pose regression, ICCV, 2017.

X. Sun, B. Xiao, F. Wei, S. Liang, and Y. Wei, Integral human pose regression, ECCV, vol.3, 2018.

B. Tekin, I. Katircioglu, M. Salzmann, V. Lepetit, and P. Fua, Structured prediction of 3d human pose with deep neural networks, 2016.

Z. Tian, C. Shen, H. Chen, and T. He, Fcos: Fully convolutional one-stage object detection, 2019.

B. Xiao, H. Wu, and Y. Wei, Simple baselines for human pose estimation and tracking, ECCV, 2018.

W. Yang, W. Ouyang, X. Wang, J. Ren, H. Li et al., 3d human pose estimation in the wild by adversarial learning, CVPR, issue.3, 2018.

A. Zanfir, E. Marinoiu, and C. Sminchisescu, Monocular 3d pose and shape estimation of multiple people in natural scenes-the importance of multiple scene constraints, CVPR, vol.7, 2018.

A. Zanfir, E. Marinoiu, M. Zanfir, A. Popa, and C. Sminchisescu, Deep network for the integrated 3d sensing of multiple people in natural images, NIPS, vol.7, 2018.

X. Zhou, Q. Huang, X. Sun, X. Xue, and Y. Wei, Towards 3d human pose estimation in the wild: a weakly-supervised approach, ICCV, 2017.