G. Bertasius, S. X. Yu, H. S. Park, and J. Shi, Am I a baller? Basketball skill assessment using first-person cameras, 2016.

J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, Signature verification using a siamese time delay neural network, NIPS, 1993.

A. Burns, R. Kulpa, A. Durny, B. Spanlang, M. Slater et al., Using virtual humans and computer animations to learn complex motor skills: a case study in karate, BIO Web of Conferences, 2011.
URL: https://hal.archives-ouvertes.fr/hal-00640208

W. Chen, Y. Liu, Z. Kira, Y. F. Wang, and J. Huang, A closer look at few-shot classification, 2019.

D. Chung, K. Tahboub, and E. J. Delp, A two stream siamese convolutional neural network for person re-identification, Proceedings of the IEEE International Conference on Computer Vision, pp.1983-1991, 2017.

H. Doughty, D. Damen, and W. Mayol-Cuevas, Who's better? Who's best? Pairwise deep ranking for skill determination, IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.

H. Doughty, W. Mayol-Cuevas, and D. Damen, The pros and cons: Rank-aware temporal attention for skill determination in long videos, IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.

H. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P. Muller, Evaluating surgical skills from kinematic data using convolutional neural networks, Medical Image Computing and Computer-Assisted Intervention - MICCAI, p.11073, 2018.

I. Funke, S. T. Mees, J. Weitz, and S. Speidel, Video-based surgical skill assessment using 3D convolutional neural networks, 2019.

Y. Gao, S. S. Vedula, C. E. Reiley, N. Ahmidi, B. Varadarajan et al., JHU-ISI gesture and skill assessment working set (JIGSAWS): A surgical activity dataset for human motion modeling, 2014.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, 1997.

A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar et al., Large-scale video classification with convolutional neural networks, IEEE Conference on Computer Vision and Pattern Recognition, 2014.

Y. Kim, Convolutional neural networks for sentence classification, EMNLP, 2014.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2015.

T. Komura, B. Lam, R. W. Lau, and H. Leung, e-Learning martial arts, International Conference on Web-Based Learning, 2006.

C. Ledig, L. Theis, F. Huszár, J. Caballero, A. P. Aitken et al., Photo-realistic single image super-resolution using a generative adversarial network, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

Q. Lei, J. Du, H. Zhang, S. Ye et al., A survey of vision-based human action evaluation methods, Sensors, 2019.

Z. Li, Y. Huang, M. Cai, and Y. Sato, Manipulation-skill assessment from videos with spatial attention network, 2019.

M. Morel, C. Achard, R. Kulpa, and S. Dubuisson, Automatic evaluation of sports motion: A generic computation of spatial and temporal errors, Image and Vision Computing, 2017.
URL: https://hal.archives-ouvertes.fr/hal-01586401

M. Morel, C. Achard, R. Kulpa, and S. Dubuisson, Time-series averaging using constrained dynamic time warping with tolerance, 2018.
URL: https://hal.archives-ouvertes.fr/hal-01630288

P. Parmar and B. T. Morris, Learning to score Olympic events, IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017.

P. Parmar and B. T. Morris, Action quality assessment across multiple actions, IEEE Winter Conference on Applications of Computer Vision (WACV), 2018.

H. Pirsiavash, C. Vondrick, and A. Torralba, Assessing the quality of actions, ECCV, 2014.

F. Sung, Y. Yang, L. Zhang, T. Xiang, P. H. Torr et al., Learning to compare: Relation network for few-shot learning, IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.

Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, DeepFace: Closing the gap to human-level performance in face verification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1701-1708, 2014.

D. Tran, L. D. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning spatiotemporal features with 3D convolutional networks, IEEE International Conference on Computer Vision (ICCV), 2015.

J. Wang, Y. J. Song, T. K. Leung, C. Rosenberg, J. Wang et al., Learning fine-grained image similarity with deep ranking, IEEE Conference on Computer Vision and Pattern Recognition, 2014.

Z. Wang and A. Fey, Deep learning with convolutional neural network for objective skill evaluation in robot-assisted surgery, International Journal of Computer Assisted Radiology and Surgery, 2018.

R. E. Ward, Biomechanical perspectives on classical ballet technique and implications for teaching practice, 2012.

T. Yao, T. Mei, and Y. Rui, Highlight detection with pairwise deep ranking for first-person video summarization, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.982-990, 2016.