Multivariate methods for the analysis of complex and big data in forensic sciences. Application to age estimation in living persons
Journal article. Forensic Science International, 2016

Abstract

Researchers handle increasingly high-dimensional datasets, with many variables to explore. Such datasets pose several problems, since they are difficult to handle and present unexpected features. As dimensionality increases, classical statistical analysis becomes inoperative. Variables can be redundant, and reducing the dimensionality of a dataset to its lowest possible value is often needed. Principal components analysis (PCA) has proven useful for reducing dimensionality but presents several shortcomings. Like other fields, forensic sciences will face the specific issues raised by an ever-growing quantity of data to be integrated. Age estimation in living persons, an unsolved problem so far, could benefit from the integration of various sources of data, e.g. clinical, dental and radiological data. We present here novel multivariate techniques (nonlinear dimensionality reduction techniques, NLDR), applied to a theoretical example. Results were compared to those of PCA. NLDR techniques were then applied to clinical, dental and radiological data (13 variables) used for age estimation. The correlation dimension of these data was estimated. NLDR techniques outperformed PCA. They showed that two living persons sharing similar characteristics may present rather different estimated ages. Moreover, the data presented a very high informational redundancy, i.e. a correlation dimension of 2. NLDR techniques should be used with, or preferred to, PCA techniques to analyze complex and big data. Data routinely used for age estimation may not be considered suitable for this purpose. How integrating other data or approaches could improve age estimation in living persons remains uncertain.
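
The notion of correlation dimension used in the abstract can be illustrated with a minimal sketch. It assumes a Grassberger-Procaccia-style estimate (the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r) and runs on synthetic data only. It is not the authors' code, data, or exact procedure: the function names, the 13-variable synthetic surface, the noise cloud, and the chosen quantiles are all assumptions made for illustration.

```python
import math
import random


def correlation_dimension(points, quantiles=(0.005, 0.01, 0.02, 0.04)):
    """Slope of log C(r) against log r over a few small radii (illustrative)."""
    # All pairwise Euclidean distances, sorted so quantiles can be read off.
    dists = sorted(
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    )
    # For a quantile q, the radius r is chosen so that about a fraction q
    # of the pairs is closer than r, i.e. C(r) is approximately q.
    xs = [math.log(dists[int(q * len(dists))]) for q in quantiles]
    ys = [math.log(q) for q in quantiles]
    # Ordinary least-squares slope of ys on xs.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def curved_surface(n, n_vars=13, seed=0):
    """Hypothetical data: 13 redundant variables driven by 2 latent factors."""
    rng = random.Random(seed)
    coefs = [
        (rng.uniform(-2, 2), rng.uniform(-2, 2), rng.uniform(0, 2 * math.pi))
        for _ in range(n_vars)
    ]
    pts = []
    for _ in range(n):
        u, v = rng.random(), rng.random()  # the two latent factors
        pts.append([math.sin(a * u + b * v + c) for a, b, c in coefs])
    return pts


def noise_cloud(n, n_vars=13, seed=1):
    """Hypothetical data: 13 independent uniform variables, no redundancy."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_vars)] for _ in range(n)]


d_surface = correlation_dimension(curved_surface(400))
d_cloud = correlation_dimension(noise_cloud(400))
print(f"13 redundant variables:   estimated dimension {d_surface:.2f}")
print(f"13 independent variables: estimated dimension {d_cloud:.2f}")
```

On the redundant dataset the estimate should land near 2 even though 13 variables are recorded, while the independent cloud yields a markedly higher value. With a few hundred points such estimates are rough and tend to underestimate high dimensions, so they are best read as orders of magnitude.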
Main file: Lefevre_Multivariate.pdf (448.65 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01320693, version 1 (24-05-2016)

Cite

Thomas Lefèvre, Patrick Chariot, Pierre Chauvin. Multivariate methods for the analysis of complex and big data in forensic sciences. Application to age estimation in living persons. Forensic Science International, 2016, ⟨10.1016/j.forsciint.2016.05.014⟩. ⟨hal-01320693⟩