Machine learning and co-evolution methods for protein-protein interactions

Maureen Muscat

Thesis, Year: 2022

Méthodes d’apprentissage et coévolution pour les interactions protéines-protéines

Role: Author

Sorbonne Université - UFR d'Ingénierie

Statistical Genomics and Biological Physics [LCQB]

Abstract

This thesis focuses on machine learning for predicting protein-protein interactions (PPIs). The study of PPIs is central to biology, because proteins interact with one another to form the complex networks that carry out the biological functions of cells. Experimental techniques for determining when and how proteins interact are costly and time-consuming, so there is a strong need for computational methods that can predict PPIs. We explore machine learning approaches based on coevolution and on deep learning for PPI prediction. Coevolutionary methods such as Direct Coupling Analysis (DCA) have been used successfully for several tasks, including the prediction of intra-protein contacts, inter-protein contacts, and mutational landscapes. During my Ph.D., I developed FilterDCA, a supervised machine learning algorithm that predicts inter-domain and inter-protein contact maps. The aim was to add some supervision, using typical contact patterns, while keeping the tool interpretable. I also worked on PPIs in the SARS-CoV-2 virus and on a multi-protein complex found in the membranes of some bacteria.

Keywords

Coevolution; Protein-protein interaction; Contact prediction; Structure prediction; Machine learning; Direct coupling analysis

Domains

Cell Behavior [q-bio.CB]; Populations and Evolution [q-bio.PE]

Main file

MUSCAT_Maureen_theseV1_2022.pdf (105.29 MB)

Origin: Version validated by the jury (STAR)

https://hal.sorbonne-universite.fr/tel-04029405

Submitted on: Friday, April 21, 2023, 16:11:20

Last modified on: Thursday, November 28, 2024, 03:24:13

Long-term archiving on: Saturday, July 22, 2023, 19:03:22

Dates and versions

tel-04029405, version 1 (21-04-2023)

Identifiers

HAL Id: tel-04029405, version 1

Cite

Maureen Muscat. Machine learning and co-evolution methods for protein-protein interactions. Cell Behavior [q-bio.CB]. Sorbonne Université, 2022. English. ⟨NNT : 2022SORUS507⟩. ⟨tel-04029405⟩

Collections

CNRS STAR LCQB LCQB-SGBP IBPS SORBONNE-UNIVERSITE THESES-SU SU-SCIENCES THESES-UNC

293 views

29 downloads
