Algorithmes de bandits pour la collecte d'informations en temps réel dans les réseaux sociaux

Abstract : In this thesis, we study the problem of real-time data capture on social media. Because of the limitations imposed by those media, and because of the very large amount of information involved, it is not possible to collect all the data produced by social networks such as Twitter. Therefore, to gather enough relevant information related to a predefined need, it is necessary to focus on a subset of the information sources. In this work, we focus on user-centered data capture and consider each account of a social network as a source that can be listened to at each iteration of a data capture process, in order to collect the content it produces. This process, whose aim is to maximize the quality of the information gathered, is constrained at each time step by the number of users that can be monitored simultaneously. The problem of selecting a subset of accounts to listen to over time is a sequential decision problem under constraints, which we formalize as a bandit problem with multiple selections. We therefore propose several bandit models to identify the most relevant users in real time. First, we study the case of the so-called stochastic bandit, in which each user corresponds to a stationary distribution. Then, we introduce two contextual bandit models, one stationary and the other non-stationary, in which the utility of each user can be estimated more efficiently by assuming some underlying structure in the reward space. In particular, the first approach introduces the notion of profile, which corresponds to the average behavior of each user. The second approach, on the other hand, takes into account the activity of a user at a given instant in order to predict their future behavior. Finally, we are interested in models that are able to take into account complex temporal dependencies between users, with the use of a latent space within which the information transits from one iteration to the next.
Moreover, each of the proposed approaches is validated on both artificial and real datasets.
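As an illustration of the "bandit problem with multiple selections" described in the abstract, here is a minimal sketch of the stationary stochastic setting. It is not the thesis's algorithm: it swaps in a generic UCB1-style index with top-k selection. The names `select_top_k` and `run_capture`, the Bernoulli reward model, and the exploration constant are all assumptions made for the example.

```python
import math
import random


def select_top_k(counts, sums, t, k):
    """Return the k account indices with the highest UCB1-style scores.

    Assumption: a generic UCB1 index, not the index used in the thesis.
    """
    scores = []
    for n, s in zip(counts, sums):
        if n == 0:
            # Accounts never listened to get priority so each is observed once.
            scores.append(float("inf"))
        else:
            # Empirical mean reward plus an exploration bonus.
            scores.append(s / n + math.sqrt(2.0 * math.log(t) / n))
    ranked = sorted(range(len(counts)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def run_capture(true_means, k, horizon, seed=0):
    """Simulate a capture process listening to k accounts per time step.

    Assumption: each account yields a Bernoulli reward with a fixed
    (stationary) mean, standing in for the quality of collected content.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms   # number of times each account was listened to
    sums = [0.0] * n_arms   # cumulative reward collected per account
    for t in range(1, horizon + 1):
        for i in select_top_k(counts, sums, t, k):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            sums[i] += reward
    return counts


if __name__ == "__main__":
    # Hypothetical setup: five accounts, two of which are clearly more relevant.
    print(run_capture([0.9, 0.8, 0.1, 0.1, 0.05], k=2, horizon=2000))
```

In this sketch the constraint on simultaneous monitoring is the parameter `k`: at each step exactly `k` accounts are listened to, and only their rewards are observed.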
Keywords : bandit

https://hal.sorbonne-universite.fr/tel-02320864
Contributor : Thibault Gisselbrecht
Submitted on : Saturday, October 19, 2019 - 5:15:02 PM
Last modification on : Wednesday, October 23, 2019 - 1:46:02 AM

File

ManuscritTheseGisselbrechtThib...
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02320864, version 1

Citation

Thibault Gisselbrecht. Algorithmes de bandits pour la collecte d'informations en temps réel dans les réseaux sociaux. Informatique [cs]. Sorbonne Université / Université Pierre et Marie Curie - Paris VI, 2017. Français. ⟨tel-02320864⟩
