Large-scale online social networks (OSNs) such as Twitter or Facebook provide a powerful means of accessing information. They rely on "social filtering", whereby pieces of information are collectively evaluated and sorted by users. This gives rise to information cascades, in which an item reaches a large population after spreading virally from user to user, much like an epidemic.
Nevertheless, OSNs expose their users to a large amount of content of no interest to them, a sign of poor "precision" in the terminology of information retrieval. At the same time, many relevant content items never reach the users most interested in them. In other words, OSNs also suffer from poor "recall".
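To make the two notions concrete, here is a minimal sketch of how precision and recall could be computed for a single content item. The function name and the user identifiers are hypothetical, and the sketch assumes the set of users reached by the item and the set of users interested in it are both known.

```python
def precision_recall(reached, interested):
    """Precision and recall of one item's dissemination.

    reached    -- set of users the item was delivered to (assumed known)
    interested -- set of users who find the item relevant (assumed known)
    """
    hits = len(reached & interested)
    # Precision: fraction of deliveries that went to interested users.
    precision = hits / len(reached) if reached else 0.0
    # Recall: fraction of interested users who actually received the item.
    recall = hits / len(interested) if interested else 0.0
    return precision, recall


# Hypothetical example: 4 users reached, 6 users interested, 2 in common.
reached = {"u1", "u2", "u3", "u4"}
interested = {"u3", "u4", "u5", "u6", "u7", "u8"}
p, r = precision_recall(reached, interested)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case half of the deliveries are wasted (low precision) and two thirds of the interested users are missed (low recall), which is the situation described above.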
The goal of this postdoctoral research will be to study the fundamental limitations, in terms of precision and recall, of information access through OSNs. A companion objective will be to identify distributed mechanisms for improving such information access, ideally approaching the optimal precision/recall trade-off curve.
The researcher will develop models of information propagation in OSNs, featuring representations of information types, user interests and user expertise. This work can build on existing results in machine learning and theoretical computer science (e.g. latent variable models, or latent semantic spaces for describing content) and on the evaluation of available data traces.
These models will be used to characterize precision/recall trade-offs. To this end, several approaches can be envisioned. For instance, the objective can be cast as an active learning problem, in which one aims to infer the type of a new content item as quickly as possible, relying on implicit feedback from users who are chosen adaptively to that end.
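The active learning formulation can be illustrated with a small sketch. It is only one possible instantiation, under strong assumptions that are not part of the project description: feedback is binary (like or no like), the table `like_prob` giving each user's probability of liking each content type is hypothetical and assumed known, and users are selected greedily to minimize the expected entropy of the posterior over types.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def posterior_update(prior, like_prob_u, liked):
    """Bayes update of the belief over types after one user's feedback."""
    weights = [
        prior[t] * (like_prob_u[t] if liked else 1 - like_prob_u[t])
        for t in range(len(prior))
    ]
    z = sum(weights)
    return [w / z for w in weights]


def expected_entropy(prior, like_prob_u):
    """Expected posterior entropy if this user is shown the item."""
    p_like = sum(prior[t] * like_prob_u[t] for t in range(len(prior)))
    h = 0.0
    for liked, p_obs in ((True, p_like), (False, 1 - p_like)):
        if p_obs > 0:
            h += p_obs * entropy(posterior_update(prior, like_prob_u, liked))
    return h


def infer_type(like_prob, true_type, rng, threshold=0.95):
    """Query users one at a time until one type is sufficiently likely."""
    n_types = len(next(iter(like_prob.values())))
    belief = [1 / n_types] * n_types  # uniform prior over content types
    remaining = set(like_prob)
    asked = []
    while remaining and max(belief) < threshold:
        # Greedy choice: the user whose feedback is expected to be most informative.
        user = min(sorted(remaining),
                   key=lambda u: expected_entropy(belief, like_prob[u]))
        remaining.remove(user)
        # Simulated implicit feedback, drawn from the (hidden) true type.
        liked = rng.random() < like_prob[user][true_type]
        belief = posterior_update(belief, like_prob[user], liked)
        asked.append(user)
    return belief, asked


# Hypothetical users: like_prob[user][t] = P(user likes an item of type t).
like_prob = {
    "expert": [0.9, 0.1],
    "casual": [0.6, 0.4],
    "indifferent": [0.5, 0.5],
}
belief, asked = infer_type(like_prob, true_type=0, rng=random.Random(0))
print(asked, [round(b, 3) for b in belief])
```

The sketch shows why user expertise matters: a user whose reactions differ sharply across types is queried first, whereas a user who reacts identically to every type carries no information and is queried last, if at all.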
Guided by this study, efficient mechanisms will be sought for assisting information dissemination in OSNs. One key issue is scalability with respect to the population size of the OSN. Inference of users' tastes and expertise will be considered under such constraints.
Proficiency in the following areas would be desirable: probabilistic modeling and optimization theory; distributed algorithms; applied statistics and machine learning; distributed systems architecture and experimentation.