PhD position in AI, machine learning and Knowledge Graphs in Montpellier, France [Starting April - June 2022]
Keywords: Knowledge graphs, Machine learning, Representation learning, Semantic Web, Linked Data, NLP
Title: AI-driven Bottom-Up Data Linking: from Knowledge Graph Profiling to Meaningful and Interpretable Links
Supervisors
Context
The PhD position is part of the ANR project “DACE-DL: DAta-CEntric AI-driven Data Linking”, funded by the French National Research Agency (ANR). The project is a partnership between LIRMM (Montpellier), INRAE (Montpellier) and IRIT (Toulouse). The successful candidate will join the Web-Cube group at LIRMM for a period of 3 years and will collaborate with a team of two postdoctoral and six senior researchers from the three above-mentioned institutes.
Overview and challenges
Linked data [1,13] and knowledge graphs (KG) [2,3,4,5] have been gaining popularity over the years, thanks to the means they offer for information access, (meta)data reuse, federation, increased visibility and sharing on the Web. Linked data weave the Web of structured knowledge and are a relevant technical answer to the challenges of open and FAIR data [6], carrying the promise of interoperability between the resources and communities that adopt these standards. Data linking is defined as the scientific challenge of automatically establishing typed links between entities coming from two or more different structured datasets or KGs [1,13].[2] It is as crucial for the Web of today as HTTP links were for the Web of the 90s. A variety of data linking systems have been proposed over the years [13], and a number of benchmarks have been shared publicly to enable the evaluation of these systems, driven largely by the Ontology Alignment Evaluation Initiative (OAEI) [7]. While this has allowed for the generation of vast amounts of linked data, as demonstrated by the well-known LOD project (https://lod-cloud.net) or schema.org-related initiatives [14], designing data-generic solutions benchmarked over competition-oriented datasets has also led to undesired effects, such as benchmark overfitting [8,9]. This limits the applicability of these solutions in real-world scenarios where data, in addition to being highly heterogeneous, incomplete and dynamic, are often very strongly domain-specific [1].
DACE-DL proposes a paradigm shift in the way the data linking problem is approached. Instead of devising incremental generic solutions, the project will develop data-centric bottom-up approaches leveraging artificial intelligence (AI), specifically machine learning (ML) and representation learning (RL) models.
Instead of trying to fit a generic solution to any linking problem and dataset, we propose to enable a better understanding of the underlying data before applying a targeted solution best suited to the datasets at hand. DACE-DL will deliver hybrid AI-based data linking approaches and tools that can learn from the large number of existing links and systems, as well as from the semantic structure of the linked datasets, reducing the end-user effort in this process.
Research agenda
DACE-DL’s paradigm is based on automatically identifying, via machine learning techniques, the data linking problem types (LPTs) that two knowledge graphs manifest, and on applying the modular linking solutions that best fit the identified problem types. The PhD project will focus on:
The PhD project will produce a large number of high-quality publications in top-ranked scientific journals and conferences in the fields of AI, machine learning, web data science and semantic web.
Expected profile
We are looking for a motivated junior researcher with experience in machine learning, knowledge graphs, semantic web and linked data. The candidate should match most of the following criteria:
Application
Applications for this position will be accepted EXCLUSIVELY as a single PDF document, with your name in its title, made accessible for download via a link sent by email to Konstantin Todorov (todorov@lirmm.fr). Please avoid attachments; if you would like to send additional documents, include links to them instead.
Required documents are:
Contract
The successful candidate will be employed by the University of Montpellier for a period of three years (approx. 1700€/month). Social security and benefits are included. It will be possible (but not mandatory) to complement the salary with teaching activities.
LIRMM - Laboratory of Computer Science, Robotics and Microelectronics of Montpellier; UM - University of Montpellier