Solving some ontology alignment issues in agronomy and biodiversity by building and leveraging ontology-based background resources based on property graphs
Employer: University of Montpellier
School: Doctorate School I2S, PhD in Informatics
Duration: 36 months
Where: LIRMM, Montpellier, France
Collaboration: ANR project D2KAB (www.d2kab.org), AgroPortal project (http://agroportal.lirmm.fr)
Semantic web, AI, ontology alignment, background knowledge, property graphs, NoSQL.
Semantic web technologies (OWL, RDF, SPARQL, triplestore, Linked data), Property Graphs (neo4j)
This PhD is part of D2KAB, an ANR project starting in 2019, whose primary objective is to create a framework to turn agronomy and biodiversity data into knowledge that is semantically described, interoperable, actionable, and open, along with investigating scientific methods and tools to exploit this knowledge for applications in science and agriculture. The project will provide the means (ontologies and linked open data) for agronomy/agriculture and biodiversity to embrace the semantic Web to produce and exploit FAIR data. To do so, we will develop new methods and algorithms in the following areas: data integration, text mining, semantic annotation, ontology alignment, and linked data exploitation.
During this PhD project, our goal is to push the state of the art in ontology alignment research using background knowledge (BK) approaches, experimenting in agronomy and biodiversity. We will use AgroPortal's mapping repository (also produced during D2KAB) as a background knowledge resource to improve state-of-the-art ontology alignment algorithms.
In the latest Ontology Alignment Evaluation Initiative campaigns (OAEI, http://oaei.ontologymatching.org), machine-learning-based, BK-based (or content-based) approaches now obtain the best results, but they are only applicable when relevant and clean knowledge sources are available. A PhD was recently completed at LIRMM (A. Annane, supervised by Z. Bellahsène and C. Jonquet) in which we investigated the use of biomedical ontology mappings to build efficient BK (cf. Annane et al., 2018). However, the theoretical results obtained during OAEI benchmarking are hardly transferable to the reality of heterogeneous ontologies and user needs. In addition, the absence of BK resources in domains other than biomedicine (e.g., the Anatomy and LargeBio tracks) prevents the reproducibility of the results in other domains.
Our approach is to adopt a graph-based mapping repository (exploiting NoSQL property graphs) to facilitate the exploitation of concept-to-concept mapping paths to identify and select new ontology alignments. We hypothesize that property-graph-based engines such as graph databases, being particularly well suited to traversal-related queries, will help us push state-of-the-art performance and address issues such as scalability and the use of in-memory architectures. Our preliminary work (on biomedical ontologies) obtained very good results at OAEI 2017. The scientific challenge is now to extend this work and demonstrate its portability to the real world (outside of the OAEI benchmark) and to new domains (agronomy and biodiversity). Eventually, we will offer our community-curated ontology mapping repository (also co-developed within D2KAB) as a resource for future OAEI campaigns and evaluate our results within this context.
Given two ontologies to align, the background knowledge resource will consist of other related ontologies merged into a single graph in which mapping paths will be identified.
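To give candidates a concrete feel for the idea of mapping paths, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not the project's implementation: the concept identifiers, the "ONTOLOGY:concept" naming convention, the helper names, and the hop limit are all hypothetical, and an in-memory dictionary stands in for a property graph database such as neo4j.

```python
from collections import defaultdict, deque


def build_mapping_graph(mappings):
    """Merge concept-to-concept mappings into one undirected adjacency map."""
    graph = defaultdict(set)
    for left, right in mappings:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def ontology_of(concept):
    """Return the ontology prefix of an 'ONTOLOGY:concept' identifier (hypothetical convention)."""
    return concept.split(":", 1)[0]


def find_mapping_paths(graph, source_concept, target_ontology, max_hops=3):
    """Breadth-first search for cycle-free mapping paths from source_concept
    to any concept of target_ontology, using at most max_hops mappings."""
    paths = []
    queue = deque([[source_concept]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if len(path) > 1 and ontology_of(last) == target_ontology:
            paths.append(path)  # reached the target ontology: keep the path
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget exhausted
        for neighbor in sorted(graph.get(last, ())):
            if neighbor not in path:  # avoid revisiting a concept
                queue.append(path + [neighbor])
    return paths


# Hypothetical mappings: SRC and TGT are the ontologies to align,
# BK1 and BK2 play the role of background knowledge ontologies.
mappings = [
    ("SRC:wheat", "BK1:triticum"),
    ("BK1:triticum", "TGT:common_wheat"),
    ("SRC:wheat", "BK2:wheat_plant"),
    ("BK2:wheat_plant", "BK1:triticum"),
    ("SRC:maize", "BK2:zea_mays"),
]
graph = build_mapping_graph(mappings)
paths = find_mapping_paths(graph, "SRC:wheat", "TGT")

# Each path suggests a candidate alignment between its two end concepts.
candidates = {(p[0], p[-1]) for p in paths}
for p in paths:
    print(" -> ".join(p))
```

In this toy example both paths link "SRC:wheat" to "TGT:common_wheat", so a single candidate alignment is derived; selecting among candidates (for instance by path length or number of supporting paths) is deliberately left out of the sketch.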
We are looking for a motivated young researcher with experience in semantic web technologies and graph databases. The candidate should match most of the following criteria:
- High motivation for scientific research
- Knowledge of semantic web technologies, especially JSON/RDF/SPARQL
- Knowledge of NoSQL graph databases (e.g., neo4j)
- Excellent technical and development skills to conduct experiments with real-world and benchmark data
- Excellent spoken and written English
- Autonomy and initiative: ability to make technical decisions within the project and justify them
- Excellent writing skills, as reports, documentation, and technical notes will always be necessary
- Basic knowledge of French with the objective to learn the language during the contract
Required documents (combine everything into a single PDF file):
- a curriculum vitae describing your education and experience;
- a motivation letter describing your interest in the position and the matches with the expected profile;
- a link to your master's thesis or relevant related publications;
- copies of your transcripts of records (master, bachelor);
- names and contact details of referees.
The successful candidate will hold a scholarship from the French Ministry of Higher Education, Research and Innovation (1600€ net per month) for a three-year period. Social security and benefits are included. The scholarship may be complemented with teaching activities.
Jérôme Euzenat and Pavel Shvaiko. Ontology Matching (second edition). Springer, 2013.
Jonquet, C., Toulet, A., Arnaud, E., Aubin, S., Yeumo, E. D., Emonet, V., ... & Larmande, P. (2018). AgroPortal: A vocabulary and ontology repository for agronomy. Computers and Electronics in Agriculture, 144, 126-143.
Angela Locoro, Jérôme David, and Jérôme Euzenat. Context-based matching: design of a flexible framework and experiment. Journal on Data Semantics, 3(1): 25-46, 2014.
Annane, A., Bellahsene, Z., Azouaou, F., & Jonquet, C. (2018). Building an effective and efficient background knowledge resource to enhance ontology matching. Journal of Web Semantics. In press.
Robinson, I., Webber, J., & Eifrem, E. (2013). Graph Databases. O'Reilly Media, Inc.
A. Castelltort and T. Martin. Handling scalable approximate queries over NoSQL graph databases: Cypherf and the Fuzzy4S framework. Fuzzy Sets and Systems, 348: 21-49, 2018.
A. Castelltort and A. Laurent. Exploiting NoSQL Graph Databases and in Memory Architectures for Extracting Graph Structural Data Summaries. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 25(1): 81-110, 2017.
Job Types: Full-time, Contract
Salary: 22,000.00€ to 25,000.00€ /year