
Scientific documents annotation with @Web

Lilia Berrahou's PhD defense took place on 29 September 2015 at LIRMM, Montpellier.

The title of the thesis is "N-ary relation arguments extraction from texts guided by a domain OTR".

Today, a huge amount of data is made available to the research community through several web-based libraries. Enhancing the data collected from scientific documents is a major challenge if domain knowledge is to be analyzed and reused efficiently. To be enhanced, data need to be extracted from documents and structured in a common representation using a controlled vocabulary, as in ontologies. Our research deals with knowledge engineering issues of experimental data, extracted from scientific articles, in order to reuse them in decision support systems. Experimental data can be represented by n-ary relations, which link a studied object (e.g. food packaging, transformation process) with its features (e.g. oxygen permeability in packaging, biomass grinding), and capitalized in an Ontological and Terminological Resource (OTR). An OTR associates an ontology with a terminological and/or linguistic part in order to establish a clear distinction between the term and the notion it denotes (the concept).

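
To make the idea of an n-ary relation more concrete, here is a minimal sketch in Python of how one instance might be held in memory. The class names, the relation name, the concept names and all numeric values are hypothetical and chosen only for illustration; they are not taken from the thesis or from any actual OTR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Argument:
    """One quantitative argument of a relation (hypothetical structure)."""
    concept: str   # name of the concept the argument refers to
    value: float   # numeric value reported in the document
    unit: str      # unit of measure as a plain string


@dataclass
class NaryRelationInstance:
    """A studied object linked to several features at once."""
    relation: str
    studied_object: str
    arguments: List[Argument] = field(default_factory=list)

    def arity(self) -> int:
        # The studied object counts as one argument of the relation.
        return 1 + len(self.arguments)

    def argument(self, concept: str) -> Optional[Argument]:
        # Return the first argument attached to the given concept, if any.
        for arg in self.arguments:
            if arg.concept == concept:
                return arg
        return None


# Illustrative values only: not experimental data.
example = NaryRelationInstance(
    relation="oxygen_permeability_relation",
    studied_object="food packaging",
    arguments=[
        Argument("oxygen_permeability", 1.0e-15, "mol m-1 s-1 Pa-1"),
        Argument("temperature", 23.0, "degC"),
        Argument("thickness", 50.0, "um"),
    ],
)

print(example.arity())
print(example.argument("temperature"))
```

The point of the sketch is that a single measurement only makes sense together with its experimental conditions, which is why a binary relation between an object and one value is not enough.
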
Our work focuses on n-ary relation extraction from scientific documents in order to populate a domain OTR with new instances. Our contributions are based on Natural Language Processing (NLP) together with data mining approaches guided by the domain OTR. First, we focus on the extraction of units of measure, which are known to be difficult to identify because of their typographic variations. We propose to rely on automatic classification of texts, using supervised learning methods, to reduce the search space of variants of units, and we then propose a new similarity measure that identifies them, taking into account their syntactic properties. Second, we propose to adapt and combine data mining methods (sequential pattern and rule mining) and syntactic analysis in order to overcome the challenging process of identifying and extracting n-ary relation instances buried in unstructured texts.
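
The abstract does not define the similarity measure proposed in the thesis, so the following Python sketch substitutes a plain normalized Levenshtein (edit) distance to show the general idea of matching a typographic variant of a unit against a list of known units. It ignores the syntactic properties and the text classification step mentioned above. The function names, the reference list and the 0.6 threshold are assumptions made for illustration.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    # Lowercase and drop whitespace, two common sources of variation.
    return "".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(candidate: str, reference_units: List[str],
               threshold: float = 0.6) -> Optional[str]:
    """Return the closest reference unit, or None if nothing is close enough."""
    if not reference_units:
        return None
    cand = normalize(candidate)
    score, unit = max((similarity(cand, normalize(u)), u)
                      for u in reference_units)
    return unit if score >= threshold else None


# Hypothetical reference list standing in for the units stored in an OTR.
REFERENCE_UNITS = ["kg m-3", "cm3 m-2 day-1", "mol m-1 s-1 Pa-1"]

print(best_match("Kg.m-3", REFERENCE_UNITS))
print(best_match("hello world", REFERENCE_UNITS))
```

A character-level measure of this kind handles small typographic differences but can miss variants that reorder or rewrite a unit, such as a slash replacing a negative exponent, which is one reason a measure that takes the structure of units into account is attractive.
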