Task 1.1. Predictive metabolomic biomarkers of sensitivity to Western diets (PIs: P Even (PNCA), D Marniche (PNCA))
Aims. We hypothesize that biomarkers of the metabolic phenotype in young subjects signal the risk that a Western diet will induce metabolic disease later in life (obesity, impaired glucose tolerance, type 2 diabetes). A rodent model of sensitivity/resistance to Western diet-induced metabolic dysfunction and overweight has shown that such predictive biomarkers of sensitivity can be identified (Chaumont et al, 2015; Azzout-Marniche et al, 2014, 2016; Fedry et al, 2016). This model will be used to further identify predictive metabolomic biomarkers of sensitivity to these diets (12-month goal) and to test the predictive power of these biomarkers for different diet formulations (e.g. high fat, high sucrose, or high fructose diets) in rodents and in humans (36-month goal).
Methods and data. Young rodents will receive a standard diet for two consecutive months, and then a Western diet that induces overweight and impaired glucose tolerance in some animals (sensitive type) but not in others (resistant type). Animals will be phenotyped at the end of each period for body weight, body composition (longitudinal follow-up by MRI), glucose tolerance, insulin sensitivity, substrate oxidation, and energy expenditure. Urine and blood samples will be collected before and at the end of each period to analyze the metabolome and relate it to sensitivity or resistance to the Western diet. In addition, in a pilot study, the candidate metabolome profiles will be compared between T2D subjects and healthy subjects. A mathematical model of the metabolomic data will be developed in order to propose a metabolomic score of sensitivity.
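As an illustration only, one possible form of such a score is a regularized logistic model fitted on baseline metabolite concentrations, with the sensitive/resistant phenotype observed after the Western diet as the label. The sketch below is not the model that the task will develop: the function names, the choice of logistic regression, the hyperparameters, and the toy values are all assumptions made for the example.

```python
import math


def fit_sensitivity_score(profiles, labels, lr=0.1, epochs=2000, l2=0.01):
    """Fit an L2-regularized logistic model by batch gradient descent.

    profiles: list of baseline metabolite vectors (hypothetical features).
    labels:   1 = sensitive, 0 = resistant (phenotype after the Western diet).
    Returns (weights, bias, means, stds) so new profiles can be scored.
    """
    n, d = len(profiles), len(profiles[0])
    # Standardize each metabolite so that weights are comparable.
    means = [sum(row[j] for row in profiles) / n for j in range(d)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in profiles) / n) or 1.0
        for j in range(d)
    ]
    X = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in profiles]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(X, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted minus observed
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * (grad_w[j] / n + l2 * wi) for j, wi in enumerate(w)]
        b -= lr * grad_b / n
    return w, b, means, stds


def sensitivity_score(profile, model):
    """Return a score in (0, 1); higher means more likely sensitive."""
    w, b, means, stds = model
    z = sum(wi * (x - m) / s for wi, x, m, s in zip(w, profile, means, stds)) + b
    return 1.0 / (1.0 + math.exp(-z))


# Toy data (invented): two metabolites, six animals.
toy_profiles = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1],
                [3.0, 5.0], [3.2, 4.9], [2.9, 5.2]]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_model = fit_sensitivity_score(toy_profiles, toy_labels)
```

With real data, the number of metabolites would far exceed the number of animals, so the regularization strength and the validation scheme would matter much more than in this toy case.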
Expected results. We expect to identify blood and urine metabolomic profiles associated with the sensitivity of subjects to metabolic dysfunction induced by high fat / Western diets, and to formulate more specific recommendations according to the sensitivity of individuals and population groups to these diets.
Risks. The main risk is the time required for analysing the metabolomics data.
Collaborations. D Rutledge (GENIAL), Marie-Christine Boutron-Ruault (CESP).
Task 1.2. Predictive intestinal microbiota profile for sensitivity to high fat diets (PIs: AM Davila (PNCA), P Even (PNCA))
Aims. Subjects are not uniformly sensitive to the induction of metabolic diseases and T2D by Western diets, and this sensitivity has been associated with specific microbiota profiles (Yassour et al, 2016). The predictive capacity of these profiles remains unclear. We hypothesize that the microbiota profile in young subjects signals the risk that a Western diet will induce metabolic diseases later in life (obesity, impaired glucose tolerance, or type 2 diabetes). The task will evaluate whether specific microbiota profiles and microbial metabolite profiles can predict sensitivity to a Western diet in a rodent model (12-month goal) and will then determine which specific bacterial strains are the most predictive of this sensitivity (36-month goal). Additionally, depending on the animal results, a pilot study will compare the selected microbiota profiles, microbial metabolite profiles, and bacterial strains in T2D and control human subjects (36-month goal).
Methods and data. The experimental design is the same as in Task 1.1. Faecal and blood samples will be collected at the end of each period for analysis of microbiota and microbial metabolites. These microbiota and microbial metabolite profiles are hypothesized to be related to each animal’s sensitivity or resistance to a high fat diet. In addition, microbiota and microbial metabolite profiles and candidate bacterial strains will be analyzed in control and T2D human subjects in a pilot study. Mathematical models of the relationship between microbiota and microbial metabolite profiles and the risk of T2D will be developed in order to propose a risk score that could be used to characterize subjects.
Expected results. We expect to identify specific microbiota profiles and bacterial strains that are predictive of sensitivity to Western diets.
Risk in the task. The main difficulty is the time required for analysis of the microbiota results.
Collaboration. Mathematical models for microbiota analysis and score development.
Task 1.3. Early microbial markers of dysbiosis in diabetic patients associated with lifestyle and dietary factors (PIs: Marion Leclerc (MICALIS), Marie-Christine Boutron-Ruault (CESP))
Aims. Individuals with type 2 diabetes display a gut microbiota dysbiosis characterized by a loss of diversity (Qin et al, 2012). Furthermore, a systemic effect of the gut microbiota can be detected in the blood through circulating microbiota resulting from the translocation of bacteria through the epithelium; this has been reported in diabetic individuals (Serino et al., 2012). In addition, the mouth microbiota has been associated with a number of metabolic and inflammatory conditions. The 12-month goal is to describe the buffy coat and saliva microbiota of diabetic subjects as sampled prior to T2D onset (considering subjects both with and without later specific complications of T2D), to compare them with those of non-diabetic subjects, and to derive early microbial markers of T2D-associated dysbiosis from these results. At 36 months, the task should provide results on the relationships between the studied microbiotas and lifestyle, especially dietary characteristics, and on the comparison of these findings with fecal microbiomes in a subsample, in order to establish predictive scores for diabetes onset and for diabetic complications.
Methods and data. The task will be based on a nested case-control study within the E3N prospective cohort. It will analyze the microbiota from saliva (n=500) and buffy coat (n=500) samples collected before disease onset in subjects with an established T2D diagnosis. These microbiotas will be compared to the corresponding microbiotas of subjects who remain non-diabetic at the end of follow-up. Cases and non-cases will be matched on age at blood or saliva collection. Among diabetic women, we will compare those with and without major diabetes complications. In subsets of cases and non-cases, fecal samples will be collected to compare the fecal microbiota with the earlier-collected buffy coat and saliva microbiota, in order to test whether the latter can predict later-onset dysbiosis. We will analyze the relationships between the microbiota characteristics and data from dietary and lifestyle questionnaires. Mathematical models will be developed to study associations between microbial species or species networks and lifestyle parameters, dietary components, and antibiotic intake. To specifically integrate nutritional data from the E3N database, we will collaborate with data scientists involved in Task 1.5. After integration of these heterogeneous data, computation and validation of the model will lead to diabetes dysbiosis indexes.
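The age-matching step of the nested case-control design could, for instance, be implemented as a greedy nearest-neighbour match. The sketch below is only one way to do this and rests on assumptions that the text does not specify: 1:1 matching, a one-year caliper, matching without replacement, and the hypothetical function name and identifiers.

```python
def match_controls_on_age(cases, controls, caliper_years=1.0):
    """Greedy 1:1 matching of cases to non-cases on age at sample collection.

    cases, controls: lists of (subject_id, age_at_collection) tuples.
    caliper_years:   maximum allowed age gap (an assumed value).
    Returns a dict mapping each matched case id to one control id; each
    control is used at most once, and cases without an eligible control
    are left unmatched.
    """
    available = sorted(controls, key=lambda c: c[1])
    matches = {}
    for case_id, case_age in sorted(cases, key=lambda c: c[1]):
        best = None  # (index in available, age gap)
        for i, (_, ctrl_age) in enumerate(available):
            gap = abs(ctrl_age - case_age)
            if gap <= caliper_years and (best is None or gap < best[1]):
                best = (i, gap)
        if best is not None:
            matches[case_id] = available.pop(best[0])[0]
    return matches


# Invented identifiers and ages, for illustration only.
example_cases = [("case1", 52.0), ("case2", 60.5), ("case3", 80.0)]
example_controls = [("ctrl1", 51.6), ("ctrl2", 60.0), ("ctrl3", 70.0)]
example_matches = match_controls_on_age(example_cases, example_controls)
```

Greedy matching is simple but not optimal; an optimal assignment or additional matching criteria could replace it if too many cases remain unmatched.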
Expected results. Composition of the microbiota from saliva and buffy coat in diabetic women prior to disease onset; microbial markers associated with diabetes complications; effect of antibiotic intake on the circulating microbiota in diabetics; microbiota- and nutrition-based indexes for diabetes.
Risk in the task. We will investigate the saliva and buffy coat microbiotas. The main risks are the low quantities of microbial DNA in these samples and, possibly, the absence of microbial DNA in a number of them.
Collaborations. The team includes epidemiologists, MDs specialized in nutrition and digestive diseases, microbiologists, and bioinformaticians.
Task 1.4. Exploring the pathways from nutrition, body mass index, obesity, and diabetes to health at older ages: A life-course approach (PIs: Alexis Elbaz, Archana Singh-Manoux (INSERM U1018 CESP))
Aims. The influence of dietary factors on a range of old-age outcomes is thought to play out over the life course, and their impact on health may be greater at older ages. However, assessment of dietary factors in large population-based studies is not straightforward. Other options include using biomarkers and body composition phenotypes, such as body mass index (BMI), waist circumference, and waist-to-hip ratio. Biomarkers may be used to characterize nutritional status (e.g., albumin) or cardiovascular (lipids) and metabolic (glucose, insulin, HbA1c) health. The overall aim is to examine the association of nutritional markers with ageing outcomes, in relation to body composition measures and to metabolic, cardiovascular, and T2D markers.
Methods and data. Data will primarily be drawn from the Whitehall II study, an ongoing cohort study of 10,308 men and women, aged 35-55 years (https://www.ucl.ac.uk/whitehallII), recruited to the study in 1985 (Singh-Manoux et al., 2012). The study design consists of a questionnaire and a uniform, structured clinical examination every 4 or 5 years. A particular strength of the study is linkage to national registers for cancer, mortality, and in- and out-patient hospitalization records for all participants, including those who have dropped out of the study. This study provides an ideal setting to examine pathways from nutritional factors and their markers to health at older ages. Data on nutritional behaviors, BMI and obesity, diabetes, and chronic conditions have been collected repeatedly over more than 30 years. In the past years, the cohort has increasingly focused on ageing outcomes, with objective cognitive and physical assessments starting in 1997-1999 (Elbaz et al., 2012; Artaud et al., 2016); measures of disability are also available. The study is well known in the field of diabetes research, with major contributions over the past years (Tabak et al., 2009). A large amount of very detailed data is therefore available to study each step of the pathways linking high BMI/obesity and diabetes to disability. The long period of data collection allows us to address issues such as selection and reverse causation biases, and our team has extensive experience in longitudinal modeling using a variety of statistical approaches.
Expected results. At the 12-month horizon, we expect to have examined the association of nutritional behaviors with intermediate phenotypes (body composition, metabolic markers, and cardiovascular biomarkers); at the 36-month horizon, we expect to have examined the entire causal chain outlined, from nutritional behaviors through metabolic and cardiovascular markers and body composition to ageing outcomes (T2D, disability…).
Risks. Risks are limited: data on risk factors, covariates, intermediate phenotypes, and outcomes have all already been collected.
Collaborations. All partners and University College London.
Task 1.5. Machine Learning: data augmentation and robust inference (PIs: Michele Sebag, CNRS-INRIA; Isabelle Guyon, LR-INRIAI, Univ. Paris-Sud; Flora Jay, CNRS; J.N. Patillon, CEA)
This task provides methodological and algorithmic resources for the model-building aspects of Tasks 1.1 to 1.4, covering the Machine Learning methodology, the data representation, and the learning scores required to select robust models.
Aims. The goal of the task is to address issues related to learning from complex and expensive data. In general, the more complex the data, the more data is required to build efficient predictive models; in the Nutriperso project, the existing data resources are rich but limited in size relative to their complexity (e.g. longitudinal data), and the acquisition of new data during the project is also limited. The first aim (T0+12 months) is to develop a compressed representation of the data, amenable to robust learning. The second aim (T0+36 months) is to design and deploy data augmentation approaches, aimed at generating complementary data that comply with the statistical representation and the prior knowledge of the experts, and that enable data-based regularization. While the methodology relies on general principles (Vapnik and Izmailov, 2015; Journal of Machine Learning Research), its application to the different types of data considered in T1.1 to T1.4 requires specific studies and developments.
Methods and data. Feature selection (FS) is a primary methodological step in biology-related studies (Guyon et al., 2002; Jong et al., 2004). In Task 1.2, the two datasets (animal and human) will be exploited in a domain adaptation approach (Ganin and Lempitsky, 2015) to enforce a general representation of the data. Finally, prior knowledge will be used to elicit noise models; such noise models will support data augmentation by generating additional perturbed data, enabling the search for compact representations and robust learning (Ciresan et al., 2012).
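To make the noise-based augmentation step concrete, a minimal sketch is given below. It assumes the simplest possible noise model, independent Gaussian noise on each feature with a standard deviation supplied by the experts; the noise models actually elicited in the task may be quite different. The function name, parameters, and numerical values are hypothetical.

```python
import random


def augment_with_noise(samples, noise_sd, copies=5, seed=0):
    """Generate perturbed copies of each labelled sample.

    samples:  list of (feature_vector, label) pairs.
    noise_sd: per-feature standard deviation of the measurement noise
              (assumed here to come from expert prior knowledge).
    copies:   number of perturbed copies generated per original sample.
    Returns the originals followed by their perturbed copies; labels are
    left unchanged, which assumes the noise does not alter the phenotype.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    augmented = []
    for features, label in samples:
        augmented.append((list(features), label))
        for _ in range(copies):
            perturbed = [x + rng.gauss(0.0, sd) for x, sd in zip(features, noise_sd)]
            augmented.append((perturbed, label))
    return augmented


# Invented values: two samples, two features, 3 copies each.
example_samples = [([1.0, 5.0], 0), ([3.0, 4.9], 1)]
example_augmented = augment_with_noise(example_samples, noise_sd=[0.1, 0.2], copies=3)
```

Training on the augmented set acts as a form of regularization: the model is encouraged to give the same prediction for inputs that differ only by plausible measurement noise.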
Expected results. Efficient representations of the structure-rich data; completion of available resources with additional plausible data; rigorous assessment of the learned models; principled methodology for transferring models learned from different data distributions.
Risks. The main expected difficulty is related to the multi-disciplinary nature of the task. This risk is offset by the high potential gain for the whole Nutriperso research strategy.
Collaborations. This task relies on close collaboration between data scientists from CNRS, INRIA and CEA, epidemiologists (INSERM CESP), and microbiologists (INRA Micalis).