
Work Packages


WP1: Network meta-analysis for mixed-treatment comparisons

This WP focuses on network meta-analysis for mixed-treatment comparisons (MTC) and serves several purposes. First, we aim to show that, contrary to what is generally believed, complex models for network meta-analysis can be fitted easily in a frequentist framework using available software. Second, we will extend the methods to allow estimation of measures other than the odds ratio, such as the risk ratio or the incidence rate ratio. Such measures are often preferred in clinical trials because their estimates are easier to interpret. To enable network meta-analysis with such measures, we will rely on the general multivariate meta-analysis framework, closely following recent work of our team in which we presented mixed-effects Poisson regression methods for meta-analysis of studies with constant or varying duration, providing estimates of the incidence rate ratio and the incidence rate difference. We will also develop methods that make direct use of the binary nature of the data without assuming normality of the estimates. Finally, we will develop software that can accommodate such analyses in a frequentist framework. This will make the analysis accessible to non-experts, since such models are currently fitted only in a Bayesian framework using WinBUGS, which severely limits their use by non-statisticians (it requires writing code, monitoring the convergence of MCMC and so on).
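As an illustration of a frequentist analysis that works directly with event counts rather than with normally distributed estimates, the following minimal sketch pools an incidence rate ratio from two-arm studies of varying duration. It assumes a Poisson model with study-specific baseline rates and a common log rate ratio; conditioning on the total number of events in each study gives a binomial likelihood, which is maximised here by Newton-Raphson. The function name `pooled_log_irr` and the data are hypothetical, and this is a fixed-effect simplification, not the mixed-effects model or the network extension described above.

```python
import math


def pooled_log_irr(studies, tol=1e-10, max_iter=50):
    """Fixed-effect MLE of a common log incidence rate ratio.

    studies: list of (events_trt, time_trt, events_ctl, time_ctl) tuples,
    where time_* is person-time, so studies may differ in duration.
    At least one study must contain events in both arms.
    Returns (log_irr, standard_error).
    """
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for e1, t1, e0, t0 in studies:
            n = e1 + e0
            # Probability that an event falls in the treated arm,
            # given the person-time split and the current rate ratio.
            p = t1 * math.exp(beta) / (t1 * math.exp(beta) + t0)
            score += e1 - n * p          # first derivative of the log-likelihood
            info += n * p * (1.0 - p)    # observed information
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


# Hypothetical data: both studies have a rate ratio of 0.5 (0.1 vs 0.2 events
# per unit of person-time) but different follow-up times.
studies = [(10, 100.0, 20, 100.0), (5, 50.0, 30, 150.0)]
log_irr, se = pooled_log_irr(studies)
print("pooled IRR:", math.exp(log_irr), "SE(log IRR):", se)
```

Because person-time enters the likelihood directly, studies of different duration need no special treatment, and no normal approximation of study-level estimates is involved.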

WP2: Meta-analysis for detecting gene-gene and gene-environment interactions

This WP deals with the development of methodology for meta-analysis of studies investigating gene-gene (GxG) and gene-environment (GxE) interactions. The standard case-control methodology has been used successfully to study GxE and GxG interactions, but an alternative strategy is the case-only design, which is efficient and, provided that the interacting factors are independent in the source population, valid. With this method we avoid recruiting controls, a time-consuming and costly process, while obtaining greater statistical power. Classic meta-analysis methodologies can be applied to assess gene-environment interaction. Nevertheless, one major problem remains: the available studies must all follow the same design (for instance, all of them case-only studies). The purpose of this WP is to develop methodology for meta-analysis that can incorporate studies with different types of data, performed under different designs. Thus, we will develop methods and easy-to-use software to combine case-only studies with case-control studies, as well as studies in which exposure to environmental factors is not specified in a homogeneous way. Furthermore, by analysing in parallel the results of studies of different designs (e.g. combining case-only and case-control studies), we will also be able to check the assumptions inherently made by these methods.
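To make the idea of combining designs concrete, the sketch below estimates a multiplicative GxE interaction odds ratio from a case-only study and from a case-control study, and pools the two with inverse-variance weights. It assumes binary genotype and exposure, gene-environment independence in the source population (needed for the case-only estimate), and a fixed-effect model. All function names and counts are hypothetical; this is a simple two-stage illustration, not the methodology the WP sets out to develop.

```python
import math


def log_or(a, b, c, d):
    """Log odds ratio and its variance for a 2x2 table with cells
    a = G+E+, b = G+E-, c = G-E+, d = G-E- (all counts must be > 0)."""
    return math.log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def interaction_case_only(cases):
    """Interaction log OR from cases only: the G-E odds ratio among cases."""
    return log_or(*cases)


def interaction_case_control(cases, controls):
    """Interaction log OR as the G-E log OR in cases minus that in controls."""
    est_cases, var_cases = log_or(*cases)
    est_controls, var_controls = log_or(*controls)
    return est_cases - est_controls, var_cases + var_controls


def pool_fixed(estimates):
    """Inverse-variance fixed-effect pooling of (estimate, variance) pairs."""
    weights = [1.0 / var for _, var in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)


# Hypothetical counts, ordered (G+E+, G+E-, G-E+, G-E-).
study_a = interaction_case_only((40, 60, 50, 150))
study_b = interaction_case_control((40, 60, 50, 150), (30, 70, 60, 140))
pooled, pooled_var = pool_fixed([study_a, study_b])
print("pooled interaction OR:", math.exp(pooled))
```

The case-only estimate has the smaller variance because it involves four cell counts instead of eight, which is the source of the power gain mentioned above. Comparing the two kinds of estimate is also one way of checking the independence assumption.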

WP3: Genetic network meta-analysis

Mendelian randomization has been proposed as a method to test for, or at least estimate, the causal effect of an intermediate phenotype on a disease. Meta-analysis can be applied in a Mendelian randomization setting, and multivariate methods have been proposed. In this WP, we will attempt to extend the Mendelian randomization methodology to incorporate measures of association of the genotype with more than one phenotype and with the disease. This will be accomplished by utilizing ideas and methods developed for network meta-analysis. By jointly modelling various genotype-phenotype relations in a single meta-analysis, we will be able to gain insight into the causal pathways that underlie disease formation and, presumably, to understand the nature of the interactions among several genetic factors (in linkage disequilibrium) that lead to the disease. By integrating evidence from multiple studies, we will be able to increase statistical power and to include as many disparate studies as possible in the analysis. For instance, the multivariate framework will allow the incorporation of studies reporting one, two or more genetic variants in linkage disequilibrium and one or more continuous phenotypes that are supposed to be under the control of the genotypes and are likely to influence the overall risk for a disease. Being an extension of Mendelian randomization, the approach we are going to investigate will have much in common (in a meta-analytical sense) with causal inference analysis involving intermediate phenotypes, as well as with approaches for constructing endophenotypes based on the clustering of genetic pathways.
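For orientation, the basic single-variant, single-phenotype Mendelian randomization estimate can be sketched as a ratio of two (possibly meta-analysed) associations: genotype with disease divided by genotype with phenotype. The sketch below uses this standard ratio estimator with a first-order delta-method standard error, assuming the two associations are estimated independently. The function name and the numbers are hypothetical, and the multi-phenotype extension proposed in this WP is not shown.

```python
import math


def wald_ratio(beta_gd, se_gd, beta_gp, se_gp):
    """Ratio estimate of the causal effect of a phenotype on a disease.

    beta_gd: genotype-disease association (e.g. log odds ratio per allele)
    beta_gp: genotype-phenotype association (phenotype units per allele)
    The standard error uses the delta method and ignores any covariance
    between the two estimates (an assumption of this sketch).
    """
    ratio = beta_gd / beta_gp
    variance = se_gd ** 2 / beta_gp ** 2 + (beta_gd ** 2 * se_gp ** 2) / beta_gp ** 4
    return ratio, math.sqrt(variance)


# Hypothetical pooled associations.
effect, se = wald_ratio(beta_gd=0.10, se_gd=0.02, beta_gp=0.50, se_gp=0.05)
print("causal log OR per unit of phenotype:", effect, "SE:", se)
```

When both associations come from the same studies they are correlated, which is one reason why a joint multivariate model is preferable to this two-step calculation.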

WP4: High-throughput methods - Microarrays and GWAS

This WP will deal exclusively with high-throughput agnostic methods, such as gene-expression analysis using microarrays and genome-wide association studies (GWAS), which have become enormously popular in recent years and are in part responsible for the above-mentioned convergence of genetic epidemiology with bioinformatics. Due to their agnostic nature, both methods suffer from a lack of repeatability and replication validity, and thus meta-analysis has proven very useful in summarizing the results of both GWAS and microarray gene-expression studies. Although meta-analyses of GWAS have provided significant results for the common disorders under investigation in our proposal, meta-analyses of similar outcomes in microarray studies are limited, even though enough data are available, for instance for stroke or myocardial infarction. Thus, an obvious goal of this WP is to perform well-conducted meta-analyses of the available microarray data on these outcomes. A subsequent goal is to develop methods for constructing molecular signatures for detecting such outcomes. The signature approach allows only the most informative markers to be considered in predictive modelling. Such approaches have been proposed, but in most cases they consist of simply selecting the top-scoring genes, which on several occasions include genes with highly correlated expression. To this end, we are going to develop a methodology based on principal components analysis, a method that has been shown useful in the past for identifying a small subset of SNPs. For the same purpose we will also study algorithms based on artificial neural networks that address specific constraints, such as minimizing the number of synaptic weights (i.e. "pruning" of the network), as well as statistical methods (LASSO) that also lead to the selection of the most "informative" variables.
Finally, combining the results of GWAS and gene-expression studies should also be pursued. To this end, the construction of endophenotypes will be an important goal, taking into consideration both genetic variation and gene expression (extending the approach of WP3).
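The variable selection performed by LASSO, mentioned in this WP, can be illustrated with a minimal coordinate-descent sketch. It assumes a continuous outcome, centred predictors with non-zero variance, no intercept, and the objective (1/2n)·RSS + λ·Σ|β_j|. The function names and the toy data are hypothetical; a real signature analysis would use an established implementation and choose λ by cross-validation.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam; exactly zero when |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """LASSO coefficients for rows X (list of lists) and outcome y.

    Cycles through the predictors, updating one coefficient at a time
    while the others are held fixed.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, scale = 0.0, 0.0
            for i in range(n):
                # Prediction from all predictors except j.
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                scale += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (scale / n)
    return beta


# Toy data: the outcome depends on the first "gene" only.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", beta, "selected predictors:", selected)
```

The uninformative predictor receives a coefficient of exactly zero and is dropped, while the informative one is kept with a shrunken coefficient. This is how LASSO yields a small signature instead of a ranked list.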

WP5: Synthesis analysis and construction of multivariate risk scores

The objective of this WP is to derive mathematical models for assessing the overall risk of an individual, extending in several respects the work done in WP4. Traditionally, such models are derived by analysing a large cohort, and thus the available methods require access to individual data. A major obstacle when synthesizing data from multiple sources is that we usually need to synthesize evidence across studies that comprise data from diverse populations. New analytic methods are therefore clearly needed. In this respect we are going to rely on, and extend, the recently proposed method of "synthesis analysis". Whereas traditional meta-analysis seeks a single summary measure for the relation between two variables (Y and X, i.e. disease and exposure) across k studies, synthesis analysis seeks to integrate estimates for two or more predictors (X1, X2) in a multivariate model for predicting Y, using only information from the pairwise comparisons. The synthesis analysis methodology was originally proposed for continuous variables, and an extension to the case in which Y is binary is clearly needed, since we are going to synthesize mainly odds ratios derived from different studies. An important problem also arises because in many cases multiple genetic markers are correlated (i.e. they are in linkage disequilibrium), so the covariance matrix needs to be calculated in order to derive a general model. We therefore need to develop mathematical methods for calculating the covariances between correlated estimates so that they can be integrated afterwards.
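The core idea of building a multivariate model from pairwise information can be sketched for the simplest case: a continuous outcome, two predictors and ordinary linear regression. Given the two univariate slopes, the predictor standard deviations and the correlation between the predictors, the multivariate coefficients follow from solving the normal equations. The function name and the numbers are hypothetical, and this shows only the underlying algebra. It does not reproduce the published synthesis analysis method or the binary-outcome extension that this WP aims to develop.

```python
def synthesize_two_predictors(b1, b2, s1, s2, r):
    """Multivariate slopes of Y on (X1, X2) from pairwise summaries.

    b1, b2: univariate regression slopes of Y on X1 and of Y on X2
    s1, s2: standard deviations of X1 and X2
    r:      correlation between X1 and X2 (e.g. from linkage disequilibrium)
    """
    if abs(r) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    denom = 1.0 - r * r
    beta1 = (b1 - r * (s2 / s1) * b2) / denom
    beta2 = (b2 - r * (s1 / s2) * b1) / denom
    return beta1, beta2


# Hypothetical summaries: a model with slopes 1 and 2 on unit-variance
# predictors correlated at 0.5 implies univariate slopes 2.0 and 2.5.
print(synthesize_two_predictors(b1=2.0, b2=2.5, s1=1.0, s2=1.0, r=0.5))
```

The example shows why the correlation between markers matters: ignoring it (r = 0) would return the univariate slopes unchanged and overstate the contribution of each predictor.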

WP6: Validation - computational, functional and structural studies

In this WP, structural, computational and functional studies will be performed in order to determine the molecular basis of a given gene-disease association. We will focus on the complex diseases stated previously (diabetes, hypertension, cardiovascular disease and so on), but the particular genetic variants to be studied depend largely on the results of the previous WPs. Our research will focus on the proteins and the polymorphisms identified in those WPs. For the non-synonymous polymorphisms identified, structural studies will be carried out mostly through comparative modelling and docking analyses of the mutant proteins. For regulatory and various non-synonymous polymorphisms, several of which have been implicated in diseases with a previously unknown mechanism, recent studies have shown that these can act through microRNAs. In the case of a strong and newly found association, computational methods for microRNA target prediction will be used, and an attempt to identify the microRNA and its target will be made using standard molecular biology techniques. Functional studies will be performed in cases of other regulatory polymorphisms. Usually, such SNPs are located in promoter regions and in many cases are thought to influence the expression of the adjacent gene. Various methods are available in molecular biology for studying the functional role of these polymorphisms, and we are ready to undertake such efforts, since the Department of Computer Science and Biomedical Informatics currently operates a modern, fully equipped molecular biology laboratory capable of performing such analyses.

WP7: Dissemination of the results

This last WP will include activities spanning all of the above-mentioned WPs. We are guided by our belief that complex computational methods should be available to the scientific community. In this spirit, all methods and software developed during the project will be made directly available to the public through our website and public repositories. We will create a dedicated web page listing the results of the WPs, the developed software, instructions and supplementary material such as videos, online lectures and tutorials that can be used over the internet, as well as discussion pages and forums. The recently formed Hellenic Society for Computational Biology and Bioinformatics, of which Dr Pantelis Bagos is a member of the board of directors, has decided to encourage regional committees to be actively involved in the organization of tutorials, practical courses and other activities under the auspices of the society. Accordingly, we will organise tutorials and practical courses, either as stand-alone events or as satellite events to the annual conference of the society. Finally, a meeting will be organised at the end of the project, in which the results will be discussed and presented to a wider audience.