meta-tdt

Meta-analysis of Family-based and Case-control Genetic Association Studies That Use the Same Cases

Pantelis G. Bagos, Niki L. Dimou, Theodore D. Liakopoulos, Georgios K. Nikolopoulos

In genetic epidemiology, investigators often perform a family-based study and a population-based case-control study within the same population, using the same, or a largely overlapping, set of cases, in an effort to control for different sources of confounding while increasing power. Various methods have been proposed for performing a combined analysis, but they all require access to individual data, which are difficult to gather in a meta-analysis. Here, we propose a simple and efficient summary-based method for performing the meta-analysis. The key point, in contrast to the earlier methods that need individual data, is the calculation of the covariance between the study estimates (log-odds ratios) using only data derived from the literature in the form of a 2x2 contingency table. The studies can then easily be combined, either in a two-step procedure using traditional methods for univariate meta-analysis or in a single-step approach using hierarchical models. In either case, the meta-analysis can be performed using standard software; the increased sample size increases its statistical power, and the procedure allows several diagnostics to be performed (publication bias, cumulative meta-analysis, sensitivity analysis). The method is evaluated on a dataset of 356 single nucleotide polymorphisms (SNPs) that were evaluated for their potential association with respiratory syncytial virus (RSV) bronchiolitis, and is subsequently applied in a meta-analysis of the association of the 10-repeat allele of a VNTR polymorphism in the 3’-UTR of the dopamine transporter gene with attention deficit hyperactivity disorder (ADHD), as well as in a genome-wide association study for multiple sclerosis. Implementation of the method is straightforward, and a Stata program implementing the methods presented here is given in the Appendix.

Bagos PG, Dimou NL, Liakopoulos TD, Nikolopoulos GK. Meta-Analysis of Family-Based and Case-Control Genetic Association Studies that Use the Same Cases. Statistical Applications in Genetics and Molecular Biology. 2011
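
The summary-based combination described in the abstract can be followed by hand for a single study before running the full program. The sketch below uses the counts reported for Qian 2004 in the program's dataset and mirrors the program's formulas for the covariance, the weight lambda, and the combined log-odds ratio. The `ex_` scalar names are illustrative and not part of the original program, and reading n11 and n10 as the allele counts of the shared cases is an assumption inferred from the covariance formula.

```stata
* Worked example for one study reporting both TDT and case-control data.
* Counts are those listed for Qian 2004; the ex_ scalar names are hypothetical.
scalar ex_b   = 43      // TDT: transmissions of the allele
scalar ex_c   = 49      // TDT: non-transmissions of the allele
scalar ex_n11 = 578     // case-control 2x2 table (n11, n10 assumed to be cases)
scalar ex_n10 = 86
scalar ex_n01 = 392
scalar ex_n00 = 40

* log-odds ratios and their variances, as in the program
scalar ex_logortdt = ln(ex_b/ex_c)
scalar ex_vartdt   = 1/ex_b + 1/ex_c
scalar ex_logorcc  = ln((ex_n11*ex_n00)/(ex_n10*ex_n01))
scalar ex_varcc    = 1/ex_n11 + 1/ex_n10 + 1/ex_n01 + 1/ex_n00

* covariance induced by the shared cases, as in the program
scalar ex_cov = 1/ex_n11 + 1/ex_n10

* minimum-variance weight, combined log-OR and its variance
scalar ex_lambda = (ex_vartdt - ex_cov)/(ex_vartdt + ex_varcc - 2*ex_cov)
scalar ex_logor  = ex_lambda*ex_logorcc + (1 - ex_lambda)*ex_logortdt
scalar ex_var    = ex_lambda^2*ex_varcc + (1 - ex_lambda)^2*ex_vartdt ///
                   + 2*ex_lambda*(1 - ex_lambda)*ex_cov

display "lambda      = " ex_lambda
display "combined OR = " exp(ex_logor)
display "SE(log-OR)  = " sqrt(ex_var)
```

Run as a do-file, this should give a weight of roughly 0.52 on the case-control estimate and a combined odds ratio of about 0.77 for this study. The `///` line continuation only works in a do-file, not at the interactive prompt.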

** the data are from the meta-analysis for the association of
** the 10-Repeat Allele of a VNTR Polymorphism in the 
** 3’-UTR of Dopamine Transporter Gene with Attention Deficit 
** Hyperactivity Disorder (Yang et al, 2007) 
input id str20 author year b c n11 n10 n01 n00 w x y z
1 Waldman 1998 90 47 . . . . . . . .
2 Swanson 2000 10 16 . . . . 60 20 66 14
3 Lunetta 2000 17 10 . . . . . . . .
4 Holmes 2000 40 45 . . . . . . . .
5 Todd 2001 55 67 . . . . . . . .
6 Curran 2001 39 20 . . . . . . . .
7 Curran 2001 39 48 . . . . . . . .
8 Kirley 2002 49 30 . . . . . . . .
9 CEDAR 2002 9 9 . . . . . . . .
10 Chen 2003 16 5 . . . . . . . .
11 Qian 2004 43 49 578 86 392 40 . . . .
12 Kustanovich 2004 119 130 . . . . . . . .
13 Wang 2004 13 7 . . . . . . . .
14 Kim 2005 17 16 . . . . . . . .
15 Feng 2005 76 76 . . . . . . . .
16 Bobb 2005 20 12 88 238 65 193 . . . .
17 Brookes 2006 65 32 . . . . . . . .
18 Brookes 2006 28 9 . . . . . . . .
19 Cheuk 2006 . . 116 12 119 9 . . . .
20 Langley 2005 . . 387 139 424 150 . . . .
21 Roman 2001 . . 98 34 166 58 105 30 106 29
22 Simseka 2005 . . 59 33 67 43 . . . .
23 Hawi 2003 . . . . . . 145 42 121 66
24 Wang 2004 . . . . . . 100 8 94 14
25 Cook 1995 . . . . . . 72 12 57 27
26 Jiang 1999 . . . . . . 136 12 136 12
end


** calculation of the population-based allelic OR and its variance - Eq. (1) and (2)
gen logorcc=log( (n11* n00)/( n10*n01))
gen varcc=1/ n11+1/ n10+1/ n01+1/ n00
gen secc=sqrt(varcc)

** calculation of the TDT-based OR and its variance - Eq. (6) and (7)
gen logortdt=log(b/c)
gen vartdt=1/b+1/c
gen setdt=sqrt(vartdt)

** calculation of the HHRR and its variance - Eq. (8) and (9)
gen logorhhrr=log((w*z)/(x*y))
gen varhhrr=1/x+1/z+1/y+1/w
gen sehhrr=sqrt(varhhrr)
 
** code the different types of studies
gen type=1 if logortdt!=. & logorcc==. & logorhhrr==.
replace type=2 if logorcc!=. & logortdt==. & logorhhrr==.
replace type=3 if logorhhrr!=. & logortdt==. & logorcc==.
replace type=4 if logorcc!=. & logortdt!=. & logorhhrr==.
replace type=5 if logorcc!=. & logorhhrr!=. & logortdt==.
replace type=6 if logorhhrr!=. & logortdt!=. & logorcc==.
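
** (added check, not part of the original program) tabulate the study types
** to confirm that every study was assigned a code; a study reporting all
** three designs would be left with a missing type by the rules above
tabulate type, missing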

*calculation of the covariances - Eq. (21) and (22)
gen cov_tdt_cc=1/n11+1/n10 if type==4
gen cov_hhrr_cc=1/n11+1/n10 if type ==5

** generate the combined logOR
gen logor=  logortdt if type==1
replace logor= logorcc if type==2
replace logor= logorhhrr if type==3

** generate the variance of the combined logOR
gen var=vartdt if type==1
replace var=varcc if type==2
replace var=varhhrr if type==3
 
**calculate lambda from Eq. (16)
gen lambda=(vartdt-cov_tdt_cc)/(vartdt+varcc-2*cov_tdt_cc) if type==4
replace lambda=(varhhrr-cov_hhrr_cc)/(varhhrr+varcc-2*cov_hhrr_cc) if type== 5

** compute the combined logOR - Eq. (14)
replace logor=lambda*logorcc+(1-lambda )*logortdt  if type==4
replace logor=lambda*logorcc+(1-lambda )*logorhhrr  if type==5

** compute the variance of the combined logOR - Eq. (15)
replace var=lambda^2*varcc+(1-lambda)^2*vartdt+2*lambda*(1-lambda)*cov_tdt_cc  if type ==4
replace var=lambda^2*varcc +(1-lambda)^2*varhhrr+2*lambda*(1-lambda)*cov_hhrr_cc if type ==5
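
** (added check, not part of the original program) whenever the denominator
** of Eq. (16) is positive, the combined variance cannot exceed the variance
** of either component estimate, so the assertions below should pass; the
** 1e-6 tolerance is an assumption that allows for float rounding
assert var <= min(varcc, vartdt) + 1e-6 if type==4
assert var <= min(varcc, varhhrr) + 1e-6 if type==5
** list the weights and combined estimates for the studies with shared cases
list author year lambda logor var if inlist(type, 4, 5), noobs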

replace logor=logortdt if type==6
replace var=vartdt if type ==6

gen se=sqrt(var)

label variable logor "combined log-Odds Ratio"
label variable se "S.E. of combined log-Odds Ratio"

** perform the meta-analysis with the method of DerSimonian and Laird
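label define type 1 "TDT" 2 "CC" 3 "HHRR" 4 "CC+TDT" 5 "CC+HHRR" 6 "TDT+HHRR"
** attach the value label so that by(type) shows the names of the study types
label values type type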
metan logor  se,randomi by(type) label(namevar=author,yearvar=year) eform xlab(0.5, 1, 2, 4)

** meta-analysis using the ML or REML
metareg logor,wsse(se) bse(ml)
metareg logor,wsse(se) bse(reml)

**tests for publication bias
metabias logor se

** cumulative meta-analysis
sort year 
metacum  logor se, id(author) effect(r) gr xlab
metatrend  logor se

**display the combined results graphically
replace setdt=0 if setdt ==.
replace secc=0 if secc ==.
replace sehhrr=0 if sehhrr==.
replace logortdt =0 if logortdt  ==.
replace logorhhrr=0 if logorhhrr==.
replace logorcc=0 if logorcc==.

** this is due to a bug in metagraph command
rename y y00


** `=exp' evaluates the expression when the macro is expanded
metan logortdt setdt, randomi eform nograph
metagraph logortdt setdt, id(author) combined(`=exp(r(ES))' `=exp(r(ci_low))' `=exp(r(ci_upp))') x(0.5, 1, 2, 4) eform nodraw name(graph1) title(TDT studies)
metan logorcc secc, randomi eform nograph
metagraph logorcc secc, id(author) combined(`=exp(r(ES))' `=exp(r(ci_low))' `=exp(r(ci_upp))') x(0.5, 1, 2, 4) eform nodraw name(graph2) title(Case-Control studies)
metan logorhhrr sehhrr, randomi eform nograph
metagraph logorhhrr sehhrr, id(author) combined(`=exp(r(ES))' `=exp(r(ci_low))' `=exp(r(ci_upp))') x(0.5, 1, 2, 4) eform nodraw name(graph3) title(HHRR studies)
metan logor se, randomi eform nograph
metagraph logor se, id(author) combined(`=exp(r(ES))' `=exp(r(ci_low))' `=exp(r(ci_upp))') x(0.5, 1, 2, 4) eform nodraw name(graph4) title(Combined analysis)
graph combine graph1 graph2 graph3 graph4 , cols(2) xcommon altshrink

** data rearrangements in order to use a linear mixed model
drop  b c n11 n10 n01 n00 w x y00 z secc setdt sehhrr logor var
rename id study
rename  logortdt logor1
rename  vartdt var1
rename  logorcc logor2
rename varcc var2
rename logorhhrr logor3
rename  varhhrr var3
reshape long logor var, i(study) j(newtype)
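** the missing log-ORs were set to 0 above for the graphs, so the variances
** (still missing) are also used to drop the study types that were not observed
keep if logor~=. & var~=.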
drop type
rename newtype type
gen id=_n

** gllamm needs to have the logarithm of the standard error
gen s=log(sqrt(var))
eq wgt: s
constraint define 1 [lns1]s=1

**fitting the model of Eq. (17) using the sandwich variance estimator
gllamm logor ,i(study) s(wgt) constraint(1) adapt nip(8) cluster(study)

**fitting the model of Eq. (19) using the sandwich variance estimator
gllamm logor ,i(id study) s(wgt) constraint(1) adapt nip(8) cluster(study)



