<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>gm123</ui><ji>1756-994X</ji><fm>
<dochead>Method</dochead>
<bibl>
<title>
<p>Family-based genetic risk prediction of multifactorial disease</p>
</title>
<aug>
<au id="A1"><snm>Ruderfer</snm><mi>M</mi><fnm>Douglas</fnm><insr iid="I1"/><insr iid="I2"/><insr iid="I3"/><email>druderfer@chgr.mgh.harvard.edu</email></au>
<au id="A2"><snm>Korn</snm><fnm>Joshua</fnm><insr iid="I3"/><email>jkorn@broadinstitute.org</email></au>
<au ca="yes" id="A3"><snm>Purcell</snm><mi>M</mi><fnm>Shaun</fnm><insr iid="I1"/><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><email>shaun@pngu.mgh.harvard.edu</email></au>
</aug>
<insg>
<ins id="I1"><p>Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Mass General Hospital, Boston, MA, USA</p></ins>
<ins id="I2"><p>The Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, Cambridge, MA, USA</p></ins>
<ins id="I3"><p>Broad Institute of Harvard and MIT, Cambridge, MA, USA</p></ins>
<ins id="I4"><p>Department of Psychiatry, Harvard Medical School, Boston, MA, USA</p></ins>
</insg>
<source>Genome Medicine</source>
<issn>1756-994X</issn>
<pubdate>2010</pubdate>
<volume>2</volume>
<issue>1</issue>
<fpage>2</fpage>
<url>http://genomemedicine.com/content/2/1/2</url>
<xrefbib><pubidlist><pubid idtype="pmpid">20193047</pubid><pubid idtype="doi">10.1186/gm123</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>22</day><month>7</month><year>2009</year></date></rec><revrec><date><day>2</day><month>10</month><year>2009</year></date></revrec><acc><date><day>15</day><month>1</month><year>2010</year></date></acc><pub><date><day>15</day><month>1</month><year>2010</year></date></pub></history>
<cpyrt><year>2010</year><collab>Ruderfer et al.; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<p>Genome-wide association studies have detected dozens of variants underlying complex diseases, although it is uncertain how often these discoveries will translate into clinically useful predictors. Here, to improve genetic risk prediction, we consider including phenotypic and genotypic information from related individuals. We develop and evaluate a family-based liability-threshold prediction model and apply it to a simulation of known Crohn's disease risk variants. We show that genotypes of a relative of known phenotype can be informative for an individual's disease risk, over and above the same locus genotyped in the individual. This approach can lead to better-calibrated estimates of disease risk, although the overall benefit for prediction is typically only very modest.</p>
</sec>
</abs>
</fm><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>Although whole-genome association studies have detected dozens of common variants for a broad range of complex diseases, and are likely to detect many more, the total variance explained by the known variants is typically modest <abbrgrp>
<abbr bid="B1">1</abbr>
<abbr bid="B2">2</abbr>
</abbrgrp>. As such, realising the goals of accurate genetic risk prediction and the subsequent opportunities of personalised medicine remains difficult <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B4">4</abbr>
</abbrgrp>. Indeed, it has often been noted that family history alone will perform substantially better as a predictor of risk, compared to genotype data for known risk variants <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. It is true that a positive family history will likely remain an important factor in prediction for the many complex diseases with substantial heritabilities and shared familial environmental components. (A caveat is that family history information might sometimes not be straightforwardly available, for example, for phenotypes such as response to a particular drug treatment.) However, analogous to clinical genetic testing for Mendelian disease, it is plausible that in many cases a positive family history will itself be a motivating factor for pursuing a genetic test. For example, an individual whose older sibling developed a particular disease might be especially concerned about their own personal risk, which they assume will be higher than average. In this context, in which a genetic test is sought because a first-degree relative has the disease, we developed a family-based model for risk prediction incorporating genotype data from both the index individual and a relative of known phenotype. As such, we do not ask "how well do SNPs predict disease compared to family history", but rather, "how well do SNPs predict disease given a positive family history, and to what extent does including genotype data from the affected relatives help?".</p>
<sec>
<st>
<p>Information from relatives of known phenotype</p>
</st>
<p>For diseases with polygenic and shared environmental components of risk, the genotype of a relative of known phenotype can be informative for an individual's disease risk, over and above the individual's own genotype at that locus. Below, the term genotype refers to both single-locus and multilocus genotypes, unless explicitly stated otherwise. We assume that genotypes at the locus or loci under consideration only account for a proportion of the total familial covariance, meaning that unmeasured residual polygenic and/or shared environmental factors still exist, as would be expected for a complex disease.</p>
<p>Ignoring the relative's phenotype, then as expected, in an unselected population a relative's genotype does not predict the index individual's disease risk given the index's own genotype. That is, if index disease <it>D</it>
<sub>
<it>I </it>
</sub>is modeled as a function of index genotype <it>G</it>
<sub>
<it>I </it>
</sub>and, for example, sibling genotype <it>G</it>
<sub>
<it>S</it>
</sub>
</p>
<p>
<display-formula>
<graphic file="gm123-i1.gif"/>
</display-formula>
</p>
<p>then <it>E</it>(<it>b</it>
<sub>2</sub>) = 0 even if <it>E</it>(<it>b</it>
<sub>1</sub>) &#8800; 0. However, if we know the phenotype of the sibling, <it>D</it>
<sub>
<it>S</it>
</sub>, and include it in the model</p>
<p>
<display-formula>
<graphic file="gm123-i2.gif"/>
</display-formula>
</p>
<p>then if <it>E</it>(<it>b</it>
<sub>1</sub>) &gt; 0, for example, <it>E</it>(<it>b</it>
<sub>2</sub>) will no longer equal zero. In fact, in this case, <it>E</it>(<it>b</it>
<sub>2</sub>) &lt; 0 meaning that the sibling's genotype is informative for the index's disease risk, in the opposite direction compared to <it>b</it>
<sub>1</sub>.</p>
<p>Why is the sibling genotype conditional on index genotype and sibling phenotype informative for index disease risk? For a given risk locus, if the sibling is affected but has a low-risk genotype, this implies that the index is at <it>higher </it>risk than if the affected sibling has a high-risk genotype, conditional on the index's own genotype at that locus. In this scenario, the affected sibling's genotype acts as a surrogate for all other <it>unmeasured </it>risk factors: if the sibling has the low-risk genotype but still is affected, he or she is likely to have a higher rate of other, unobserved risk factors, either genetic or environmental. To the extent that these unobserved risk factors are shared among siblings, the affected sibling's genotype will therefore act as a surrogate for the index's unobserved risks. This is analogous to the epidemiological phenomenon of selection bias, in which an association arises due to shared but unmeasured factors.</p>
<p>In general, a lower genetic load of known risk variants in an affected relative will tend to increase the index's risk of disease, over and above the level of risk predicted by the index's own genotype. For the index, a higher genetic load still leads, as usual, to a higher predicted risk. (Note that if we did not know the index genotype, the affected relative's genotype would act as a surrogate for it. In this case, a higher load of known risk variants in the affected relative would predict a higher, not lower, risk in the index. Unless the affected relative is an MZ twin, prediction would naturally be worse than if we knew the actual index genotype.) In the rest of this report, we apply this observation to the problem of genetic risk prediction, asking whether the inclusion of genotypes from a relative of known phenotype can improve the accuracy of prediction.</p>
</sec>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<sec>
<st>
<p>Prediction model incorporating family information</p>
</st>
<p>Here we introduce a model in which the relative of known phenotype is an affected sibling; the basic approach can be easily extended to other relative types and to multiple relatives. Specifically, we wish to predict disease risk for the index individual, conditional on a) their own multilocus genotype at <it>V </it>known disease variants, b) their sibling's disease state and c) the affected sibling's multilocus genotype.</p>
<p>For two siblings (with subscripts <it>I </it>and <it>S </it>for the index and affected sibling, respectively), we model disease state <it>D </it>given genotypes <it>G </it>at one or more loci. Estimates of population allele frequencies and relative risks for <it>G </it>are assumed to be known in advance. The probability that the index develops disease given both their and their affected sibling's genotype at a single locus is</p>
<p>
<display-formula>
<graphic file="gm123-i3.gif"/>
</display-formula>
</p>
<p>where <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>) and <inline-formula>
<graphic file="gm123-i4.gif"/>
</inline-formula> are directly obtained from the multivariate normal cumulative distribution function, assuming a liability-threshold model for disease risk.</p>
<p>The liability-threshold model assumes an unobserved, normally-distributed liability (<it>Q</it>); individuals with liability values above a threshold are affected. For threshold <it>t</it>, <it>P</it>(<it>Q </it>&#8805; <it>t</it>) = <it>k </it>where <it>k </it>is the specified population prevalence of disease. For two family members, the probability of joint sibling disease state <it>D </it>given genotypes <it>G </it>is</p>
<p>
<display-formula>
<graphic file="gm123-i5.gif"/>
</display-formula>
</p>
<p>and the joint cumulative distribution of <it>Q </it>is given by the multivariate normal distribution function</p>
<p>
<display-formula>
<graphic file="gm123-i6.gif"/>
</display-formula>
</p>
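<p>As a minimal sketch of the liability-threshold quantities just described, the Python fragment below derives the threshold from the prevalence and evaluates a joint upper-tail probability for two siblings. It assumes standardised liabilities and a non-negative correlation, and it replaces a general multivariate normal routine with a one-dimensional integral over a shared factor; the function names and the correlation of 0.55 are illustrative choices, not taken from the article's software.</p>

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def threshold(k):
    """Liability threshold t such that P(Q >= t) = k."""
    return STD.inv_cdf(1.0 - k)


def joint_upper_tail(z1, z2, r, steps=4000, lim=8.0):
    """P(Z1 > z1, Z2 > z2) for standard normals with correlation r (0 to just under 1).

    Writes Z_i = sqrt(r) * U + sqrt(1 - r) * E_i with independent U, E_1, E_2
    and integrates over the shared factor U with the midpoint rule.
    """
    h = 2.0 * lim / steps
    s, c = sqrt(r), sqrt(1.0 - r)
    total = 0.0
    for i in range(steps):
        u = -lim + (i + 0.5) * h
        p1 = 1.0 - STD.cdf((z1 - s * u) / c)
        p2 = 1.0 - STD.cdf((z2 - s * u) / c)
        total += STD.pdf(u) * p1 * p2 * h
    return total


k = 1 / 250                                       # population prevalence
t = threshold(k)
both_independent = joint_upper_tail(t, t, 0.0)    # reduces to k squared
both_correlated = joint_upper_tail(t, t, 0.55)    # illustrative sibling correlation
```

<p>With zero correlation the joint probability reduces to the product of the two marginal prevalences, which gives a quick check on the numerical integration.</p>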
<p>The expected value of <it>Q </it>is a function of the genotypes for each sibling, <it>G</it>
<sub>
<it>I </it>
</sub>and <it>G</it>
<sub>
<it>S</it>
</sub>; the residual variance is partitioned into the components of variance representing polygenes (<inline-formula>
<graphic file="gm123-i7.gif"/>
</inline-formula>), family-wide common environmental factors (<inline-formula>
<graphic file="gm123-i8.gif"/>
</inline-formula>) and individual-specific, or nonshared, factors, including measurement error (<inline-formula>
<graphic file="gm123-i9.gif"/>
</inline-formula>). These variance components must be specified in advance, for example, from twin and family studies. For a given individual, we use the likelihood ratio as a measure of risk of being affected, <it>D</it>
<sub>
<it>I</it>
</sub>, versus unaffected, <inline-formula>
<graphic file="gm123-i10.gif"/>
</inline-formula>
<abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, extended here to incorporate genotypic and phenotypic information on the sibling, <it>G</it>
<sub>
<it>S </it>
</sub>and <it>D</it>
<sub>
<it>S</it>
</sub>,</p>
<p>
<display-formula>
<graphic file="gm123-i11.gif"/>
</display-formula>
</p>
<p>where</p>
<p>
<display-formula>
<graphic file="gm123-i12.gif"/>
</display-formula>
</p>
<p>and <it>P</it>(<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) = <it>P</it>(<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>|<it>D</it>
<sub>
<it>S</it>
</sub>)<it>P</it>(<it>D</it>
<sub>
<it>S</it>
</sub>). The population joint sibship genotype frequencies <it>P</it>(<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>) are calculated assuming random mating and Hardy-Weinberg equilibrium in the population, summing over all possible parental mating and transmission types. Conditioning on proband disease state, then</p>
<p>
<display-formula>
<graphic file="gm123-i13.gif"/>
</display-formula>
</p>
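<p>The population joint sibship genotype frequencies can be sketched as follows, assuming a biallelic locus, random mating and Hardy-Weinberg equilibrium as stated above. Genotypes are coded as the number of risk alleles (0, 1 or 2); the function names are hypothetical.</p>

```python
from itertools import product


def genotype_freqs(p):
    """Hardy-Weinberg frequencies for 0, 1 and 2 copies of the risk allele."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def transmission(g):
    """Probability that a parent with g risk alleles transmits 0 or 1 risk allele."""
    return [1.0 - g / 2.0, g / 2.0]


def child_dist(g_father, g_mother):
    """Distribution of a child's genotype (0, 1, 2) given both parental genotypes."""
    tf, tm = transmission(g_father), transmission(g_mother)
    dist = [0.0, 0.0, 0.0]
    for a, b in product((0, 1), repeat=2):
        dist[a + b] += tf[a] * tm[b]
    return dist


def sib_pair_freqs(p):
    """3 x 3 table P(G_I, G_S), summed over all parental mating types."""
    hw = genotype_freqs(p)
    table = [[0.0] * 3 for _ in range(3)]
    for g_father, g_mother in product(range(3), repeat=2):
        mating = hw[g_father] * hw[g_mother]      # random mating
        child = child_dist(g_father, g_mother)
        for g_i, g_s in product(range(3), repeat=2):
            # siblings are independent draws given the parental genotypes
            table[g_i][g_s] += mating * child[g_i] * child[g_s]
    return table


table = sib_pair_freqs(0.425)
```

<p>Each row of the table sums to the corresponding Hardy-Weinberg genotype frequency, so the marginal distribution for either sibling is unchanged by conditioning on being part of a sib pair.</p>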
<p>These likelihoods can be combined across multiple independent loci, as log(<it>L</it>
<sub>
<it>M</it>
</sub>) = &#8721;<sub>
<it>v</it>
</sub>log(<it>L</it>
<sub>
<it>v</it>
</sub>) where <it>L</it>
<sub>
<it>v </it>
</sub>is the likelihood ratio for variant <it>v</it>. Then, following Yang et al. <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, the risk of disease for the index is given by</p>
<p>
<display-formula>
<graphic file="gm123-i14.gif"/>
</display-formula>
</p>
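<p>The combination step can be illustrated with a short Python sketch. It assumes the usual conversion from prior odds to posterior probability; the exact form of the article's equation is not visible in this text, so the <it>baseline</it> argument and the per-locus likelihood ratios below are hypothetical placeholders rather than values from the study.</p>

```python
from math import exp, log


def combined_log_lr(locus_lrs):
    """Sum of per-locus log likelihood ratios (assumes independent loci)."""
    return sum(log(lr) for lr in locus_lrs)


def risk_from_lr(log_lr, baseline):
    """Posterior risk from a combined log likelihood ratio and a baseline risk."""
    prior_odds = baseline / (1.0 - baseline)
    post_odds = prior_odds * exp(log_lr)
    return post_odds / (1.0 + post_odds)


lrs = [1.2, 0.9, 1.5]                 # made-up per-locus likelihood ratios
risk = risk_from_lr(combined_log_lr(lrs), baseline=1 / 250)
```

<p>A combined likelihood ratio of one leaves the baseline risk unchanged, which is a convenient sanity check for any implementation.</p>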
</sec>
<sec>
<st>
<p>Simulation study of Crohn's disease variants</p>
</st>
<p>We simulated data to approximate the set of 30 risk variants reported in Barrett et al. as follows. We set the disease prevalence to <it>k </it>= 1/250. (In practice, determination of affection status was based on a fixed threshold on the normal liability scale, and so the implied prevalence will vary slightly around 1/250 when non-null genetic effects are specified. This effect is very small, however, and does not affect the comparison of methods or the conclusions.) The risk allele frequency (RAF) and genotypic relative risk (GRR) for each variant are reported in Table <tblr tid="T1">1</tblr>. Given <it>k</it>, RAF and GRR for each variant, we estimated the implied additive genetic value <it>a </it>by numerical optimization.</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Crohn's disease model specification</p></caption><tblbdy cols="4">
      <r>
         <c ca="center">
            <p>
               <b>RAF</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>GRR</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>a</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>VE</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="4">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.018</p>
         </c>
         <c ca="center">
            <p>3.99</p>
         </c>
         <c ca="center">
            <p>0.504</p>
         </c>
         <c ca="center">
            <p>.0090</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.533</p>
         </c>
         <c ca="center">
            <p>1.28</p>
         </c>
         <c ca="center">
            <p>0.098</p>
         </c>
         <c ca="center">
            <p>.0048</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.425</p>
         </c>
         <c ca="center">
            <p>1.25</p>
         </c>
         <c ca="center">
            <p>0.083</p>
         </c>
         <c ca="center">
            <p>.0034</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.899</p>
         </c>
         <c ca="center">
            <p>1.31</p>
         </c>
         <c ca="center">
            <p>0.135</p>
         </c>
         <c ca="center">
            <p>.0033</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.387</p>
         </c>
         <c ca="center">
            <p>1.25</p>
         </c>
         <c ca="center">
            <p>0.083</p>
         </c>
         <c ca="center">
            <p>.0032</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.152</p>
         </c>
         <c ca="center">
            <p>1.35</p>
         </c>
         <c ca="center">
            <p>0.106</p>
         </c>
         <c ca="center">
            <p>.0029</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.677</p>
         </c>
         <c ca="center">
            <p>1.22</p>
         </c>
         <c ca="center">
            <p>0.080</p>
         </c>
         <c ca="center">
            <p>.0028</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.463</p>
         </c>
         <c ca="center">
            <p>1.21</p>
         </c>
         <c ca="center">
            <p>0.071</p>
         </c>
         <c ca="center">
            <p>.0025</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.478</p>
         </c>
         <c ca="center">
            <p>1.20</p>
         </c>
         <c ca="center">
            <p>0.067</p>
         </c>
         <c ca="center">
            <p>.0023</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.678</p>
         </c>
         <c ca="center">
            <p>1.20</p>
         </c>
         <c ca="center">
            <p>0.072</p>
         </c>
         <c ca="center">
            <p>.0022</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.780</p>
         </c>
         <c ca="center">
            <p>1.21</p>
         </c>
         <c ca="center">
            <p>0.079</p>
         </c>
         <c ca="center">
            <p>.0022</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.221</p>
         </c>
         <c ca="center">
            <p>1.25</p>
         </c>
         <c ca="center">
            <p>0.079</p>
         </c>
         <c ca="center">
            <p>.0022</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.933</p>
         </c>
         <c ca="center">
            <p>2.50</p>
         </c>
         <c ca="center">
            <p>0.130</p>
         </c>
         <c ca="center">
            <p>.0021</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.125</p>
         </c>
         <c ca="center">
            <p>1.32</p>
         </c>
         <c ca="center">
            <p>0.097</p>
         </c>
         <c ca="center">
            <p>.0021</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.565</p>
         </c>
         <c ca="center">
            <p>1.18</p>
         </c>
         <c ca="center">
            <p>0.062</p>
         </c>
         <c ca="center">
            <p>.0019</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.565</p>
         </c>
         <c ca="center">
            <p>1.18</p>
         </c>
         <c ca="center">
            <p>0.062</p>
         </c>
         <c ca="center">
            <p>.0019</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.697</p>
         </c>
         <c ca="center">
            <p>1.18</p>
         </c>
         <c ca="center">
            <p>0.064</p>
         </c>
         <c ca="center">
            <p>.0017</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.271</p>
         </c>
         <c ca="center">
            <p>1.20</p>
         </c>
         <c ca="center">
            <p>0.065</p>
         </c>
         <c ca="center">
            <p>.0016</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.090</p>
         </c>
         <c ca="center">
            <p>1.33</p>
         </c>
         <c ca="center">
            <p>0.099</p>
         </c>
         <c ca="center">
            <p>.0016</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.243</p>
         </c>
         <c ca="center">
            <p>1.19</p>
         </c>
         <c ca="center">
            <p>0.061</p>
         </c>
         <c ca="center">
            <p>.0014</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.386</p>
         </c>
         <c ca="center">
            <p>1.16</p>
         </c>
         <c ca="center">
            <p>0.053</p>
         </c>
         <c ca="center">
            <p>.0013</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.289</p>
         </c>
         <c ca="center">
            <p>1.17</p>
         </c>
         <c ca="center">
            <p>0.055</p>
         </c>
         <c ca="center">
            <p>.0013</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.345</p>
         </c>
         <c ca="center">
            <p>1.16</p>
         </c>
         <c ca="center">
            <p>0.053</p>
         </c>
         <c ca="center">
            <p>.0013</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.682</p>
         </c>
         <c ca="center">
            <p>1.14</p>
         </c>
         <c ca="center">
            <p>0.049</p>
         </c>
         <c ca="center">
            <p>.0010</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.389</p>
         </c>
         <c ca="center">
            <p>1.13</p>
         </c>
         <c ca="center">
            <p>0.043</p>
         </c>
         <c ca="center">
            <p>.0009</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.473</p>
         </c>
         <c ca="center">
            <p>1.12</p>
         </c>
         <c ca="center">
            <p>0.040</p>
         </c>
         <c ca="center">
            <p>.0008</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.348</p>
         </c>
         <c ca="center">
            <p>1.12</p>
         </c>
         <c ca="center">
            <p>0.040</p>
         </c>
         <c ca="center">
            <p>.0007</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.017</p>
         </c>
         <c ca="center">
            <p>1.54</p>
         </c>
         <c ca="center">
            <p>0.149</p>
         </c>
         <c ca="center">
            <p>.0007</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.708</p>
         </c>
         <c ca="center">
            <p>1.11</p>
         </c>
         <c ca="center">
            <p>0.038</p>
         </c>
         <c ca="center">
            <p>.0006</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>0.619</p>
         </c>
         <c ca="center">
            <p>1.08</p>
         </c>
         <c ca="center">
            <p>0.027</p>
         </c>
         <c ca="center">
            <p>.0004</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Values used to generate simulated CD samples. RAF = risk allele frequency; GRR = genotypic relative risk, estimated from the reported odds ratios; <it>a </it>= additive genetic value; VE = variance explained.</p>
   </tblfn></tbl>
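<p>The VE column of Table 1 appears consistent with the standard additive expression VE = 2 x RAF x (1 - RAF) x <it>a</it> squared. That expression is our assumption about how the column was derived; the sketch below simply checks it against the first four rows of the table.</p>

```python
# (RAF, a, VE as printed) for the first four rows of Table 1
rows = [
    (0.018, 0.504, 0.0090),
    (0.533, 0.098, 0.0048),
    (0.425, 0.083, 0.0034),
    (0.899, 0.135, 0.0033),
]


def variance_explained(raf, a):
    """Additive variance on the liability scale for one biallelic locus."""
    return 2.0 * raf * (1.0 - raf) * a * a


for raf, a, printed in rows:
    print(raf, a, round(variance_explained(raf, a), 4), printed)
```

<p>Small discrepancies in the last digit are to be expected, because <it>a</it> is itself reported to only three decimal places.</p>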
<p>In all cases, we set the residual variance components <inline-formula>
<graphic file="gm123-i7.gif"/>
</inline-formula> = 0.7, <inline-formula>
<graphic file="gm123-i8.gif"/>
</inline-formula> = 0.2 and <inline-formula>
<graphic file="gm123-i9.gif"/>
</inline-formula> = 0.1, which implies a risk to individuals with at least one affected sibling of 0.11, and therefore, a sibling relative risk of 28.6 <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>. Note that the performance of the family model depends on the residual sibling correlation <inline-formula>
<graphic file="gm123-i15.gif"/>
</inline-formula> and not just the individual values of <inline-formula>
<graphic file="gm123-i7.gif"/>
</inline-formula> and <inline-formula>
<graphic file="gm123-i8.gif"/>
</inline-formula> (i.e. all pairs of values that yield the same implied sibling correlation will show identical performance).</p>
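<p>The implied sibling recurrence risk can be approximated with a short numerical sketch. It assumes that the residual sibling correlation is half the polygenic component plus the shared environmental component, and it treats liability as standard bivariate normal, ignoring the measured loci; the function names are hypothetical.</p>

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def sibling_correlation(var_a, var_c):
    """Residual sibling correlation: half the polygenic variance plus shared environment."""
    return 0.5 * var_a + var_c


def recurrence_risk(k, r, steps=4000, upper=10.0):
    """P(index affected | sibling affected) under a bivariate normal liability."""
    t = STD.inv_cdf(1.0 - k)
    h = (upper - t) / steps
    both = 0.0
    for i in range(steps):
        q = t + (i + 0.5) * h                 # sibling liability above the threshold
        cond = 1.0 - STD.cdf((t - r * q) / sqrt(1.0 - r * r))
        both += STD.pdf(q) * cond * h         # midpoint rule over the sibling's tail
    return both / k


k = 1 / 250
r = sibling_correlation(0.7, 0.2)
risk = recurrence_risk(k, r)
print(round(r, 2), round(risk, 3), round(risk / k, 1))
```

<p>Under these assumptions the recurrence risk should come out close to the value of 0.11 quoted above. Any other pair of components with the same implied correlation, for example 0.5 and 0.3, gives the same result by construction.</p>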
<p>For the unselected population we simulated 500,000 nuclear families, each with two siblings. For the family-history positive population, we simulated 100,000 families. Fewer replicates were required because of the much higher baseline rate for <it>D</it>
<sub>
<it>I </it>
</sub>in this population.</p>
</sec>
</sec>
<sec>
<st>
<p>Results and discussion</p>
</st>
<sec>
<st>
<p>Single locus example</p>
</st>
<p>To illustrate the approach, we analytically calculated the expected risk under a variety of models, based on information from a single locus - rs2188962, one of the Crohn's disease (CD) loci identified in a recent meta-analysis <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>, setting the genotypic relative risk (GRR) to 1.25 and the risk allele frequency (RAF) to 0.425. Prevalence, additive polygenic and shared environmental components of variance were set to approximate known values for CD, as described above. Figure <figr fid="F1">1</figr> shows the predicted disease risks under five models:</p>
<p indent="1">no information, <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>);</p>
<p indent="1">conditional on index genotype, <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>);</p>
<p indent="1">conditional on having an affected sibling status alone, <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>D</it>
<sub>
<it>S</it>
</sub>);</p>
<p indent="1">as above, including index genotype, <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>);</p>
<p indent="1">as above, including sibling genotype, <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>).</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Predicted index disease risk</p></caption><text>
   <p><b>Predicted index disease risk</b>. Predicted index disease risks from a single locus (MAF = 0.425, GRR = 1.25): unconditional, <it>P</it>(<it>D</it><sub><it>I</it></sub>); conditional on index genotype, <it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>); conditional on affected sibling phenotype, <it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>D</it><sub><it>S</it></sub>); conditional on index genotype and affected sibling phenotype, <it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>D</it><sub><it>S</it></sub>); conditional on index and sibling genotypes and affected sibling phenotype, <it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>G</it><sub><it>S</it></sub>, <it>D</it><sub><it>S</it></sub>). The inserted table contains frequencies of sibling pair genotype combinations conditional on at least one sibling being affected. Red represents the homozygous risk-increasing genotype; green the heterozygous genotype; blue the homozygous risk-decreasing genotype.</p>
</text><graphic file="gm123-1"/></fig>
<p>Conditional on index genotype, the affected sibling's genotype further stratifies risk, but with the low-risk genotype predicting increased risk for the index. Values of <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>) only range around <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>), from 0.32% to 0.52% for the low-risk to high-risk homozygotes, whereas <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) shows a much greater range around <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>D</it>
<sub>
<it>S</it>
</sub>), from 8.9% to 14.6%. The predicted risks shown here were reproduced by simulating data under this model and calculating the proportion of index cases for each configuration (data not shown).</p>
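<p>The single-locus calculation can be sketched numerically as follows. The parameterisation is our assumption: genotype effects are mean-centred, total liability variance is fixed at 1, the additive value <it>a</it> = 0.083 is taken from the matching row of Table 1, and the residual sibling correlation is set to 0.55 (half of 0.7 plus 0.2). Function names are hypothetical.</p>

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()
K, RAF, A, R = 1 / 250, 0.425, 0.083, 0.55   # prevalence, allele frequency, effect, sib correlation

T = STD.inv_cdf(1.0 - K)                              # liability threshold
SD = sqrt(1.0 - 2.0 * RAF * (1.0 - RAF) * A * A)      # residual standard deviation


def z(g):
    """Standardised threshold for a carrier of g risk alleles (mean-centred effects)."""
    return (T - A * (g - 2.0 * RAF)) / SD


def p_affected(g):
    """P(D | G) for one individual."""
    return 1.0 - STD.cdf(z(g))


def p_both(g_i, g_s, steps=4000, upper=10.0):
    """P(D_I, D_S | G_I, G_S), integrating over the sibling's residual above its threshold."""
    lo = z(g_s)
    h = (upper - lo) / steps
    total = 0.0
    for n in range(steps):
        q = lo + (n + 0.5) * h
        cond = 1.0 - STD.cdf((z(g_i) - R * q) / sqrt(1.0 - R * R))
        total += STD.pdf(q) * cond * h
    return total


def p_index_given_sib(g_i, g_s):
    """P(D_I | G_I, G_S, D_S) when the sibling is affected."""
    return p_both(g_i, g_s) / p_affected(g_s)


for g_i in range(3):
    print(g_i, [round(p_index_given_sib(g_i, g_s), 3) for g_s in range(3)])
```

<p>Reading along a row of the output, the predicted index risk falls as the affected sibling carries more risk alleles, whereas reading down a column it rises with the index's own risk allele count. Under these assumptions the extremes should lie close to the range quoted above.</p>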
<p>Figure <figr fid="F2">2</figr> illustrates the relative performance of the different models under varying levels of effect size and background residual familial variance. In general, the absolute and relative impact of the affected sibling's genotype increases with both of these factors.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Predicted index disease risk stratified by (a) effect size and (b) total sibling relative risk</p></caption><text>
   <p><b>Predicted index disease risks from a single locus, under a variety of genetic models.</b> Predicted index disease risk stratified by <b>(a)</b> effect size and <b>(b)</b> total sibling relative risk. See Figure 1 legend for details. In all cases, risk allele frequency is 0.425, disease prevalence is 1/250. (a) Varying the familial variance component of the residual variance from 20%, 50% to 80%, with corresponding sibling relative risks of 3.25, 12.25 and 35.5. (b) Varying additive genetic effect from <it>a </it>= 0.01, <it>a </it>= 0.05 to <it>a </it>= 0.1, with corresponding genotypic relative risks of 1.03, 1.16 and 1.30.</p>
</text><graphic file="gm123-2"/></fig>
</sec>
<sec>
<st>
<p>Crohn's disease simulation</p>
</st>
<p>We next performed a simulation as described above that included all 30 CD variants reported in Barrett et al. <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp> that collectively account for 6.4% of the total variance (calculated assuming a liability-threshold model and assuming additivity across loci on the scale of liability). We first simulated a simple unascertained sample of nuclear families, each with two siblings (i.e. <it>D</it>
<sub>
<it>S </it>
</sub>will only be affected at the usual population prevalence). Second, we used rejection sampling to simulate an ascertained sample in which at least one sibling was affected (<it>D</it>
<sub>
<it>S </it>
</sub>is always affected). For each simulated family, we calculated the risk for the index being affected, <it>D</it>
<sub>
<it>I </it>
</sub>using the methods described above.</p>
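<p>A minimal sketch of the ascertained simulation is given below. It is a simplification under stated assumptions: only the first three variants of Table 1 are used, the residual familial components are collapsed into a single shared factor with sibling correlation 0.55, and the sample size is kept small so that the example runs quickly. All function names are hypothetical.</p>

```python
import random
from math import sqrt
from statistics import NormalDist


def draw_child(rng, father, mother):
    """Child risk allele count: one randomly transmitted allele from each parent."""
    return rng.choice(father) + rng.choice(mother)


def simulate_family(rng, loci, t, r):
    """One nuclear family with two siblings: returns (G_I, G_S, D_I, D_S)."""
    mean_i = mean_s = explained = 0.0
    g_i, g_s = [], []
    for raf, a in loci:
        # parental alleles drawn under Hardy-Weinberg equilibrium
        father = [int(raf > rng.random()) for _ in range(2)]
        mother = [int(raf > rng.random()) for _ in range(2)]
        ci, cs = draw_child(rng, father, mother), draw_child(rng, father, mother)
        g_i.append(ci)
        g_s.append(cs)
        mean_i += a * (ci - 2 * raf)          # mean-centred genotype effects
        mean_s += a * (cs - 2 * raf)
        explained += 2 * raf * (1 - raf) * a * a
    sd = sqrt(1.0 - explained)                # residual SD, total variance fixed at 1
    shared = rng.gauss(0.0, 1.0)              # residual factor shared by the siblings
    q_i = mean_i + sd * (sqrt(r) * shared + sqrt(1 - r) * rng.gauss(0.0, 1.0))
    q_s = mean_s + sd * (sqrt(r) * shared + sqrt(1 - r) * rng.gauss(0.0, 1.0))
    return g_i, g_s, q_i >= t, q_s >= t


def ascertained_sample(rng, loci, t, r, n):
    """Rejection sampling: keep simulating until n families have an affected sibling."""
    kept = []
    while n > len(kept):
        family = simulate_family(rng, loci, t, r)
        if family[3]:
            kept.append(family)
    return kept


rng = random.Random(1)
loci = [(0.018, 0.504), (0.533, 0.098), (0.425, 0.083)]   # (RAF, a) from Table 1
t = NormalDist().inv_cdf(1 - 1 / 250)
families = ascertained_sample(rng, loci, t, r=0.55, n=200)
share_affected = sum(f[2] for f in families) / len(families)
```

<p>Because roughly one family in 250 passes the ascertainment filter, most simulated families are discarded, which is the usual cost of rejection sampling for a rare disease.</p>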
<p>We evaluated performance using three metrics: 1) the area under ROC curve (AUC), 2) the squared correlation between true disease state and predicted risk (<it>R</it>
<sup>2</sup>), and 3) the enrichment in the rate of cases versus the population prevalence for individuals in the highest 1, 5, or 10% of estimated risk (<it>T</it>
<sub>1</sub>, <it>T</it>
<sub>5 </sub>and <it>T</it>
<sub>10</sub>). We assessed performance for three models: <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>), <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) and <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>). All results are shown in Table <tblr tid="T2">2</tblr>.</p>
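<p>The three performance metrics can be sketched in plain Python as follows. This is an illustrative implementation rather than the code used for the analyses; in particular, the enrichment function divides by the case rate in the supplied sample, which is our stand-in for the population prevalence. The toy data at the end are made up.</p>

```python
def auc(risks, cases):
    """Area under the ROC curve via the rank-sum identity, with tied risks given average ranks."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    ranks = [0.0] * len(risks)
    i = 0
    while len(order) > i:
        j = i
        while len(order) > j + 1 and risks[order[j + 1]] == risks[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n1 = sum(cases)
    n0 = len(cases) - n1
    rank_sum = sum(rank for rank, c in zip(ranks, cases) if c)
    return (rank_sum - n1 * (n1 + 1) / 2.0) / (n1 * n0)


def r_squared(risks, cases):
    """Squared Pearson correlation between predicted risk and 0/1 disease state."""
    n = len(risks)
    mx, my = sum(risks) / n, sum(cases) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(risks, cases))
    sxx = sum((x - mx) ** 2 for x in risks)
    syy = sum((y - my) ** 2 for y in cases)
    return sxy * sxy / (sxx * syy)


def enrichment(risks, cases, top_fraction):
    """Case rate in the top fraction of predicted risk, relative to the overall case rate."""
    n_top = max(1, int(round(top_fraction * len(risks))))
    ranked = sorted(zip(risks, cases), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(c for _, c in ranked[:n_top]) / n_top
    return top_rate / (sum(cases) / len(cases))


risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]   # toy predicted risks
cases = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]                       # toy disease states
print(auc(risks, cases), r_squared(risks, cases), enrichment(risks, cases, 0.2))
```

<p>Applied to simulated families, <it>T</it><sub>1</sub>, <it>T</it><sub>5 </sub>and <it>T</it><sub>10 </sub>would correspond to top fractions of 0.01, 0.05 and 0.10.</p>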
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Crohn's disease simulation results</p></caption><tblbdy cols="6">
      <r>
         <c ca="left">
            <p>
               <b>Model</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>AUC</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>R</it>
                  <sup>2</sup>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
                  <sub>1</sub>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
                  <sub>5</sub>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>
                  <it>T</it>
                  <sub>10</sub>
               </b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>General population</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left" indent="1">
            <p><it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>)</p>
         </c>
         <c ca="center">
            <p>0.708</p>
         </c>
         <c ca="center">
            <p>0.054</p>
         </c>
         <c ca="center">
            <p>7.39</p>
         </c>
         <c ca="center">
            <p>4.21</p>
         </c>
         <c ca="center">
            <p>3.23</p>
         </c>
      </r>
      <r>
         <c ca="left" indent="1">
            <p><it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>D</it><sub><it>S</it></sub>)</p>
         </c>
         <c ca="center">
            <p>0.726</p>
         </c>
         <c ca="center">
            <p>0.085</p>
         </c>
         <c ca="center">
            <p>15.90</p>
         </c>
         <c ca="center">
            <p>5.71</p>
         </c>
         <c ca="center">
            <p>3.91</p>
         </c>
      </r>
      <r>
         <c ca="left" indent="1">
            <p><it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>G</it><sub><it>S</it></sub>, <it>D</it><sub><it>S</it></sub>)</p>
         </c>
         <c ca="center">
            <p>0.735</p>
         </c>
         <c ca="center">
            <p>0.094</p>
         </c>
         <c ca="center">
            <p>15.88</p>
         </c>
         <c ca="center">
            <p>5.80</p>
         </c>
         <c ca="center">
            <p>3.94</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Selected population (affected sibling)</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left" indent="1">
            <p><it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>D</it><sub><it>S</it></sub>)</p>
         </c>
         <c ca="center">
            <p>0.628</p>
         </c>
         <c ca="center">
            <p>0.042</p>
         </c>
         <c ca="center">
            <p>71.25</p>
         </c>
         <c ca="center">
            <p>60.25</p>
         </c>
         <c ca="center">
            <p>53.75</p>
         </c>
      </r>
      <r>
         <c ca="left" indent="1">
            <p><it>P</it>(<it>D</it><sub><it>I</it></sub>|<it>G</it><sub><it>I</it></sub>, <it>G</it><sub><it>S</it></sub>, <it>D</it><sub><it>S</it></sub>)</p>
         </c>
         <c ca="center">
            <p>0.648</p>
         </c>
         <c ca="center">
            <p>0.056</p>
         </c>
         <c ca="center">
            <p>82.00</p>
         </c>
         <c ca="center">
            <p>67.20</p>
         </c>
         <c ca="center">
            <p>58.48</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>Performance characteristics for tests based on the 30 Crohn's disease variants. Index individuals and their siblings were simulated in the unselected and selected (family history positive/affected sibling) scenarios. The prediction models estimate risk based on the index genotype <it>G</it><sub><it>I</it></sub>, and optionally the sibling's phenotype <it>D</it><sub><it>S </it></sub>and genotype <it>G</it><sub><it>S</it></sub>. The metrics are the area under the ROC curve (AUC), the squared correlation between disease state and risk (<it>R</it><sup>2</sup>) and the relative enrichment of cases in the top 1, 5 and 10% of individuals with the highest risk scores relative to the baseline risk for that population (<it>T</it><sub>1</sub>, <it>T</it><sub>5 </sub>and <it>T</it><sub>10</sub>). See main text for details.</p>
   </tblfn></tbl>
<p>We first describe results for the general population, in which nuclear families were generated without any ascertainment on disease. As expected, compared to the basic model <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>), the inclusion of a sibling phenotype <it>D</it>
<sub>
<it>S </it>
</sub>(which might be affected or unaffected) improved risk prediction for the index, particularly as measured by <it>R</it>
<sup>2 </sup>(0.054 to 0.085). The enrichment of cases in the highest-ranked 1% (<it>T</it>
<sub>1</sub>) more than doubled (7.39 to 15.9). In this population, however, the addition of the sibling's genotypes <it>G</it>
<sub>
<it>S </it>
</sub>added only marginal benefit in terms of AUC and <it>R</it>
<sup>2</sup>, and no benefit for the <it>T </it>metrics.</p>
<p>In the second population, we ascertained for a positive family history (i.e. <it>D</it>
<sub>
<it>S </it>
</sub>is always affected). Of note, compared to the unselected population, the AUC and <it>R</it>
<sup>2 </sup>metrics are considerably lower in this high-risk population, whereas the <it>T </it>metrics are substantially higher (largely reflecting the high sibling relative risk for this disease). The discriminative performance of a test can therefore vary with the characteristics of the population in which it is deployed. This has important implications for the generalizability of studies that claim a certain AUC: the AUC is not an invariant property of the test alone but depends on the context in which the test is used.</p>
<p>In terms of discrimination, the basic <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>) model, as expected, yields near-identical results to <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>), as all siblings are affected in this population; we therefore omit this model here. However, the absolute values of predicted risk based on <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>) will be very poorly calibrated, as this model ignores the presence of a positive family history. For example, for individuals with a predicted risk of 0.1 &#177; 0.01 from the <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) model, we observed a rate of 0.099 cases in the simulated data. However, based on <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>), these same individuals had a mean predicted risk of only 0.0037. In other words, a prediction model that does not condition on the known affected status of the sibling dramatically underestimates absolute risk (here by more than 25-fold).</p>
<p>Finally, we considered whether adding sibling genotypes improved prediction in this family-history positive population. We observed negligible improvement in AUC (1.03-fold, 0.628 to 0.648) but a larger increase for <it>R</it>
<sup>2 </sup>(1.33-fold, 0.042 to 0.056). There were also increases in the already-large <it>T </it>metrics. As expected, the benefit of including sibling genotypes is larger in the ascertained population: for a relatively rare but highly familial disease, affected siblings are more informative than unaffected siblings. In the family-history positive population, adding affected sibling genotypes offers some advantage, although probably not enough to fundamentally change the discriminative utility of a test.</p>
<p>Including affected sibling genotypes can improve the calibration of predicted risks somewhat and lead to a greater stratification of risk, as apparent in Figure <figr fid="F1">1</figr>. We can quantify the risk stratification depicted in Figure <figr fid="F1">1</figr> in terms of a metric <it>&#948;</it>. Comparing two sets of predicted risks, we define <it>&#948; </it>as the expected change in risk, calculated as &#8721;<sub>
<it>i</it>
</sub>|<it>P</it>
<sub>
<it>i</it>
</sub>-<it>Q</it>
<sub>
<it>i</it>
</sub>|/<it>N</it>, where <it>N </it>is the total number of individuals, <it>P</it>
<sub>
<it>i </it>
</sub>is the probability of disease in the individual before the test and <it>Q</it>
<sub>
<it>i </it>
</sub>is the probability afterwards. This is one way of characterizing the personal impact of a test: the expected change in estimated risk pre- versus post-test. In the family-history positive population, <it>&#948; </it>for <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) is 0.035; the incremental <it>&#948; </it>going from the risks estimated based on <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) to <it>P</it>(<it>D</it>
<sub>
<it>I</it>
</sub>|<it>G</it>
<sub>
<it>I</it>
</sub>, <it>G</it>
<sub>
<it>S</it>
</sub>, <it>D</it>
<sub>
<it>S</it>
</sub>) is 0.02. In other words, updating one's risk based on an affected sibling's genotype would be expected to change one's predicted risk 57% (0.02/0.035) as much as the initial test (in the unselected population, this value is 50%).</p>
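<p>The <it>&#948; </it>metric defined above can be computed directly from paired pre-test and post-test risks. The sketch below is illustrative only: the function name and the toy risk values are hypothetical, and the final ratio simply reuses the two <it>&#948; </it>values quoted in the text.</p>

```python
def expected_risk_change(pre_test, post_test):
    """Mean absolute change in predicted risk: sum_i |P_i - Q_i| / N."""
    if len(pre_test) != len(post_test):
        raise ValueError("pre-test and post-test risks must be paired")
    n = len(pre_test)
    return sum(abs(p - q) for p, q in zip(pre_test, post_test)) / n


# Toy example (hypothetical values): four individuals share a pre-test
# risk of 0.10; the test moves three of them and leaves one unchanged.
pre = [0.10, 0.10, 0.10, 0.10]
post = [0.05, 0.20, 0.10, 0.15]
delta = expected_risk_change(pre, post)

# Relative impact of a second test versus the first, using the delta
# values reported in the text for the family-history positive population.
relative_impact = 0.02 / 0.035
```

<p>Because <it>&#948; </it>averages absolute differences, increases and decreases in risk do not cancel, so it measures how far a test moves individual estimates rather than whether mean risk changes.</p>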
</sec>
<sec>
<st>
<p>Including additional and/or unaffected family members</p>
</st>
<p>We also considered models in which additional affected family members are included in the model: for example, individuals in multiplex families with an affected sibling and an affected parent, or two affected siblings. In general, we do see improvement from incorporating the genotypes of these additional affected relatives, although there tends to be a diminishing return (data not shown).</p>
<p>In practice, because most diseases are of relatively low frequency (e.g. under 10%), only affected relatives will contribute appreciable information; relatives known to be disease-free contribute comparatively little. In addition, determining that an individual is disease-free with respect to lifetime risk might be difficult.</p>
</sec>
<sec>
<st>
<p>Limitations</p>
</st>
<p>One caveat is that if the known variants used in the test themselves account for the entire familial covariance, then genotypes from phenotyped relatives will not contribute any additional information. This is unlikely to be the case in the foreseeable future for most diseases, however; it would imply that we have already maximized the potential of genetic risk prediction.</p>
<p>For this work we have assumed a particular model for risk, additivity on the scale of liability, which in practice approximates a multiplicative model on the scale of risk. This implies that the same risk ratio will correspond to a larger absolute risk difference if there is a higher baseline risk: for example, 1% versus 2% and 5% versus 10% both imply a risk ratio of 2, but absolute risk differences of 1 and 5 percentage points, respectively. This effect is evident in Figure <figr fid="F1">1</figr>, in which genotype leads to a greater stratification of absolute risk in individuals with an affected sibling. Whether or not the implied penetrances for individuals with a positive family history actually follow this model is a question that should ultimately be addressed empirically, to establish the adequacy of the risk model. However, this does not alter the qualitative principle outlined here that relatives' genotypes and phenotypes are informative for an individual's disease risk.</p>
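<p>The dependence of absolute risk differences on baseline risk can be illustrated with a simplified liability threshold calculation. This is a sketch under stated assumptions, not the model fitted in this work: liability is taken to be standard normal, the threshold is fixed by the baseline risk, and a genotype is represented by a hypothetical shift in mean liability with the variance left at 1.</p>

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal liability (assumption)


def risk_given_shift(baseline_risk, shift):
    """Risk after shifting mean liability by `shift` standard deviations.

    The threshold is set so that an unshifted individual has
    `baseline_risk`; liability variance is held at 1 (a simplification).
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - baseline_risk)
    return 1.0 - STD_NORMAL.cdf(threshold - shift)


def compare(baseline_risk, shift):
    """Return (risk ratio, absolute risk difference) for one liability shift."""
    shifted = risk_given_shift(baseline_risk, shift)
    return shifted / baseline_risk, shifted - baseline_risk


# The same hypothetical liability shift applied at two baseline risks.
ratio_low, diff_low = compare(0.01, 0.3)
ratio_high, diff_high = compare(0.05, 0.3)
```

<p>Under these assumptions, the same liability shift gives risk ratios of the same order at both baselines, but a noticeably larger absolute risk difference at the higher baseline, which is the pattern described for individuals with an affected sibling.</p>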
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>We observed that the genotypes of relatives of known phenotype are informative for an individual's risk, independent of the same risk variants measured in the index individual. We sought to determine whether this phenomenon could be of use in the context of genetic disease risk prediction. We described and evaluated a prediction model for individuals with one or more affected first-degree relatives. Our model has the key feature of incorporating genotype information from relatives to improve the accuracy of prediction. The basic insight - that affected relatives' genotypes are informative about an individual's risk for a multifactorial, polygenic disease - is not confined to the particular analytic approach presented here and could be used with other prediction methodologies. In this work, we focused on the additive effects of confirmed disease alleles, although others have incorporated other sources of information, including non-genetic risk factors <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp> and interactions between risk factors <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>. To the extent that such risk factors are shared between relatives, the approach outlined here to include information from affected relatives could also be applied in these other contexts. Methodologically, we used a liability threshold model. Others have developed prediction models using logistic regression <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, optimal ROC curves <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>, Bayesian networks <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> and support vector machines <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>, using diverse criteria to evaluate performance in terms of, for example, discrimination, calibration and reclassification <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>. Again, information from affected relatives could in theory be included using any of these approaches. In fact, our approach is conceptually similar to methods in livestock genetics and animal breeding that use genetic marker data for prediction, using all the data and taking into account familial relationships in complex pedigrees <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>. However, in the context of human disease risk prediction, our simulations suggest that, in most cases, only incremental improvements are to be expected, meaning it is unlikely that the overall applicability of a test will be fundamentally altered.</p>
</sec>
<sec>
<st>
<p>Abbreviations</p>
</st>
<p>AUC: area under the curve; CD: Crohn's disease; GRR: genotypic relative risk; MAF: minor allele frequency; MZ: monozygotic; RAF: risk allele frequency; ROC: receiver operating characteristic; SNP: single nucleotide polymorphism; VE: variance explained.</p>
</sec>
<sec>
<st>
<p>Competing interests</p>
</st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>All authors contributed to the conception of this project. SMP and DMR developed and implemented the methods. DMR and SMP designed and performed the simulations. All authors contributed to the drafting of the manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>This work was supported by a NARSAD Young Investigator Award (SMP). We thank Colm O'Dushlaine, James Wilkins, Ben Neale and Mark Daly for helpful discussion.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>A HapMap harvest of insights into the genetics of common disease.</p></title><aug><au><snm>Manolio</snm><fnm>TA</fnm></au><au><snm>Brooks</snm><fnm>LD</fnm></au><au><snm>Collins</snm><fnm>FS</fnm></au></aug><source>J Clin Invest</source><pubdate>2008</pubdate><volume>118</volume><fpage>1590</fpage><lpage>1605</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1172/JCI34772</pubid><pubid idtype="pmcid">2336881</pubid><pubid idtype="pmpid">18451988</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Personal genomes: The case of the missing heritability.</p></title><aug><au><snm>Maher</snm><fnm>B</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>456</volume><fpage>18</fpage><lpage>21</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/456018a</pubid><pubid idtype="pmpid" link="fulltext">18987709</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Prediction of individual genetic risk of complex disease.</p></title><aug><au><snm>Wray</snm><fnm>NR</fnm></au><au><snm>Goddard</snm><fnm>ME</fnm></au><au><snm>Visscher</snm><fnm>PM</fnm></au></aug><source>Curr Opin Genet Dev</source><pubdate>2008</pubdate><volume>18</volume><fpage>257</fpage><lpage>263</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.gde.2008.07.006</pubid><pubid idtype="pmpid" link="fulltext">18682292</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Genome-based prediction of common diseases: advances and prospects.</p></title><aug><au><snm>Janssens</snm><fnm>AC</fnm></au><au><snm>van Duijn</snm><fnm>CM</fnm></au></aug><source>Hum Mol Genet</source><pubdate>2008</pubdate><volume>17</volume><fpage>R166</fpage><lpage>173</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/hmg/ddn250</pubid><pubid idtype="pmpid" link="fulltext">18852206</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Predicting human height by Victorian and genomic 
methods.</p></title><aug><au><snm>Aulchenko</snm><fnm>YS</fnm></au><au><snm>Struchalin</snm><fnm>MV</fnm></au><au><snm>Belonogova</snm><fnm>NM</fnm></au><au><snm>Axenovich</snm><fnm>TI</fnm></au><au><snm>Weedon</snm><fnm>MN</fnm></au><au><snm>Hofman</snm><fnm>A</fnm></au><au><snm>Uitterlinden</snm><fnm>AG</fnm></au><au><snm>Kayser</snm><fnm>M</fnm></au><au><snm>Oostra</snm><fnm>BA</fnm></au><au><snm>van Duijn</snm><fnm>CM</fnm></au><au><snm>Janssens</snm><fnm>AC</fnm></au><au><snm>Borodin</snm><fnm>PM</fnm></au></aug><source>Eur J Hum Genet</source><pubdate>2009</pubdate><volume>17</volume><fpage>1070</fpage><lpage>1075</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ejhg.2009.5</pubid><pubid idtype="pmpid" link="fulltext">19223933</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Improving the prediction of complex diseases by testing for multiple disease-susceptibility genes.</p></title><aug><au><snm>Yang</snm><fnm>Q</fnm></au><au><snm>Khoury</snm><fnm>MJ</fnm></au><au><snm>Botto</snm><fnm>L</fnm></au><au><snm>Friedman</snm><fnm>JM</fnm></au><au><snm>Flanders</snm><fnm>WD</fnm></au></aug><source>Am J Hum Genet</source><pubdate>2003</pubdate><volume>72</volume><fpage>636</fpage><lpage>649</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1086/367923</pubid><pubid idtype="pmcid">1180239</pubid><pubid idtype="pmpid">12592605</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>The relative risk of inflammatory bowel disease among parents and siblings of Crohn's disease patients.</p></title><aug><au><snm>Fielding</snm><fnm>JF</fnm></au></aug><source>J Clin Gastroenterol</source><pubdate>1986</pubdate><volume>8</volume><fpage>655</fpage><lpage>657</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/00004836-198612000-00013</pubid><pubid idtype="pmpid">3805664</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's 
disease.</p></title><aug><au><snm>Barrett</snm><fnm>JC</fnm></au><au><snm>Hansoul</snm><fnm>S</fnm></au><au><snm>Nicolae</snm><fnm>DL</fnm></au><au><snm>Cho</snm><fnm>JH</fnm></au><au><snm>Duerr</snm><fnm>RH</fnm></au><au><snm>Rioux</snm><fnm>JD</fnm></au><au><snm>Brant</snm><fnm>SR</fnm></au><au><snm>Silverberg</snm><fnm>MS</fnm></au><au><snm>Taylor</snm><fnm>KD</fnm></au><au><snm>Barmada</snm><fnm>MM</fnm></au><au><snm>Bitton</snm><fnm>A</fnm></au><au><snm>Dassopoulos</snm><fnm>T</fnm></au><au><snm>Datta</snm><fnm>LW</fnm></au><au><snm>Green</snm><fnm>T</fnm></au><au><snm>Griffiths</snm><fnm>AM</fnm></au><au><snm>Kistner</snm><fnm>EO</fnm></au><au><snm>Murtha</snm><fnm>MT</fnm></au><au><snm>Regueiro</snm><fnm>MD</fnm></au><au><snm>Rotter</snm><fnm>JI</fnm></au><au><snm>Schumm</snm><fnm>LP</fnm></au><au><snm>Steinhart</snm><fnm>AH</fnm></au><au><snm>Targan</snm><fnm>SR</fnm></au><au><snm>Xavier</snm><fnm>RJ</fnm></au><au><snm>Libioulle</snm><fnm>C</fnm></au><au><snm>Sandor</snm><fnm>C</fnm></au><au><snm>Lathrop</snm><fnm>M</fnm></au><au><snm>Belaiche</snm><fnm>J</fnm></au><au><snm>Dewit</snm><fnm>O</fnm></au><au><snm>Gut</snm><fnm>I</fnm></au><au><snm>Heath</snm><fnm>S</fnm></au><etal/></aug><source>Nat Genet</source><pubdate>2008</pubdate><volume>40</volume><fpage>955</fpage><lpage>962</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng.175</pubid><pubid idtype="pmcid">2574810</pubid><pubid idtype="pmpid">18587394</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Defining high-risk individuals in a population-based molecular-epidemiological study of lung cancer.</p></title><aug><au><snm>Cassidy</snm><fnm>A</fnm></au><au><snm>Myles</snm><fnm>JP</fnm></au><au><snm>Liloglou</snm><fnm>T</fnm></au><au><snm>Duffy</snm><fnm>SW</fnm></au><au><snm>Field</snm><fnm>JK</fnm></au></aug><source>Int J Oncol</source><pubdate>2006</pubdate><volume>28</volume><fpage>1295</fpage><lpage>1301</lpage><xrefbib><pubid idtype="pmpid" 
link="fulltext">16596247</pubid></xrefbib></bibl><bibl id="B10"><title><p>Using the optimal receiver operating characteristic curve to design a predictive genetic test, exemplified with type 2 diabetes.</p></title><aug><au><snm>Lu</snm><fnm>Q</fnm></au><au><snm>Elston</snm><fnm>RC</fnm></au></aug><source>Am J Hum Genet</source><pubdate>2008</pubdate><volume>82</volume><fpage>641</fpage><lpage>651</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ajhg.2007.12.025</pubid><pubid idtype="pmcid">2664997</pubid><pubid idtype="pmpid">18319073</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Bayesian and classical estimation of mixed logit: an application to genetic testing.</p></title><aug><au><snm>Regier</snm><fnm>DA</fnm></au><au><snm>Ryan</snm><fnm>M</fnm></au><au><snm>Phimister</snm><fnm>E</fnm></au><au><snm>Marra</snm><fnm>CA</fnm></au></aug><source>J Health Econ</source><pubdate>2009</pubdate><volume>28</volume><fpage>598</fpage><lpage>610</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jhealeco.2008.11.003</pubid><pubid idtype="pmpid" link="fulltext">19345433</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Gene-based multiclass cancer diagnosis with class-selective rejections.</p></title><aug><au><snm>Jrad</snm><fnm>N</fnm></au><au><snm>Grall-Ma&#235;s</snm><fnm>E</fnm></au><au><snm>Beauseroy</snm><fnm>P</fnm></au></aug><source>J Biomed Biotechnol</source><pubdate>2009</pubdate><volume>2009</volume><fpage>608701</fpage><xrefbib><pubidlist><pubid idtype="pmcid">2703706</pubid><pubid idtype="pmpid">19584932</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Risk scores for prediction of coronary heart disease: an update.</p></title><aug><au><snm>Wilson</snm><fnm>PW</fnm></au></aug><source>Endocrinol Metab Clin North Am</source><pubdate>2009</pubdate><volume>38</volume><fpage>33</fpage><lpage>44</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ecl.2008.11.001</pubid><pubid idtype="pmpid" 
link="fulltext">19217511</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Genomic selection.</p></title><aug><au><snm>Goddard</snm><fnm>ME</fnm></au><au><snm>Hayes</snm><fnm>BJ</fnm></au></aug><source>J Anim Breed Genet</source><pubdate>2007</pubdate><volume>124</volume><fpage>323</fpage><lpage>330</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1439-0388.2007.00702.x</pubid><pubid idtype="pmpid" link="fulltext">18076469</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm></art>
