<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1741-7015-9-103</ui><ji>1741-7015</ji><fm>
<dochead>Research article</dochead>
<bibl>
<title>
<p>Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting</p>
</title>
<aug>
<au ca="yes" id="A1"><snm>Collins</snm><mi>S</mi><fnm>Gary</fnm><insr iid="I1"/><email>gary.collins@csm.ox.ac.uk</email></au>
<au id="A2"><snm>Mallett</snm><fnm>Susan</fnm><insr iid="I1"/><email>susan.mallett@csm.ox.ac.uk</email></au>
<au id="A3"><snm>Omar</snm><fnm>Omar</fnm><insr iid="I1"/><email>omar.omar@csm.ox.ac.uk</email></au>
<au id="A4"><snm>Yu</snm><fnm>Ly-Mee</fnm><insr iid="I1"/><email>ly-mee.yu@csm.ox.ac.uk</email></au>
</aug>
<insg>
<ins id="I1"><p>Centre for Statistics in Medicine, University of Oxford, Wolfson College Annexe, Linton Road, Oxford, OX2 6UD, UK</p></ins>
</insg>
<source>BMC Medicine</source>
<issn>1741-7015</issn>
<pubdate>2011</pubdate>
<volume>9</volume>
<issue>1</issue>
<fpage>103</fpage>
<url>http://www.biomedcentral.com/1741-7015/9/103</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1741-7015-9-103</pubid><pubid idtype="pmpid">21902820</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>9</day><month>5</month><year>2011</year></date></rec><acc><date><day>8</day><month>9</month><year>2011</year></date></acc><pub><date><day>8</day><month>9</month><year>2011</year></date></pub></history>
<cpyrt><year>2011</year><collab>Collins et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>The World Health Organisation estimates that by 2030 there will be approximately 350 million people with type 2 diabetes. Because type 2 diabetes is associated with renal complications, heart disease, stroke and peripheral vascular disease, early identification of patients with undiagnosed type 2 diabetes or those at an increased risk of developing type 2 diabetes is an important challenge. We sought to systematically review and critically assess the conduct and reporting of methods used to develop models for predicting the risk of having undiagnosed (prevalent) type 2 diabetes or the future risk of developing (incident) type 2 diabetes in adults.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<p>We conducted a systematic search of PubMed and EMBASE databases to identify studies published before May 2011 that describe the development of models combining two or more variables to predict the risk of prevalent or incident type 2 diabetes. We extracted key information that describes aspects of developing a prediction model including study design, sample size and number of events, outcome definition, risk predictor selection and coding, missing data, model-building strategies and aspects of performance.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>Thirty-nine studies comprising 43 risk prediction models were included. Seventeen studies (44%) reported the development of models to predict incident type 2 diabetes, whilst 15 studies (38%) described the derivation of models to predict prevalent type 2 diabetes. In nine studies (23%), the number of events per variable was less than ten, whilst in fourteen studies there was insufficient information reported for this measure to be calculated. The number of candidate risk predictors ranged from four to sixty-four, and in seven studies it was unclear how many risk predictors were considered. Univariate screening based on statistical significance, a method not recommended for selecting risk predictors for inclusion in the multivariable model, was carried out in eight studies (21%), whilst the selection procedure was unclear in ten studies (26%). Twenty-one risk prediction models (49%) were developed by categorising all continuous risk predictors. The treatment and handling of missing data were not reported in 16 studies (41%).</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>We found widespread use of poor methods that could jeopardise model development, including univariate pre-screening of variables, categorisation of continuous risk predictors and poor handling of missing data. The use of poor methods affects the reliability of the prediction model and ultimately compromises the accuracy of the probability estimates of having undiagnosed type 2 diabetes or the predicted risk of developing type 2 diabetes. In addition, many studies were characterised by a generally poor level of reporting, with many key details needed to objectively judge the usefulness of the models often omitted.</p>
</sec>
</sec>
</abs>
</fm><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>The global incidence of type 2 diabetes is increasing rapidly. The World Health Organisation predicts that the number of people with type 2 diabetes will double to at least 350 million worldwide by 2030 unless appropriate action is taken <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. Diabetes is often associated with renal complications, heart disease, stroke and peripheral vascular disease, which lead to increased morbidity and premature mortality, and individuals with diabetes have mortality rates nearly twice as high as those without diabetes <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>. The growing healthcare burden will therefore present an overwhelming challenge in terms of health service resources around the world. Early identification of patients with undiagnosed type 2 diabetes or those at an increased risk of developing type 2 diabetes is thus a crucial issue to be resolved.</p>
<p>Risk prediction models have considerable potential to contribute to the decision-making process regarding the clinical management of a patient. Typically, they are multivariable, combining several patient risk predictors that are used to predict an individual's treatment outcome. Healthcare interventions or lifestyle changes can then be targeted towards those at an increased risk of developing a disease. Similarly, these models can also be used to screen individuals to identify those at an increased risk of having an undiagnosed condition, for whom diagnosis, management and treatment can be initiated to ultimately improve patient outcomes.</p>
<p>However, despite the large number of risk prediction models being developed, only a very small minority end up being routinely used in clinical practice. The reasons for the uptake of one risk prediction model and not another are unclear, though poor design, conduct and ultimately reporting will inevitably be leading causes for apprehension. Lack of objective and unbiased evaluation (validation) is a clear concern, but also, when performance is evaluated, poor performance data to support the uptake of a risk prediction model can contribute to scepticism regarding the reliability and ultimately the clinical usefulness of a model. How a risk prediction model was originally developed largely dictates its performance.</p>
<p>There is a growing concern that the majority of risk prediction models are poorly developed because they are based on a small and inappropriate selection of the cohort, questionable handling of continuous risk predictors, inappropriate treatment of missing data, use of flawed or unsuitable statistical methods and, ultimately, a lack of transparent reporting of the steps taken to derive the model <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B4">4</abbr>
<abbr bid="B5">5</abbr>
<abbr bid="B6">6</abbr>
<abbr bid="B7">7</abbr>
<abbr bid="B8">8</abbr>
<abbr bid="B9">9</abbr>
<abbr bid="B10">10</abbr>
<abbr bid="B11">11</abbr>
<abbr bid="B12">12</abbr>
</abbrgrp>.</p>
<p>Whilst a number of guidelines in the medical literature exist for the reporting of randomised, controlled trials <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>, observational studies <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>, diagnostic accuracy <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>, systematic reviews and meta-analyses <abbrgrp>
<abbr bid="B16">16</abbr>
</abbrgrp> and tumour marker prognostic studies <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>, there are currently no consensus guidelines for developing and evaluating multivariable risk prediction models in terms of conduct or reporting. Although a number of texts and guidance exist that cover many of the issues in developing a risk prediction model <abbrgrp>
<abbr bid="B18">18</abbr>
<abbr bid="B19">19</abbr>
<abbr bid="B20">20</abbr>
</abbrgrp>, these are spread across the literature and assume varying levels of prior knowledge and expertise. Raising the quality of studies is likely to require a single, concise resource for easy use by authors, peer reviewers and ultimately consumers of risk prediction models to objectively evaluate the reliability and usefulness of new risk prediction models. Furthermore, there is currently no guidance on what aspects of model development and validation should be reported so that readers can objectively judge the value of the prediction model.</p>
<p>The aim of this article is to review the methodological conduct and reporting of articles deriving models for predicting the risk of having undiagnosed (prevalent) type 2 diabetes or the future risk of developing (incident) type 2 diabetes.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<p>We identified articles that presented new models for predicting the risk of having undiagnosed (prevalent) type 2 diabetes or the risk of developing (incident) type 2 diabetes. The PubMed and EMBASE databases were initially searched on 25 February 2010 (a final search was conducted on 13 May 2011). The search string is given in Appendix 1. Articles were restricted to the English-language literature. Searches included articles from all years in the PubMed (from 1965) and EMBASE (from 1980) databases. Additional articles were identified by searching the references in papers identified by the search strategy and our own personal reference lists.</p>
<sec>
<st>
<p>Inclusion criteria</p>
</st>
<p>Articles were included if they met our inclusion criteria: the primary aim of the article had to be the development of a multivariable (more than two variables) risk prediction model for type 2 diabetes (prediabetes, undiagnosed diabetes or incident diabetes). Articles were excluded if (1) they included only validation of a preexisting risk prediction model (that is, the article did not develop a model), (2) the outcome was gestational diabetes, (3) the outcome was type 1 diabetes, (4) participants were children or (5) the authors developed a genetic risk prediction model.</p>
</sec>
<sec>
<st>
<p>Data extraction, analysis and reporting</p>
</st>
<p>One person (GSC) screened the titles and abstracts of all articles identified by the search string to exclude articles not pertaining to risk prediction models. Data items were extracted in duplicate by pairs of reviewers drawn from four reviewers (GSC, SM, LMY and OO). One reviewer (GSC) assessed all articles and all items, whilst the other three reviewers (SM, LMY and OO) collectively assessed all articles. Articles were assigned to these three reviewers in a random manner using variable block randomisation. In articles that presented more than one model, the model that was recommended by the authors was selected. No study protocol is available. Data items extracted for this review include study design, sample size and number of events, outcome definition, risk predictor selection and coding, missing data, model-building strategies and aspects of performance. The data extraction form for this article was based largely on two previous reviews of prognostic models in cancer <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B21">21</abbr>
<abbr bid="B22">22</abbr>
</abbrgrp> and can be obtained on request from the first author (GSC).</p>
<p>For the primary analysis, we calculated the proportion of studies and, where appropriate, the number of risk prediction models for each of the items extracted. We have reported our systematic review in accordance with the PRISMA guidelines <abbrgrp>
<abbr bid="B16">16</abbr>
</abbrgrp>, with the exception of items relating to meta-analysis, as our study includes no formal meta-analysis.</p>
</sec>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>The search string retrieved 779 articles in PubMed and 792 articles in EMBASE, and, after removing duplicates, our database search yielded 799 articles (see Figure <figr fid="F1">1</figr>). Thirty-five articles met our inclusion criteria, and a further four articles were retrieved by hand-searching reference lists or citation searches. In total, 39 studies were eligible for review, among which 32 studies (83%) were published between January 2005 and May 2011. Thirteen studies (33%) were published in <it>Diabetes Care</it>, five studies (13%) were published in <it>Diabetes Research and Clinical Practice</it>, four studies (10%) were published in <it>Diabetic Medicine </it>and three studies (8%) were published in the <it>Annals of Internal Medicine</it>. Four studies reported separate risk prediction models for men and women <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
</abbrgrp>, so our review assesses a total of 43 risk prediction models from 39 articles. The denominator is therefore 39 when reference is made to studies and 43 when reference is made to risk prediction models. The outcomes predicted by the models varied because of different definitions of diabetes and patients included (Tables <tblr tid="T1">1</tblr>, <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr>). Seventeen studies (44%) described a model to predict the development of diabetes (incident diabetes) <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B30">30</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B40">40</abbr>
</abbrgrp>, fifteen (38%) described the development of a model to predict the risk of having undiagnosed diabetes <abbrgrp>
<abbr bid="B41">41</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B53">53</abbr>
</abbrgrp>, four described the development of a prediction model for diagnosed and undiagnosed diabetes <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B55">55</abbr>
</abbrgrp>, one described the development of a prediction model for undiagnosed diabetes and prediabetes <abbrgrp>
<abbr bid="B56">56</abbr>
</abbrgrp>, one described the development of a prediction model for abnormal postchallenge plasma glucose level (defined as &#8805; 140 mg/dL) to predict undiagnosed diabetes <abbrgrp>
<abbr bid="B57">57</abbr>
</abbrgrp> and one described the development of a model to predict the risk of undiagnosed type 2 diabetes and impaired glucose regulation <abbrgrp>
<abbr bid="B58">58</abbr>
</abbrgrp>.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Flow diagram of selected studies</p></caption><text>
   <p><b>Flow diagram of selected studies</b>.</p>
</text><graphic file="1741-7015-9-103-1" hint_layout="double"/></fig>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Models for predicting risk of incident diabetes<sup>a</sup></p></caption><tblbdy cols="5">
      <r>
         <c ca="left">
            <p>
               <b>Study</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Year</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Country</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Definition of diabetes as reported</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Risk predictors in the model</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Aekplakorn <it>et al. </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2006</p>
         </c>
         <c ca="left">
            <p>Thailand</p>
         </c>
         <c ca="left">
            <p>Diabetes diagnosed according to ADA criteria as FPG level &#8805; 126 mg/dL (7.0 mmol/L) or 2-h PG level &#8805; 200 mg/dL (11.1 mmol/L) or a previous diagnosis of diabetes</p>
         </c>
         <c ca="left">
            <p>Age, sex, BMI, abdominal obesity (waist circumference), hypertension, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Balkau <it>et al. </it><abbrgrp><abbr bid="B23">23</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>France</p>
         </c>
         <c ca="left">
            <p>Incident cases of diabetes identified by treatment for diabetes or FPG &#8805; 7.0 mmol/L</p>
         </c>
         <c ca="left">
            <p>Men: waist circumference, smoking status, hypertension.</p>
            <p>Women: waist circumference, family history of diabetes, hypertension.</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Chen <it>et al. </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>Australia</p>
         </c>
         <c ca="left">
            <p>Incident diabetes at follow-up defined by treatment with insulin or oral hypoglycaemic agents, FPG level &#8805; 7.0 mmol/L, or 2-hPG in OGTT &#8805; 11.1 mmol/L</p>
         </c>
         <c ca="left">
            <p>Age, sex, ethnicity, parental history of diabetes, history of high blood glucose, use of antihypertensive medication, smoking status, physical activity, waist circumference</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Chien <it>et al. </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>Taiwan</p>
         </c>
         <c ca="left">
            <p>Diabetes defined by FPG &#8805; 7.0 mmol/L or use of oral hypoglycaemic or insulin medication</p>
         </c>
         <c ca="left">
            <p>Age, BMI, WBC count, and triacylglycerol, HDL cholesterol and FPG levels</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Gao <it>et al. </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>Mauritius</p>
         </c>
         <c ca="left">
            <p>Diabetes diagnosed according to 2006 WHO/IDF criteria. Diabetes cases were defined as those who reported a history of diabetes and treatment with glucose-lowering medication and/or FPG &#8805; 7.0 mmol/L and/or 2-h PG &#8805; 11.1 mmol/L.</p>
         </c>
         <c ca="left">
            <p>Age, sex, BMI, waist circumference, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Gupta <it>et al. </it><abbrgrp><abbr bid="B40">40</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>UK, Ireland, Sweden, Denmark, Iceland, Norway, Finland</p>
         </c>
         <c ca="left">
            <p>FPG &#8805; 7 mmol/L or random glucose &#8805; 11.1 mmol/L at randomisation or screening visits. Self-reported history of diabetes and drug or dietary therapy for diabetes. Presence of both impaired FPG (&gt; 6 and &lt; 7 mmol/L) and glycosuria at randomisation or screening visits.</p>
         </c>
         <c ca="left">
            <p>Age, sex, FPG, BMI, randomised group, triglycerides, systolic blood pressure, total cholesterol, use of non-coronary artery disease medication, HDL cholesterol, alcohol intake</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Hippisley-Cox <it>et al. </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>UK</p>
         </c>
         <c ca="left">
            <p>Patients with diabetes identified by searching electronic health records for diagnosis Read code for diabetes (C10%)</p>
         </c>
         <c ca="left">
            <p>Age, BMI, family history of diabetes, smoking status, treated hypertension, current treatment with corticosteroids, diagnosis of CVD, social deprivation, ethnicity</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Kahn <it>et al. </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Participants were considered to have diabetes if they reported a history of physician-diagnosed 'diabetes (sugar in the blood)' or if their FPG level was &#8805; 7.0 mmol/L (&#8805; 126 mg/dL), their non-FPG level was at least 11.1 mmol/L (&#8805; 200 mg/dL), or their 2-h PG at year 9 follow-up was &#8805; 11.1 mmol/L (&#8805; 200 mg/dL). Additional cases of incident diabetes were identified by criteria-based abstractions of hospital records.</p>
         </c>
         <c ca="left">
            <p>Diabetic mother, diabetic father, hypertension, ethnicity, age, smoking status, waist circumference (sex), height (sex), resting pulse (sex), weight (sex)</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Kolberg <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>Denmark</p>
         </c>
         <c ca="left">
            <p>Diagnosis of type 2 diabetes was defined by 2-h PG &#8805; 11.1 mmol/L on OGTT or FPG &#8805; 7.0 mmol/L</p>
         </c>
         <c ca="left">
            <p>Adiponectin, C-reactive protein, ferritin, interleukin 2 receptor A, glucose, insulin</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Lindstr&#246;m <it>et al. </it><abbrgrp><abbr bid="B33">33</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2003</p>
         </c>
         <c ca="left">
            <p>Finland</p>
         </c>
         <c ca="left">
            <p>Subjects not on antidiabetic drug treatment were diagnosed as having diabetes according to WHO 1999 criteria <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> if they had FPG &#8805; 7.0 mmol/L (fasting whole blood glucose &#8805; 6.1 mmol/L) and/or 2-h PG &#8805; 11.1 mmol/L (2-h whole blood glucose &#8805; 10.0 mmol/L)</p>
         </c>
         <c ca="left">
            <p>Age, BMI, waist circumference, use of blood pressure medication, history of high blood glucose, physical activity, daily consumption of vegetables</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Liu <it>et al. </it><abbrgrp><abbr bid="B61">61</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2011</p>
         </c>
         <c ca="left">
            <p>China</p>
         </c>
         <c ca="left">
            <p>Diabetes was diagnosed according to ADA criteria as FPG &#8805; 126 mg/dL (7.0 mmol/L) or OGTT &#8805; 200 mg/dL (11.1 mmol/L). Incident diabetes was ascertained from multiple sources: self-report, FPG and OGTT results, and data on prescribing of hypoglycaemic medication at follow-up survey.</p>
         </c>
         <c ca="left">
            <p>Age, hypertension, history of high blood glucose, BMI, high FPG</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Schmidt <it>et al. </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2005</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Incident diabetes defined by OGTT (FPG &#8805; 7.0 mmol/L or a 2-h PG &#8805; 11.1 mmol/L) at end of follow-up (1996 to 1998) or as report of clinical diagnosis or treatment for diabetes during follow-up period</p>
         </c>
         <c ca="left">
            <p>Age, ethnicity, parental history of diabetes, FPG, systolic blood pressure, waist circumference, height, HDL cholesterol, triglycerides</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Schulze <it>et al. </it><abbrgrp><abbr bid="B35">35</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2007</p>
         </c>
         <c ca="left">
            <p>Germany</p>
         </c>
         <c ca="left">
            <p>Incident diabetes identified through August 2005 by self-reports of diabetes diagnosis, diabetes relevant medication or dietary treatment due to diabetes. All cases were verified by diagnosing physician on basis of ICD-10 criteria.</p>
         </c>
         <c ca="left">
            <p>Waist circumference, height, age, hypertension, intake of red meat, intake of whole-grain bread, coffee consumption, alcohol consumption, physical activity, former smoker, current heavy smoker (&#8805; 20 cigarettes/day)</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Stern <it>et al. </it><abbrgrp><abbr bid="B36">36</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2002</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Diabetes diagnosed according to WHO criteria (FPG &#8805; 7.0 mmol/L (&#8805; 126 mg/dL) or 2-h PG &#8805; 11.1 mmol/L (&#8805; 200 mg/dL)) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Persons who reported history of diabetes diagnosed by physician and reported current use of insulin or oral antidiabetic agent were considered to have diabetes regardless of plasma glucose level.</p>
         </c>
         <c ca="left">
            <p>Age, sex, ethnicity, FPG, systolic blood pressure, HDL cholesterol, BMI, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Sun <it>et al. </it><abbrgrp><abbr bid="B37">37</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>Taiwan</p>
         </c>
         <c ca="left">
            <p>Not defined</p>
         </c>
         <c ca="left">
            <p>Sex, education level, age, current smoking status, BMI, waist circumference, family history of diabetes, hypertension, FPG</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Tuomilehto <it>et al. </it><abbrgrp><abbr bid="B38">38</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>Canada, Germany, Austria, Norway, Denmark, Sweden, Finland, Israel, Spain</p>
         </c>
         <c ca="left">
            <p>Primary end point was development of type 2 diabetes, defined as a 2-h PG &#8805; 11.1 mmol/L</p>
         </c>
         <c ca="left">
            <p>Acarbose treatment, sex, serum triglyceride level, waist circumference, FPG, height, history of CVD, diagnosed hypertension</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Wilson <it>et al. </it><abbrgrp><abbr bid="B39">39</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2007</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Participants characterised as developing new diabetes during follow-up if they (1) started receiving oral hypoglycaemic agents or insulin or (2) had a FPG &#8805; 126 mg/dL (&#8805; 7.0 mmol/L)</p>
         </c>
         <c ca="left">
            <p>FPG, BMI, HDL cholesterol, parental history of diabetes, triglyceride level, blood pressure</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>ADA, American Diabetes Association; BMI, body mass index; WBC, white blood cell; HDL, high-density lipoprotein; WHO/IDF, World Health Organisation/International Diabetes Federation; FPG, fasting plasma glucose; OGTT, oral glucose tolerance test; ICD-10, International Statistical Classification of Diseases and Related Health Problems 10th Revision; CVD, cardiovascular disease; 2-h PG, two-hour 75-g postload plasma glucose level.</p>
   </tblfn></tbl>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Models for predicting risk of prevalent (undiagnosed) diabetes<sup>a</sup></p></caption><tblbdy cols="5">
      <r>
         <c ca="left">
            <p>
               <b>Study</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Year</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Country</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Definition of diabetes as reported</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Risk predictors in the model</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="5">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Al Khalaf <it>et al. </it><abbrgrp><abbr bid="B60">60</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>Kuwait</p>
         </c>
         <c ca="left">
            <p>Diagnosis of diabetes based on ADA 2003 criteria. If FPG was &#8805; 7.0 mmol/L or random glucose was &#8805; 11.1 mmol/L, participants were classified as having newly diagnosed diabetes.</p>
         </c>
         <c ca="left">
            <p>Age, waist circumference, blood pressure medication, diabetes in sibling</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Al-Lawati <it>et al. </it><abbrgrp><abbr bid="B41">41</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2007</p>
         </c>
         <c ca="left">
            <p>Oman</p>
         </c>
         <c ca="left">
            <p>Diabetes was diagnosed according to 1998 WHO criteria for OGTT (FPG &#8805; 7.0 mmol/L or &#8805; 11.1 mmol/L 2-h post 75-g glucose load)</p>
         </c>
         <c ca="left">
            <p>Age, waist circumference, BMI, family history of diabetes, hypertension</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Baan <it>et al. </it><abbrgrp><abbr bid="B42">42</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>1999</p>
         </c>
         <c ca="left">
            <p>The Netherlands</p>
         </c>
         <c ca="left">
            <p>Diabetes defined as use of antidiabetic medication (insulin or oral hypoglycaemic medication) and/or 2-h PG &#8805; 11.1 mmol/L according to WHO criteria</p>
         </c>
         <c ca="left">
            <p>Age, sex, use of antihypertensive medication, obesity (BMI &#8805; 30)</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Bang <it>et al. </it><abbrgrp><abbr bid="B43">43</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Undiagnosed diabetes defined as FPG &#8805; 7.0 mmol/L (&#8805; 126 mg/dL)</p>
         </c>
         <c ca="left">
            <p>Age, sex, family history of diabetes, history of hypertension, obesity (BMI or waist circumference), physical activity</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Borrell <it>et al. </it><abbrgrp><abbr bid="B59">59</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2007</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>FPG &#8805; 126 mg/dL</p>
         </c>
         <c ca="left">
            <p>Age, sex, ethnicity, family history of diabetes, self-reported hypertension, hypercholesterolaemia, periodontal disease</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Chaturvedi <it>et al. </it><abbrgrp><abbr bid="B44">44</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>India</p>
         </c>
         <c ca="left">
            <p>Undiagnosed diabetes defined as those with FPG &#8805; 126 mg/dL (&#8805; 7.0 mmol/L) but who were not aware of their glycaemic status</p>
         </c>
         <c ca="left">
            <p>Age, blood pressure, waist circumference, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Gao <it>et al. </it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>China</p>
         </c>
         <c ca="left">
            <p>Diabetes defined according to 2006 WHO/IDF criteria. In individuals without known diabetes, undiagnosed diabetes was determined if person had FPG &#8805; 7.0 mmol/L and/or postchallenge PG &#8805; 11.1 mmol/L</p>
         </c>
         <c ca="left">
            <p>Age, waist circumference, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Gl&#252;mer <it>et al. </it><abbrgrp><abbr bid="B46">46</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2004</p>
         </c>
         <c ca="left">
            <p>Denmark</p>
         </c>
         <c ca="left">
            <p>Individuals without known diabetes and with FPG &#8805; 7.0 mmol/L or 2-h PG &#8805; 11.1 mmol/L defined as having SDM</p>
         </c>
         <c ca="left">
            <p>Age, BMI, sex, known hypertension, physical activity, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Keesukphan <it>et al. </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2007</p>
         </c>
         <c ca="left">
            <p>Thailand</p>
         </c>
         <c ca="left">
            <p>75-g OGTT carried out as outlined by WHO Diabetes Study Group</p>
         </c>
         <c ca="left">
            <p>Age, BMI, history of hypertension</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Ko <it>et al. </it><abbrgrp><abbr bid="B48">48</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>Hong Kong</p>
         </c>
         <c ca="left">
            <p>All subjects underwent 75-g OGTT using 1998 WHO criteria (FPG &#8805; 7.0 mmol/L and/or 2-h PG &#8805; 11.1 mmol/L)</p>
         </c>
         <c ca="left">
            <p>Age, sex, BMI, hypertension, dyslipidaemia, family history of diabetes, gestational diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Mohan <it>et al. </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2005</p>
         </c>
         <c ca="left">
            <p>India</p>
         </c>
         <c ca="left">
            <p>Diagnosis of diabetes based on WHO Consulting Group criteria, that is, 2-hr PG &#8805; 200 mg/dL</p>
         </c>
         <c ca="left">
            <p>Age, abdominal obesity (waist circumference), physical activity, family history of diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Pires de Sousa <it>et al. </it><abbrgrp><abbr bid="B50">50</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2009</p>
         </c>
         <c ca="left">
            <p>Brazil</p>
         </c>
         <c ca="left">
            <p>FPG > 126 mg/dL (7.0 mmol/L), that is, provisional diagnosis of diabetes according to ADA criteria, classified as type 2 diabetes patients</p>
         </c>
         <c ca="left">
            <p>Age, BMI, hypertension</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Ramachandran <it>et al. </it><abbrgrp><abbr bid="B51">51</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2005</p>
         </c>
         <c ca="left">
            <p>India</p>
         </c>
         <c ca="left">
            <p>Diabetes diagnosis based on 2-h PG &#8805; 11.1 mmol/L</p>
         </c>
         <c ca="left">
            <p>Age, family history of diabetes, BMI, waist circumference, physical activity</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Ruige <it>et al. </it><abbrgrp><abbr bid="B52">52</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>1997</p>
         </c>
         <c ca="left">
            <p>The Netherlands</p>
         </c>
         <c ca="left">
            <p>Participants underwent 75-g OGTT and were classified according to WHO criteria</p>
         </c>
         <c ca="left">
            <p>Frequent thirst, pain during walking with need to slow down, shortness of breath when walking, age, sex, obesity (BMI), obesity (men), family history of diabetes, use of antihypertensive drugs, reluctance to use bicycle for transportation</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Tabaei and Herman <abbrgrp><abbr bid="B53">53</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2002</p>
         </c>
         <c ca="left">
            <p>Egypt</p>
         </c>
         <c ca="left">
            <p>Undiagnosed diabetes defined based on FPG &#8805; 126 mg/dL and/or 2-h PG &#8805; 200 mg/dL</p>
         </c>
         <c ca="left">
            <p>Age, random plasma glucose, postprandial time, sex, BMI</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>SDM, screen-detected diabetes; ADA, American Diabetes Association; BMI, body mass index; WHO/IDF, World Health Organisation/International Diabetes Federation; FPG, fasting plasma glucose; OGTT, oral glucose tolerance test; 2-h PG, two-hour 75-g postload plasma glucose level.</p>
   </tblfn></tbl>
<tbl id="T3"><title><p>Table 3</p></title><caption><p>Models for predicting risk of other diabetes outcomes<sup>a</sup></p></caption><tblbdy cols="6">
      <r>
         <c ca="left">
            <p>
               <b>Study</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Year</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Country</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Model objective (undiagnosed or incident diabetes)</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Definition of diabetes as reported</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Risk predictors in the model</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Bindraban <it>et al. </it><abbrgrp><abbr bid="B54">54</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>The Netherlands</p>
         </c>
         <c ca="left">
            <p>Diagnosed and undiagnosed</p>
         </c>
         <c ca="left">
            <p>FPG &#8805; 7.0 mmol/L and/or self-report</p>
         </c>
         <c ca="left">
            <p>Age, BMI, waist circumference, resting heart rate, first-degree relative with diabetes, hypertension, history of CVD, ethnicity</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Cabrera de Le&#243;n <it>et al. </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>Canary Islands</p>
         </c>
         <c ca="left">
            <p>Unclear</p>
         </c>
         <c ca="left">
            <p>Persons recorded as having diabetes if they said they had the disease and reported dietary or pharmacological treatment with oral antidiabetics or insulin. Persons were considered to have undetected type 2 diabetes if they were unaware of disease at time of inclusion in study but had two consecutive FPG values &#8805; 7 mmol/L (&#8805; 126 mg/dL).</p>
         </c>
         <c ca="left">
            <p>Men: age, waist/height ratio, family history of diabetes</p>
            <p>Women: age, waist/height ratio, family history of diabetes, gestational diabetes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Gray <it>et al. </it><abbrgrp><abbr bid="B58">58</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>UK</p>
         </c>
         <c ca="left">
            <p>Undiagnosed and impaired glucose regulation</p>
         </c>
         <c ca="left">
            <p>Participants diagnosed with type 2 diabetes according to WHO criteria <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> with FPG &#8805; 7 mmol/L and/or 2-h PG &#8805; 11.1 mmol/L. IFG defined as FPG between 6.1 and 6.9 mmol/L inclusive.</p>
         </c>
         <c ca="left">
            <p>Age, ethnicity, sex, first-degree family history of diabetes, antihypertensive therapy or history of hypertension, waist circumference, BMI</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Griffin <it>et al. </it><abbrgrp><abbr bid="B55">55</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2000</p>
         </c>
         <c ca="left">
            <p>UK</p>
         </c>
         <c ca="left">
            <p>Diagnosed and undiagnosed</p>
         </c>
         <c ca="left">
            <p>Classified according to WHO criteria</p>
         </c>
         <c ca="left">
            <p>Sex, prescribed antihypertensive medication, prescribed steroids, age, BMI, family history of diabetes, smoking status</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Heikes <it>et al. </it><abbrgrp><abbr bid="B56">56</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2008</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Undiagnosed and pre-diabetes</p>
         </c>
         <c ca="left">
            <p>Diabetes is defined as FPG &#8805; 126 mg/dL and/or 2-h OGTT &#8805; 200 mg/dL. Prediabetes defined as IFG and/or IGT without diabetes. Undiagnosed diabetes defined as presence of actual diabetes based on FPG and/or 2-h OGTT and absence of having been told that he or she has diabetes.</p>
         </c>
         <c ca="left">
            <p>Age, waist circumference, history of gestational diabetes, family history of diabetes, ethnicity, high blood pressure, weight, height, parental diabetes, exercise</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Kanaya <it>et al. </it><abbrgrp><abbr bid="B57">57</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2005</p>
         </c>
         <c ca="left">
            <p>USA</p>
         </c>
         <c ca="left">
            <p>Abnormal PCPG</p>
         </c>
         <c ca="left">
            <p>Abnormal 2-h PG postchallenge test result (&#8805; 140 mg/dL)</p>
         </c>
         <c ca="left">
            <p>Sex, age, triglycerides, FPG</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Xie <it>et al. </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp></p>
         </c>
         <c ca="center">
            <p>2010</p>
         </c>
         <c ca="left">
            <p>China</p>
         </c>
         <c ca="left">
            <p>Diagnosed and undiagnosed</p>
         </c>
         <c ca="left">
            <p>Participants without a previous diagnosis of diabetes were categorised according to the ADA diagnostic criteria as follows: undiagnosed diabetes (FPG &#8805; 7.0 mmol/L) and impaired fasting glycaemia (6.1 to 6.9 mmol/L). Diabetes was defined as self-reported history of diabetes plus undiagnosed diabetes.</p>
         </c>
         <c ca="left">
            <p>Men: waist circumference, age</p>
            <p>Women: waist/hip ratio, age</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>ADA, American Diabetes Association; BMI, body mass index; WHO, World Health Organisation; FPG, fasting plasma glucose; OGTT, oral glucose tolerance test; CVD, cardiovascular disease; 2-h PG, two-hour 75-g postload plasma glucose level; IGT, impaired glucose tolerance; IFG, impaired fasting glucose.</p>
   </tblfn></tbl>
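<p>The definitions in Tables 2 and 3 report glucose thresholds in either mmol/L or mg/dL. The following minimal sketch shows how the two units relate, assuming a conversion factor of about 18.016 mg/dL per mmol/L (derived from a molar mass of glucose of roughly 180.16 g/mol). The function names are hypothetical and are not taken from any of the reviewed studies.</p>

```python
# Approximate conversion factor (an assumption of this sketch):
# 1 mmol/L of glucose is about 18.016 mg/dL.
MGDL_PER_MMOLL = 18.016


def mmol_to_mgdl(mmol_per_l):
    """Convert a glucose concentration from mmol/L to mg/dL."""
    return mmol_per_l * MGDL_PER_MMOLL


def mgdl_to_mmol(mg_per_dl):
    """Convert a glucose concentration from mg/dL to mmol/L."""
    return mg_per_dl / MGDL_PER_MMOLL


# Thresholds that recur in the tables: FPG of 7.0 mmol/L and 2-h PG of 11.1 mmol/L.
for mmol in (7.0, 11.1):
    print(mmol, "mmol/L is about", round(mmol_to_mgdl(mmol)), "mg/dL")
```

<p>Under this assumed factor, the fasting and two-hour thresholds quoted in mmol/L correspond to the 126 mg/dL and 200 mg/dL values quoted elsewhere in the tables.</p>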
<p>In terms of geography, all but two risk prediction models were developed using patient data from single countries <abbrgrp>
<abbr bid="B38">38</abbr>
<abbr bid="B40">40</abbr>
</abbrgrp>. Eight articles (21%) were from the USA <abbrgrp>
<abbr bid="B31">31</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B59">59</abbr>
</abbrgrp>, thirteen articles (33%) were from Europe <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B40">40</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B55">55</abbr>
</abbrgrp>, thirteen articles (33%) were from Asia <abbrgrp>
<abbr bid="B26">26</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B60">60</abbr>
</abbrgrp>, two were from Africa <abbrgrp>
<abbr bid="B30">30</abbr>
<abbr bid="B53">53</abbr>
</abbrgrp>, one was from Australia <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp> and one was from Brazil <abbrgrp>
<abbr bid="B50">50</abbr>
</abbrgrp>.</p>
<sec>
<st>
<p>Number of patients and events</p>
</st>
<p>The number of participants included in developing risk prediction models was clearly reported in 35 (90%) studies. In the four studies where this was not clearly reported, the number of events was also not reported <abbrgrp>
<abbr bid="B26">26</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B56">56</abbr>
</abbrgrp>. The median number of participants included in model development was 2,562 (interquartile range (IQR) 1,426 to 4,965). One particular study that included 2.54 million general practice patients used separate models for men (1.26 million) and women (1.28 million) <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp>. Six studies (15%) did not report the number of events in the analysis <abbrgrp>
<abbr bid="B26">26</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>. Where the number of events was recorded, the median number of events used to develop the models was 205 (IQR 135 to 420).</p>
</sec>
<sec>
<st>
<p>Number of risk predictors</p>
</st>
<p>The number of candidate risk predictors was not reported or was unclear in seven studies <abbrgrp>
<abbr bid="B27">27</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B60">60</abbr>
</abbrgrp>. The median number of candidate risk predictors was 14 (IQR 9 to 19, range 4 to 64). The rationales or references for including risk predictors were provided in 13 studies <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>. The final reported prediction models included a median of six risk predictors (IQR 4 to 8, range 2 to 11). In total, 47 different risk predictors were included in the final risk prediction models (see Figure <figr fid="F2">2</figr>). The most commonly identified risk predictors included in the final risk prediction model were age (<it>n </it>= 38), family history of diabetes (<it>n </it>= 28), body mass index (<it>n </it>= 24), hypertension (<it>n </it>= 24), waist circumference (<it>n </it>= 21) and sex (<it>n </it>= 17). Other commonly identified risk predictors included ethnicity and fasting glucose level (both <it>n </it>= 10) and smoking status and physical activity (both <it>n </it>= 8). Twenty-four risk predictors appeared only once in the final risk prediction model.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Frequency of identified risk predictors in the final prediction models</p></caption><text>
   <p><b>Frequency of identified risk predictors in the final prediction models</b>. * Other risk predictors appearing no more than twice in the final model: (1) white blood cell count, (2) dyslipidaemia, (3) adiponectin, (4) C-reactive protein, (5) ferritin, (6) interleukin-2 receptor A, (7) insulin, (8) glucose, (9) vegetable consumption, (10) frequent thirst, (11) pain during walking, (12) shortness of breath, (13) reluctance to use bicycle, (14) total cholesterol, (15) intake of red meat, (16) intake of whole-grain bread, (17) coffee consumption, (18) educational level, (19) postprandial time, (20) non-coronary artery disease medication, (21) acarbose treatment, (22) hypercholesterolemia, (23) periodontal disease, (24) RCT group <it>[1-24 all appear only once]</it>, (25) alcohol consumption, (26) resting heart rate, (27) weight, (28) social deprivation <it>[25-28 appear twice]</it>. Abbreviations: WHR = waist-to-hip ratio; HDL = High density lipoprotein; GDB = Gestational diabetes.</p>
</text><graphic file="1741-7015-9-103-2" hint_layout="double"/></fig>
</sec>
<sec>
<st>
<p>Sample size</p>
</st>
<p>The number of events per variable could not be calculated for 14 models. Nine risk prediction models (21%) were developed in which the number of events per variable was &lt; 10. Overall, the median number of events per variable was 19 (IQR 8 to 36, range 2.5 to 4,796).</p>
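<p>As a minimal sketch of the calculation behind these figures, the following example assumes that events per variable (EPV) is the number of outcome events divided by the number of candidate risk predictors, which is a common convention. The function names are hypothetical. The first example borrows the median number of events (205) and the median number of candidate predictors (14) reported above purely as illustrative inputs; the result is not expected to match the reported median EPV of 19, which is the median of the per-model ratios.</p>

```python
def events_per_variable(n_events, n_candidate_predictors):
    """Events per variable (EPV): outcome events divided by candidate predictors."""
    if not n_candidate_predictors > 0:
        raise ValueError("at least one candidate predictor is required")
    return n_events / n_candidate_predictors


def below_threshold(epv, threshold=10.0):
    """True when EPV falls under the threshold (10 is the cut-off quoted in the text)."""
    return threshold > epv


# Illustrative model: 205 events and 14 candidate predictors.
epv = events_per_variable(205, 14)
print(round(epv, 1), below_threshold(epv))

# Hypothetical small study: 60 events and 24 candidate predictors.
small_epv = events_per_variable(60, 24)
print(small_epv, below_threshold(small_epv))
```

<p>The second, hypothetical study falls well under the cut-off of 10 and would be counted among the models flagged above.</p>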
</sec>
<sec>
<st>
<p>Treatment of continuous risk predictors</p>
</st>
<p>Thirteen prediction models (30%) were developed retaining continuous risk predictors as continuous, twenty-one risk prediction models (49%) dichotomised or categorised all continuous risk predictors and six risk prediction models (14%) kept some continuous risk predictors as continuous and categorised others (Table <tblr tid="T4">4</tblr>). It was unclear how continuous risk predictors were treated in the development of three risk prediction models (7%). Only five studies (13%) considered nonlinear terms <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B40">40</abbr>
</abbrgrp>, of which only the QDScore Diabetes Risk Calculator included nonlinear terms in the final prediction model <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp>.</p>
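<p>The practical difference between these approaches can be sketched with a toy logistic model. The coefficients below are hypothetical and chosen only for illustration; they are not estimates from any reviewed study. The cut-off of BMI 30 follows the obesity definition that appears in Table 2.</p>

```python
import math


def logistic(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical coefficients, for illustration only.
INTERCEPT_CONTINUOUS = -6.0
BETA_BMI = 0.12              # log-odds per kg/m^2 when BMI is kept continuous
INTERCEPT_DICHOTOMISED = -3.0
BETA_OBESE = 1.0             # log-odds for BMI of 30 or more versus under 30


def risk_continuous(bmi):
    """Predicted risk when BMI enters the model as a continuous term."""
    return logistic(INTERCEPT_CONTINUOUS + BETA_BMI * bmi)


def risk_dichotomised(bmi):
    """Predicted risk when BMI is reduced to an obese / not obese indicator."""
    obese = 1 if bmi >= 30 else 0
    return logistic(INTERCEPT_DICHOTOMISED + BETA_OBESE * obese)


for bmi in (29.9, 30.0, 45.0):
    print(bmi, round(risk_continuous(bmi), 3), round(risk_dichotomised(bmi), 3))
```

<p>In the dichotomised version, a BMI of 30 and a BMI of 45 receive the same predicted risk, whereas a BMI of 29.9 and a BMI of 30 are treated as different; the continuous version changes smoothly across the same values.</p>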
<tbl id="T4"><title><p>Table 4</p></title><caption><p>Issues in model development<sup>a</sup></p></caption><tblbdy cols="2">
      <r>
         <c ca="left">
            <p>
               <b>Variables</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Data</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="2">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Sample size, median (IQR)</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Development cohort<sup>b</sup></p>
         </c>
         <c ca="left">
            <p>2,562 (1,426 to 4,965)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Validation cohorts<sup>c</sup></p>
         </c>
         <c ca="left">
            <p>1,895 (1,253 to 4,398)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Treatment of continuous risk predictors, <it>n </it>(%)</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>All kept continuous</p>
         </c>
         <c ca="left">
            <p>13 (30%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>All categorised/dichotomised</p>
         </c>
         <c ca="left">
            <p>21 (49%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Some categorised, some not</p>
         </c>
         <c ca="left">
            <p>6 (14%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Unclear</p>
         </c>
         <c ca="left">
            <p>3 (7%)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Treatment of missing data, <it>n </it>(%)</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Not mentioned</p>
         </c>
         <c ca="left">
            <p>16 (41%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Complete case</p>
         </c>
         <c ca="left">
            <p>21 (54%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Multiple imputation</p>
         </c>
         <c ca="left">
            <p>1 (3%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Other (for example, surrogate splitter for regression trees)</p>
         </c>
         <c ca="left">
            <p>1 (3%)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Model-building strategy, <it>n </it>(%)</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Stepwise, forward selection, backward elimination</p>
         </c>
         <c ca="left">
            <p>20 (51%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>All significant in univariate analysis</p>
         </c>
         <c ca="left">
            <p>2 (5%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Other</p>
         </c>
         <c ca="left">
            <p>12 (31%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Unclear</p>
         </c>
         <c ca="left">
            <p>5 (13%)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Overfitting mentioned or discussed, <it>n </it>(%)</p>
         </c>
         <c ca="left">
            <p>5 (13%)</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>IQR, interquartile range; <sup>b</sup>sample size not reported in four studies; <sup>c</sup>sample size not reported in two studies and unclear in one study.</p>
   </tblfn></tbl>
</sec>
<sec>
<st>
<p>Missing data</p>
</st>
<p>Twenty-three studies (59%) made reference to missing data in developing the risk prediction model, of which twenty-one studies explicitly excluded individuals with missing data regarding one or more risk predictors (often a specified inclusion criterion), thereby rendering them complete case analyses <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B30">30</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B40">40</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B58">58</abbr>
<abbr bid="B61">61</abbr>
</abbrgrp>. One study derived the model using a complete case approach, though it included a sensitivity analysis to examine the impact of missing data <abbrgrp>
<abbr bid="B58">58</abbr>
</abbrgrp>. One study used multiple imputations to replace missing values for two risk predictors <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp>. One study developed risk prediction models using two different approaches (logistic regression and classification trees) and used surrogate splitters to deal with missing data in the classification trees; the approach to missing data in the logistic regression analyses was not reported, in which case a complete case analysis was most likely. Sixteen studies (41%) made no mention of missing data (Table <tblr tid="T4">4</tblr>); thus, it can only be assumed that a complete case analysis was conducted or that all data for all risk predictors (including candidate risk predictors) were available, which seems unlikely <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B53">53</abbr>
<abbr bid="B55">55</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B59">59</abbr>
<abbr bid="B60">60</abbr>
</abbrgrp>.</p>
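<p>To make the term concrete, the following minimal sketch shows what a complete case analysis does to a data set. The records, field names and helper functions are hypothetical; <it>None</it> marks a missing value.</p>

```python
# Hypothetical participant records; None marks a missing value.
records = [
    {"age": 54, "bmi": 31.2, "waist_cm": 104, "family_history": True},
    {"age": 61, "bmi": None, "waist_cm": 98, "family_history": False},
    {"age": 47, "bmi": 27.5, "waist_cm": None, "family_history": None},
    {"age": 58, "bmi": 29.0, "waist_cm": 95, "family_history": True},
]
PREDICTORS = ("age", "bmi", "waist_cm", "family_history")


def complete_cases(rows, predictors):
    """Keep only rows with an observed value for every listed predictor."""
    return [row for row in rows if all(row[p] is not None for p in predictors)]


def missing_counts(rows, predictors):
    """Count missing values per predictor, useful for transparent reporting."""
    return {p: sum(row[p] is None for row in rows) for p in predictors}


kept = complete_cases(records, PREDICTORS)
print(len(kept), "of", len(records), "records retained")
print(missing_counts(records, PREDICTORS))
```

<p>In this toy data set, half of the records are discarded even though each predictor is missing for only one participant, which illustrates how quickly a complete case approach can reduce the sample available for model development.</p>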
</sec>
<sec>
<st>
<p>Model building</p>
</st>
<p>Eight studies (21%) reported using bivariable screening (often referred to as 'univariate screening') to reduce the number of risk predictors <abbrgrp>
<abbr bid="B32">32</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B54">54</abbr>
</abbrgrp>, whilst it was unclear how the risk predictors were reduced prior to development of the multivariable model in nine studies (23%) <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B55">55</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>. Two studies reported examining the association of individual risk predictors with patient outcome after adjusting for age and sex <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp> and age and cohort <abbrgrp>
<abbr bid="B30">30</abbr>
</abbrgrp>. Nine studies (23%) included all risk predictors in the multivariable analysis <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B53">53</abbr>
<abbr bid="B61">61</abbr>
</abbrgrp>.</p>
<p>Twenty-two studies (56%) reported using automated variable selection (forward selection, backward elimination and stepwise) procedures to derive the final multivariable model (Table <tblr tid="T4">4</tblr>). Nine studies (23%) reported using backward elimination <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B57">57</abbr>
</abbrgrp>, seven studies (18%) reported using forward selection <abbrgrp>
<abbr bid="B34">34</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B40">40</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B55">55</abbr>
<abbr bid="B60">60</abbr>
</abbrgrp> whilst six studies (15%) used stepwise selection methods <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>.</p>
<p>All studies clearly identified the type of model they used to derive the prediction model. The final models were based on logistic regression in 29 articles, the Cox proportional hazards model in 7 articles <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B30">30</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B40">40</abbr>
</abbrgrp>, recursive partitioning in 2 articles <abbrgrp>
<abbr bid="B26">26</abbr>
<abbr bid="B56">56</abbr>
</abbrgrp> and a Weibull parametric survival model in 1 article <abbrgrp>
<abbr bid="B31">31</abbr>
</abbrgrp>. Two studies used two modelling approaches (logistic regression and Cox proportional hazards model <abbrgrp>
<abbr bid="B39">39</abbr>
</abbrgrp> and logistic regression and recursive partitioning <abbrgrp>
<abbr bid="B56">56</abbr>
</abbrgrp>).</p>
<p>Twenty-five risk prediction models (58%) considered interactions in developing the model; however, this was not explicitly stated for seven of these risk prediction models. Three studies clearly stated that they did not consider interactions to keep the risk prediction model simple, yet all three models implicitly included a waist circumference by sex interaction in their definition of obesity <abbrgrp>
<abbr bid="B33">33</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B44">44</abbr>
</abbrgrp>. Two studies examined over 20 interactions <abbrgrp>
<abbr bid="B36">36</abbr>
<abbr bid="B43">43</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Validation</p>
</st>
<p>Ten studies (26%) randomly split the cohort into development and validation cohorts <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B30">30</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B55">55</abbr>
</abbrgrp> (Table <tblr tid="T5">5</tblr>). Eight of these studies split the original cohort equally into development and validation cohorts. Twenty-one studies (54%) conducted and published an external validation of their risk prediction models within the same article <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B53">53</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>, and eight of these studies used two or more data sets in an attempt to demonstrate the external validity (that is, generalisability) of the risk prediction model.</p>
<tbl id="T5"><title><p>Table 5</p></title><caption><p>Evaluating performance of risk prediction models<sup>a</sup></p></caption><tblbdy cols="2">
      <r>
         <c ca="left">
            <p>
               <b>Parameter</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Number of studies (%)</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="2">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Validation</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Apparent</p>
         </c>
         <c ca="center">
            <p>30 (77%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Internal</p>
         </c>
         <c ca="center">
            <p>15 (38%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Bootstrapping</p>
         </c>
         <c ca="center">
            <p>2 (5%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Jack-knifing</p>
         </c>
         <c ca="center">
            <p>1 (3%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Random split sample</p>
         </c>
         <c ca="center">
            <p>10 (26%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Cross-validation</p>
         </c>
         <c ca="center">
            <p>2 (5%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Temporal</p>
         </c>
         <c ca="center">
            <p>3 (8%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>External</p>
         </c>
         <c ca="center">
            <p>21 (54%)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Performance metrics<sup>b</sup></p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Discrimination</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>C-statistic</p>
         </c>
         <c ca="center">
            <p>39 (100%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>D-statistic</p>
         </c>
         <c ca="center">
            <p>1 (3%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Calibration<sup>c</sup></p>
         </c>
         <c ca="center">
            <p>10 (26%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Hosmer-Lemeshow statistic</p>
         </c>
         <c ca="center">
            <p>8 (21%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Calibration plot</p>
         </c>
         <c ca="center">
            <p>2 (5%)</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>Classification</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Reclassification (NRI)</p>
         </c>
         <c ca="center">
            <p>2 (5%)</p>
         </c>
      </r>
      <r>
         <c indent="2" ca="left">
            <p>Other (for example, sensitivity, specificity)</p>
         </c>
         <c ca="center">
            <p>31 (79%)</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>NRI, Net Reclassification Index; <sup>b</sup>studies can report more than one performance metric; <sup>c</sup>calibration assessed on the basis of the development cohort in 10 studies and in the validation cohorts in 2 studies.</p>
   </tblfn></tbl>
</sec>
<sec>
<st>
<p>Model performance</p>
</st>
<p>We assessed the type of performance measure used to evaluate the risk prediction models (Table <tblr tid="T5">5</tblr>). All studies reported C-statistics, with 31 studies (79%) reporting C-statistics on the data used to derive the model <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B53">53</abbr>
<abbr bid="B54">54</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B58">58</abbr>
<abbr bid="B59">59</abbr>
<abbr bid="B60">60</abbr>
<abbr bid="B61">61</abbr>
</abbrgrp>, 13 studies (33%) calculating C-statistics on an internal validation data set <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B30">30</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B34">34</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B40">40</abbr>
<abbr bid="B55">55</abbr>
<abbr bid="B56">56</abbr>
</abbrgrp> and 21 studies (54%) reporting C-statistics on external validation data sets <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B35">35</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B53">53</abbr>
<abbr bid="B56">56</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>. Only 10 studies (26%) assessed how well the predicted risks compared to the observed risks (calibration); investigators in 8 studies (21%) chose to use the Hosmer-Lemeshow goodness-of-fit test <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B36">36</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B53">53</abbr>
</abbrgrp> and in 2 studies a calibration plot was presented <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B37">37</abbr>
</abbrgrp>.</p>
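<p>To make the two performance measures discussed above concrete, the following is a minimal Python sketch, not taken from any of the reviewed studies. It computes a C-statistic as the proportion of case and non-case pairs in which the case receives the higher predicted risk (ties counted as one half), and builds a simple calibration table of mean predicted versus observed risk by ordered risk group. The function names and the six example risks are hypothetical and chosen purely for illustration.</p>
<p>
```python
from itertools import product


def c_statistic(predicted_risks, outcomes):
    """Proportion of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_cases = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk, non_case_risk in product(cases, non_cases):
        if case_risk > non_case_risk:
            concordant += 1.0
        elif case_risk == non_case_risk:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def calibration_table(predicted_risks, outcomes, n_groups=10):
    """Rows of (mean predicted risk, observed proportion, group size)."""
    ranked = sorted(zip(predicted_risks, outcomes))
    n = len(ranked)
    rows = []
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        if group:  # skip empty groups when n is small
            mean_predicted = sum(p for p, _ in group) / len(group)
            observed = sum(y for _, y in group) / len(group)
            rows.append((mean_predicted, observed, len(group)))
    return rows


# Hypothetical predicted risks and observed outcomes (1 = diabetes).
example_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
example_outcomes = [0, 0, 1, 0, 1, 1]
print(c_statistic(example_risks, example_outcomes))
print(calibration_table(example_risks, example_outcomes, n_groups=2))
```
</p>
<p>A table of this kind is the numerical counterpart of a calibration plot; plotting observed against mean predicted risk for each group shows where a model over- or under-predicts, which a single goodness-of-fit test does not.</p>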
</sec>
<sec>
<st>
<p>Model presentation</p>
</st>
<p>Twenty-four studies (62%) derived simplified scoring systems from the risk models <abbrgrp>
<abbr bid="B23">23</abbr>
<abbr bid="B24">24</abbr>
<abbr bid="B27">27</abbr>
<abbr bid="B28">28</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B31">31</abbr>
<abbr bid="B33">33</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B45">45</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B58">58</abbr>
<abbr bid="B61">61</abbr>
</abbrgrp>. Twelve studies derived a simple points system by multiplying (or dividing) the regression coefficients by a constant (typically 10) and then rounding the result to the nearest integer <abbrgrp>
<abbr bid="B24">24</abbr>
<abbr bid="B41">41</abbr>
<abbr bid="B42">42</abbr>
<abbr bid="B43">43</abbr>
<abbr bid="B44">44</abbr>
<abbr bid="B46">46</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
<abbr bid="B52">52</abbr>
<abbr bid="B57">57</abbr>
<abbr bid="B58">58</abbr>
</abbrgrp>. Four studies used the method of Sullivan <it>et al. </it>
<abbrgrp>
<abbr bid="B62">62</abbr>
</abbrgrp> to develop a points system <abbrgrp>
<abbr bid="B27">27</abbr>
<abbr bid="B29">29</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B39">39</abbr>
</abbrgrp>.</p>
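<p>The simple points system described above (multiply each regression coefficient by a constant, then round to the nearest integer) can be sketched in a few lines of Python. The coefficient values and predictor names below are hypothetical and are not taken from any of the reviewed models; the multiplier of 10 follows the typical value noted in the text.</p>
<p>
```python
def points_from_coefficients(coefficients, multiplier=10):
    """Scale each regression coefficient and round to an integer score.

    Python's round() rounds exact halves to the nearest even integer.
    """
    return {name: round(beta * multiplier) for name, beta in coefficients.items()}


def risk_score(patient, points):
    """Sum the points for every risk factor the patient has."""
    return sum(points[name] for name, present in patient.items() if present)


# Hypothetical log odds ratios for four binary risk factors.
example_coefficients = {
    "age_45_or_over": 0.52,
    "bmi_30_or_more": 0.91,
    "family_history": 0.68,
    "hypertension": 0.37,
}
example_points = points_from_coefficients(example_coefficients)

# Hypothetical patient: three of the four risk factors present.
example_patient = {
    "age_45_or_over": True,
    "bmi_30_or_more": False,
    "family_history": True,
    "hypertension": True,
}
print(example_points)
print(risk_score(example_patient, example_points))
```
</p>
<p>Rounding discards part of the information carried by the coefficients, so the performance of a points system should be evaluated separately from that of the full model from which it was derived.</p>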
</sec>
</sec>
<sec>
<st>
<p>Discussion</p>
</st>
<sec>
<st>
<p>Main findings</p>
</st>
<p>Our systematic review of 39 published studies highlights inadequate conduct and reporting in all aspects of developing a multivariable prediction model for detecting prevalent or incident type 2 diabetes. Fundamental aspects of describing the data (that is, the number of participants and the number of events), a clear description of the selection of risk predictors and the steps taken to build the multivariable model were all shown to be poor.</p>
<p>One of the problems researchers face when developing a multivariable prediction model is overfitting. This occurs when the number of events in the cohort is disproportionately small in relation to the number of candidate risk predictors. A rule of thumb is that models should be developed with 10 to 20 events per variable (EPV) <abbrgrp>
<abbr bid="B63">63</abbr>
<abbr bid="B64">64</abbr>
</abbrgrp>. Of the studies included in this review, 21% had fewer than 10 EPV, whilst there was insufficient detail reported for an EPV to be calculated in 33% of the risk prediction models. The consequences of overfitting are that models subsequently often fail to perform satisfactorily when applied to data sets not used to derive the model <abbrgrp>
<abbr bid="B65">65</abbr>
</abbrgrp>. Investigators in other studies have reported similar findings (EPV &lt; 10) when appraising the development of multivariable prediction models <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B21">21</abbr>
<abbr bid="B66">66</abbr>
</abbrgrp>.</p>
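<p>As a worked illustration of the rule of thumb above, the short Python sketch below computes the EPV as the number of events divided by the number of candidate risk predictors and checks it against a minimum of 10. The function names and the example counts are hypothetical.</p>
<p>
```python
def events_per_variable(n_events, n_candidate_predictors):
    """Events per variable: outcome events divided by candidate predictors."""
    if n_candidate_predictors == 0:
        raise ValueError("at least one candidate predictor is required")
    return n_events / n_candidate_predictors


def meets_rule_of_thumb(n_events, n_candidate_predictors, minimum_epv=10):
    """True when the EPV reaches the chosen minimum (10 by default)."""
    return events_per_variable(n_events, n_candidate_predictors) >= minimum_epv


# Hypothetical cohort: 120 incident cases and 15 candidate predictors.
print(events_per_variable(120, 15))
print(meets_rule_of_thumb(120, 15))
```
</p>
<p>The denominator is the number of candidate predictors considered, not the number retained in the final model, which is why the EPV cannot be calculated when a study does not report how many candidates were examined.</p>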
<p>Another key component affecting the performance of the final model is how continuous variables are treated, whether they are kept as continuous measurements or whether they have been categorised into two or more categories <abbrgrp>
<abbr bid="B67">67</abbr>
</abbrgrp>. Common approaches include dichotomising at the median value or choosing an optimal cutoff point based on minimising a <it>P </it>value. Regardless of the approach used, the practice of artificially treating a continuous risk predictor as categorical should be avoided <abbrgrp>
<abbr bid="B67">67</abbr>
</abbrgrp>, yet this is frequently done in the development of risk prediction models <abbrgrp>
<abbr bid="B4">4</abbr>
<abbr bid="B5">5</abbr>
<abbr bid="B68">68</abbr>
<abbr bid="B69">69</abbr>
<abbr bid="B70">70</abbr>
<abbr bid="B71">71</abbr>
<abbr bid="B72">72</abbr>
<abbr bid="B73">73</abbr>
<abbr bid="B74">74</abbr>
</abbrgrp>. In our review, 63% of studies categorised all or some of the continuous risk predictors, and similar figures have been reported in other reviews <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>. Dichotomising continuous variables causes a detrimental loss of information and loss of power to detect real relationships, equivalent to losing one-third of the data or even more if the data are exponentially distributed <abbrgrp>
<abbr bid="B75">75</abbr>
</abbrgrp>. Continuous risk predictors (for example, age) should be retained in the model as continuous variables, and if the risk predictor has a nonlinear relationship with the outcome, then the use of splines or fractional polynomial functions is recommended <abbrgrp>
<abbr bid="B76">76</abbr>
</abbrgrp>.</p>
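<p>The information lost by dichotomising can be illustrated with a small simulation. The Python sketch below is an illustration under stated assumptions rather than an analysis from this review: it assumes a normally distributed predictor, a continuous outcome that depends linearly on it, and a split at the sample median. For a normal predictor split at the median, the squared correlation with the outcome is expected to fall to roughly 2/&#960; (about 0.64) of its original value, in line with the loss of about one-third of the data noted above. The sample size, effect size and seed are arbitrary.</p>
<p>
```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=5000, seed=1):
    """Correlation with the outcome before and after a median split."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed data-generating model: linear effect plus normal noise.
    y = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
    median = sorted(x)[n // 2]
    x_binary = [1.0 if xi > median else 0.0 for xi in x]
    return pearson(x, y), pearson(x_binary, y)


r_continuous, r_dichotomised = simulate()
print(r_continuous, r_dichotomised)
# Ratio of squared correlations: share of information retained.
print((r_dichotomised / r_continuous) ** 2)
```
</p>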
<p>Missing data are common in most clinical data sets and can be a serious problem in studies deriving a risk prediction model. Regardless of study design, collecting all data on all risk predictors for all individuals is a difficult task that is rarely achieved. For studies that derive models on the basis of retrospective cohorts, there is no scope for retrieving any missing data, and investigators are thus confronted with deciding how to deal with incomplete data. A common approach is to exclude individuals with missing values on any of the variables and conduct a complete case analysis. However, a complete case analysis is not recommended: in addition to discarding useful information, it has been shown that it can yield biased results <abbrgrp>
<abbr bid="B77">77</abbr>
</abbrgrp>. Forty percent of the studies in our review failed to report any information regarding missing data. Multiple imputation offers investigators a valid approach to minimise the effect of missing data, yet this is seldom done in developing risk prediction models <abbrgrp>
<abbr bid="B78">78</abbr>
</abbrgrp>, though guidance and illustrative examples are slowly appearing <abbrgrp>
<abbr bid="B18">18</abbr>
<abbr bid="B79">79</abbr>
<abbr bid="B80">80</abbr>
</abbrgrp>. The completeness of overall data (how many individuals have complete data on all variables) and by variable should always be reported so that readers can judge the representativeness and quality of the data.</p>
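<p>Reporting completeness overall and by variable requires very little computation. The following Python sketch assumes each participant is stored as a dictionary in which a missing value is recorded as <it>None</it>; the function name, variable names and example records are hypothetical.</p>
<p>
```python
def completeness_report(records, variables):
    """Count complete cases and the fraction observed for each variable."""
    n = len(records)
    if n == 0:
        raise ValueError("no records supplied")
    complete_cases = sum(
        all(r.get(v) is not None for v in variables) for r in records
    )
    by_variable = {
        v: sum(r.get(v) is not None for r in records) / n for v in variables
    }
    return {
        "n": n,
        "complete_cases": complete_cases,
        "complete_case_fraction": complete_cases / n,
        "by_variable": by_variable,
    }


# Hypothetical cohort of four participants with two missing values.
example_records = [
    {"age": 54, "bmi": 27.1, "fasting_glucose": 5.4},
    {"age": 61, "bmi": None, "fasting_glucose": 6.0},
    {"age": 47, "bmi": 31.5, "fasting_glucose": None},
    {"age": 58, "bmi": 29.0, "fasting_glucose": 5.9},
]
print(completeness_report(example_records, ["age", "bmi", "fasting_glucose"]))
```
</p>
<p>In this example each variable is at least 75% complete, yet only half of the participants would remain in a complete case analysis, which shows why both figures are worth reporting.</p>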
<p>During model development, predictors shown to have little influence on the predicted outcome might be removed from the final model. However, this is not a simple matter of selecting predictors solely on the basis of statistical significance, as it can be important to retain in the model risk predictors that are known from the literature to be important but that may not reach statistical significance in a particular data set. Unfortunately, the process of developing a risk prediction model for use in clinical practice is often confused with using multivariable modelling to identify risk predictors with statistical significance in epidemiological studies. This misunderstanding of the modelling aims can lead to the use of inappropriate methods such as prescreening candidate variables for a risk prediction model based on bivariable tests of association with the outcome (that is, a statistical test to examine the association of an individual predictor with the outcome). This has been shown to be inappropriate, as it can wrongly reject important risk predictors that become prognostic only after adjustment for other risk predictors, thus leading to unreliable models <abbrgrp>
<abbr bid="B18">18</abbr>
<abbr bid="B81">81</abbr>
</abbrgrp>. More importantly, it is crucial to clearly report any procedure used to reduce the number of candidate risk predictors. Nearly half of the studies in our review reduced the initial number of candidate risk predictors prior to the multivariable modelling, yet over half of these failed to provide sufficient detail on how this was carried out.</p>
<p>The most commonly used strategy to build a multivariable model is to use an automated selection approach (forward selection, backward elimination or stepwise) to derive the final risk prediction model (56% in our review). Automated selection methods are data-driven approaches based on statistical significance without reference to clinical relevance, and it has been shown that these methods frequently produce unstable models, biased estimates of regression coefficients and poor predictions <abbrgrp>
<abbr bid="B82">82</abbr>
<abbr bid="B83">83</abbr>
<abbr bid="B84">84</abbr>
</abbrgrp>.</p>
<p>Arguably, regardless of how the multivariable model is developed, all that ultimately matters is to demonstrate that the model works. Thus, after a risk prediction model has been derived, it is essential that the performance of the model be evaluated. Broadly speaking, there are three types of performance data one can present, in order of increasing levels of evidence: (1) apparent validation on the same data used to derive the model; (2) internal validation using a split sample (if the cohort is large enough), cross-validation or, preferably, resampling (for example, bootstrapping); and (3) external validation using a completely different cohort of individuals from different centres or locations than those used to derive the model <abbrgrp>
<abbr bid="B85">85</abbr>
<abbr bid="B86">86</abbr>
</abbrgrp>. Investigators in over half of the studies in our review (54%) conducted an external validation on cohorts that were much larger than those reported in other reviews <abbrgrp>
<abbr bid="B72">72</abbr>
<abbr bid="B87">87</abbr>
</abbrgrp>.</p>
<p>Reporting performance data solely from an apparent validation analysis is to a large extent uninformative unless the optimism inherent in evaluating performance on the same data used to derive the model is quantified and accounted for (using internal validation techniques such as resampling). Unless the cohort is particularly large (&gt; 20,000), using a split sample to derive and evaluate a model also has limited value, especially if the cohort is randomly split, since the two resulting cohorts are selected to be similar and thus produce overly optimistic performance data. In models in which a split sample has been used, a better approach is a nonrandom split (for example, by centre or a temporal split) <abbrgrp>
<abbr bid="B85">85</abbr>
<abbr bid="B86">86</abbr>
</abbrgrp>.</p>
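<p>A temporal split of the kind suggested above can be sketched as follows. This Python example assumes each participant record carries an enrolment date; the field name, cutoff date and records are hypothetical. Participants enrolled before the cutoff form the development cohort and those enrolled on or after it form the validation cohort.</p>
<p>
```python
from datetime import date


def temporal_split(records, cutoff, date_key="enrolled"):
    """Split records into development (before cutoff) and validation sets."""
    development = [r for r in records if cutoff > r[date_key]]
    validation = [r for r in records if r[date_key] >= cutoff]
    return development, validation


# Hypothetical participants with enrolment dates.
example_cohort = [
    {"id": 1, "enrolled": date(2001, 3, 14)},
    {"id": 2, "enrolled": date(2003, 11, 2)},
    {"id": 3, "enrolled": date(2005, 1, 1)},
    {"id": 4, "enrolled": date(2006, 7, 30)},
]
development, validation = temporal_split(example_cohort, date(2005, 1, 1))
print([r["id"] for r in development])
print([r["id"] for r in validation])
```
</p>
<p>Unlike a random split, the two cohorts produced this way are not constructed to resemble each other, so the validation results give a somewhat more realistic indication of how the model performs in later participants.</p>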
</sec>
<sec>
<st>
<p>What is already known on the topic</p>
</st>
<p>The findings of this review are consistent with those of other published reviews of prediction models in cancer <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B70">70</abbr>
<abbr bid="B71">71</abbr>
</abbrgrp>, stroke <abbrgrp>
<abbr bid="B4">4</abbr>
<abbr bid="B73">73</abbr>
<abbr bid="B88">88</abbr>
</abbrgrp>, traumatic brain injury <abbrgrp>
<abbr bid="B68">68</abbr>
<abbr bid="B72">72</abbr>
</abbrgrp>, liver transplantation <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> and dentistry <abbrgrp>
<abbr bid="B89">89</abbr>
</abbrgrp>. We observed poor reporting in all aspects of developing the risk prediction models in terms of describing the data and providing sufficient detail in all steps taken in building the model.</p>
</sec>
<sec>
<st>
<p>Limitations</p>
</st>
<p>Our systematic review was limited to English-language articles and did not consider grey literature; therefore, we may have missed some studies. However, we strongly suspect that including such articles in our review would not have altered any of the findings.</p>
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>This systematic review of 39 published studies highlights numerous methodological deficiencies and a generally poor level of reporting in studies in which risk prediction models were developed for the detection of prevalent or incident type 2 diabetes. Reporting guidelines are available for therapeutic <abbrgrp>
<abbr bid="B90">90</abbr>
</abbrgrp>, diagnostic <abbrgrp>
<abbr bid="B91">91</abbr>
</abbrgrp> and other study designs <abbrgrp>
<abbr bid="B14">14</abbr>
<abbr bid="B92">92</abbr>
<abbr bid="B93">93</abbr>
</abbrgrp>, and these have been shown to increase the reporting of key study information <abbrgrp>
<abbr bid="B94">94</abbr>
<abbr bid="B95">95</abbr>
</abbrgrp>. Such an initiative is long overdue for the reporting of risk prediction models. We note that in the field of veterinary oncology, recommended guidelines for the conduct and evaluation of prognostic studies have been developed to stem the tide of low-quality research. Until reporting guidelines suitable for deriving and evaluating risk prediction models are developed and adopted by journals and peer reviewers, the conduct, methodology and reporting of such models will remain disappointingly poor.</p>
</sec>
<sec>
<st>
<p>Competing interests</p>
</st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>GSC contributed to the study design, carried out the data extraction of all articles and items, compiled the results and drafted the manuscript. SM contributed to the study design, duplicate data extraction and drafting of the article. OO and LMY carried out duplicate data extraction and commented on the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec>
<st>
<p>Authors' information</p>
</st>
<p>All authors are medical statisticians.</p>
</sec>
<sec>
<st>
<p>Appendix 1: Search strings</p>
</st>
<sec>
<st>
<p>PubMed search string</p>
</st>
<p>'diabetes'[ti] AND ('risk prediction model'[tiab] OR 'predictive model'[tiab] OR 'predictive equation'[tiab] OR 'prediction model'[tiab] OR 'risk calculator'[tiab] OR 'prediction rule'[tiab] OR 'risk model'[tiab] OR 'statistical model'[tiab] OR 'cox model'[tiab] OR 'multivariable'[tiab]) NOT (review[Publication Type] OR Bibliography[Publication Type] OR Editorial[Publication Type] OR Letter[Publication Type] OR Meta-analysis[Publication Type] OR News[Publication Type]).</p>
</sec>
<sec>
<st>
<p>EMBASE search string</p>
</st>
<p>risk prediction model.ab. or risk prediction model.ti. or predictive model.ab. or predictive model.ti. or predictive equation.ab. or predictive equation.ti. or prediction model.ab. or prediction model.ti. or risk calculator.ab. or risk calculator.ti. or prediction rule.ab. or prediction rule.ti. or risk model.ab. or risk model.ti. or statistical model.ab. or statistical model.ti. or cox model.ab. or cox model.ti. or multivariable.ab. or multivariable.ti. and diabetes.ti not letter.pt not review.pt not editorial.pt not conference.pt not book.pt.</p>
</sec>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>The authors received no funding for this study. GSC is funded by the Centre for Statistics in Medicine. SM is funded by Cancer Research UK. OO and LMY are funded by the NHS Trust.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Screening for Type 2 Diabetes: Report of a World Health Organization and International Diabetes Federation meeting</p></title><url>http://www.who.int/diabetes/publications/en/screening_mnc03.pdf</url></bibl><bibl id="B2"><title><p>Mortality in people with type 2 diabetes in the UK</p></title><aug><au><snm>Mulnier</snm><fnm>HE</fnm></au><au><snm>Seaman</snm><fnm>HE</fnm></au><au><snm>Raleigh</snm><fnm>VS</fnm></au><au><snm>Soedamah-Muthu</snm><fnm>SS</fnm></au><au><snm>Colhoun</snm><fnm>HM</fnm></au><au><snm>Lawrenson</snm><fnm>RA</fnm></au></aug><source>Diabet Med</source><pubdate>2006</pubdate><volume>23</volume><fpage>516</fpage><lpage>521</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1464-5491.2006.01838.x</pubid><pubid idtype="pmpid" link="fulltext">16681560</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Prognostic models: a methodological framework and review of models for breast cancer</p></title><aug><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>Cancer Invest</source><pubdate>2009</pubdate><volume>27</volume><fpage>235</fpage><lpage>243</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1080/07357900802572110</pubid><pubid idtype="pmpid" link="fulltext">19291527</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Systematic review of prognostic models in patients with stroke</p></title><aug><au><snm>Counsell</snm><fnm>C</fnm></au><au><snm>Dennis</snm><fnm>M</fnm></au></aug><source>Cerebrovasc Dis</source><pubdate>2001</pubdate><volume>12</volume><fpage>159</fpage><lpage>170</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1159/000047699</pubid><pubid idtype="pmpid" link="fulltext">11641579</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Systematic review and validation of prognostic models in liver 
transplantation</p></title><aug><au><snm>Jacob</snm><fnm>M</fnm></au><au><snm>Lewsey</snm><fnm>JD</fnm></au><au><snm>Sharpin</snm><fnm>C</fnm></au><au><snm>Gimson</snm><fnm>A</fnm></au><au><snm>Rela</snm><fnm>M</fnm></au><au><snm>van der Meulen</snm><fnm>JHP</fnm></au></aug><source>Liver Transpl</source><pubdate>2005</pubdate><volume>11</volume><fpage>814</fpage><lpage>825</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/lt.20456</pubid><pubid idtype="pmpid" link="fulltext">15973726</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Logistic regression in the medical literature: standards for use and reporting, with particular attention to one medical domain</p></title><aug><au><snm>Bagley</snm><fnm>SC</fnm></au><au><snm>White</snm><fnm>H</fnm></au><au><snm>Golomb</snm><fnm>BA</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2001</pubdate><volume>54</volume><fpage>979</fpage><lpage>985</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0895-4356(01)00372-9</pubid><pubid idtype="pmpid" link="fulltext">11576808</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Recommendations for the assessment and reporting of multivariable logistic regression in transplantation literature</p></title><aug><au><snm>Kalil</snm><fnm>AC</fnm></au><au><snm>Mattei</snm><fnm>J</fnm></au><au><snm>Florescu</snm><fnm>DF</fnm></au><au><snm>Sun</snm><fnm>J</fnm></au><au><snm>Kalil</snm><fnm>RS</fnm></au></aug><source>Am J Transplant</source><pubdate>2010</pubdate><volume>10</volume><fpage>1686</fpage><lpage>1694</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1600-6143.2010.03141.x</pubid><pubid idtype="pmcid">2909008</pubid><pubid idtype="pmpid" link="fulltext">20642690</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Multivariable analysis: a primer for readers of medical research</p></title><aug><au><snm>Khan</snm><fnm>KS</fnm></au><au><snm>Chien</snm><fnm>PF</fnm></au><au><snm>Dwarakanath</snm><fnm>LS</fnm></au></aug><source>Obstet 
Gynecol</source><pubdate>1999</pubdate><volume>93</volume><fpage>1014</fpage><lpage>1020</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0029-7844(98)00537-7</pubid><pubid idtype="pmpid">10362173</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Evaluation of logistic regression reporting in current obstetrics and gynecology literature</p></title><aug><au><snm>Mikolajczyk</snm><fnm>RT</fnm></au><au><snm>DiSilvestro</snm><fnm>A</fnm></au><au><snm>Zhang</snm><fnm>J</fnm></au></aug><source>Obstet Gynecol</source><pubdate>2008</pubdate><volume>111</volume><fpage>413</fpage><lpage>419</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/AOG.0b013e318160f38e</pubid><pubid idtype="pmpid">18238980</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions</p></title><aug><au><snm>Ottenbacher</snm><fnm>KJ</fnm></au><au><snm>Ottenbacher</snm><fnm>HR</fnm></au><au><snm>Tooth</snm><fnm>L</fnm></au><au><snm>Ostir</snm><fnm>GV</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2004</pubdate><volume>57</volume><fpage>1147</fpage><lpage>1152</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2003.05.003</pubid><pubid idtype="pmpid" link="fulltext">15567630</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>The risk of determining risk with multivariable models</p></title><aug><au><snm>Concato</snm><fnm>J</fnm></au><au><snm>Feinsten</snm><fnm>AR</fnm></au><au><snm>Holford</snm><fnm>TR</fnm></au></aug><source>Ann Intern Med</source><pubdate>1993</pubdate><volume>118</volume><fpage>201</fpage><lpage>210</lpage><xrefbib><pubid idtype="pmpid">8417638</pubid></xrefbib></bibl><bibl id="B12"><title><p>Clinical prediction rules: applications and methodological 
standards</p></title><aug><au><snm>Wasson</snm><fnm>JH</fnm></au><au><snm>Sox</snm><fnm>HC</fnm></au><au><snm>Neff</snm><fnm>RK</fnm></au><au><snm>Goldman</snm><fnm>L</fnm></au></aug><source>N Engl J Med</source><pubdate>1985</pubdate><volume>313</volume><fpage>793</fpage><lpage>799</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1056/NEJM198509263131306</pubid><pubid idtype="pmpid" link="fulltext">3897864</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials</p></title><aug><au><snm>Schulz</snm><fnm>KF</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Moher</snm><fnm>D</fnm></au><au><cnm>CONSORT Group</cnm></au></aug><source>BMJ</source><pubdate>2010</pubdate><volume>340</volume><fpage>c332</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.c332</pubid><pubid idtype="pmcid">2844940</pubid><pubid idtype="pmpid" link="fulltext">20332509</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies</p></title><aug><au><snm>von Elm</snm><fnm>E</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Egger</snm><fnm>M</fnm></au><au><snm>Pocock</snm><fnm>SJ</fnm></au><au><snm>G&#248;tzsche</snm><fnm>PC</fnm></au><au><snm>Vandenbrouke</snm><fnm>JP</fnm></au><au><cnm>STROBE Initiative</cnm></au></aug><source>BMJ</source><pubdate>2007</pubdate><volume>335</volume><fpage>806</fpage><lpage>808</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.39335.541782.AD</pubid><pubid idtype="pmcid">2034723</pubid><pubid idtype="pmpid" link="fulltext">17947786</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD 
initiative</p></title><aug><au><snm>Bossuyt</snm><fnm>PM</fnm></au><au><snm>Reitsma</snm><fnm>JB</fnm></au><au><snm>Bruns</snm><fnm>DE</fnm></au><au><snm>Gatsonis</snm><fnm>CA</fnm></au><au><snm>Glasziou</snm><fnm>PP</fnm></au><au><snm>Irwig</snm><fnm>LM</fnm></au><au><snm>Lijmer</snm><fnm>JG</fnm></au><au><snm>Moher</snm><fnm>D</fnm></au><au><snm>Rennie</snm><fnm>D</fnm></au><au><snm>de Vet</snm><fnm>HC</fnm></au><au><cnm>Standards for Reporting of Diagnostic Accuracy</cnm></au></aug><source>BMJ</source><pubdate>2003</pubdate><volume>326</volume><fpage>41</fpage><lpage>44</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.326.7379.41</pubid><pubid idtype="pmcid">1124931</pubid><pubid idtype="pmpid" link="fulltext">12511463</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement</p></title><aug><au><snm>Moher</snm><fnm>D</fnm></au><au><snm>Liberati</snm><fnm>A</fnm></au><au><snm>Tetzlaff</snm><fnm>J</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><cnm>PRISMA Group</cnm></au></aug><source>BMJ</source><pubdate>2009</pubdate><volume>339</volume><fpage>b2535</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.b2535</pubid><pubid idtype="pmcid">2714657</pubid><pubid idtype="pmpid" link="fulltext">19622551</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>REporting recommendations for tumour MARKer prognostic studies (REMARK)</p></title><aug><au><snm>McShane</snm><fnm>LM</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Sauerbrei</snm><fnm>W</fnm></au><au><snm>Taube</snm><fnm>SE</fnm></au><au><snm>Gion</snm><fnm>M</fnm></au><au><snm>Clark</snm><fnm>GM</fnm></au><au><cnm>Statistics Subcommittee of the NCI-EORTC Working Group on Cancer Diagnostics</cnm></au></aug><source>Br J Cancer</source><pubdate>2005</pubdate><volume>93</volume><fpage>387</fpage><lpage>391</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.bjc.6602678</pubid><pubid 
idtype="pmcid">2361579</pubid><pubid idtype="pmpid" link="fulltext">16106245</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors</p></title><aug><au><snm>Harrell</snm><fnm>FE</fnm><suf>Jr</suf></au><au><snm>Lee</snm><fnm>KL</fnm></au><au><snm>Mark</snm><fnm>DB</fnm></au></aug><source>Stat Med</source><pubdate>1996</pubdate><volume>15</volume><fpage>361</fpage><lpage>387</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/(SICI)1097-0258(19960229)15:4&lt;361::AID-SIM168&gt;3.0.CO;2-4</pubid><pubid idtype="pmpid">8668867</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Methodological aspects of prognostic factor studies: some caveats</p></title><aug><au><snm>Metze</snm><fnm>K</fnm></au></aug><source>Sao Paulo Med J</source><pubdate>1998</pubdate><volume>116</volume><fpage>1787</fpage><lpage>1788</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1590/S1516-31801998000400011</pubid><pubid idtype="pmpid">9951753</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Barriers to routine risk-score use for healthy primary care patients</p></title><aug><au><snm>M&#252;ller-Riemenschneider</snm><fnm>F</fnm></au><au><snm>Holmberg</snm><fnm>C</fnm></au><au><snm>Rieckmann</snm><fnm>N</fnm></au><au><snm>Kliems</snm><fnm>H</fnm></au><au><snm>Rufer</snm><fnm>V</fnm></au><au><snm>M&#252;ller-Nordhorn</snm><fnm>J</fnm></au><au><snm>Willich</snm><fnm>SN</fnm></au></aug><source>Arch Intern Med</source><pubdate>2010</pubdate><volume>170</volume><fpage>719</fpage><lpage>724</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1001/archinternmed.2010.66</pubid><pubid idtype="pmpid" link="fulltext">20421559</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>Reporting methods in studies developing prognostic models in cancer: a 
review</p></title><aug><au><snm>Mallett</snm><fnm>S</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Dutton</snm><fnm>S</fnm></au><au><snm>Waters</snm><fnm>R</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>BMC Med</source><pubdate>2010</pubdate><volume>8</volume><fpage>20</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1741-7015-8-20</pubid><pubid idtype="pmcid">2856521</pubid><pubid idtype="pmpid" link="fulltext">20353578</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Reporting performance of prognostic models in cancer: a review</p></title><aug><au><snm>Mallett</snm><fnm>S</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Waters</snm><fnm>R</fnm></au><au><snm>Dutton</snm><fnm>S</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>BMC Med</source><pubdate>2010</pubdate><volume>8</volume><fpage>21</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1741-7015-8-21</pubid><pubid idtype="pmcid">2857810</pubid><pubid idtype="pmpid" link="fulltext">20353579</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Predicting diabetes: clinical, biological, and genetic approaches: data from the Epidemiological Study on the Insulin Resistance Syndrome (DESIR)</p></title><aug><au><snm>Balkau</snm><fnm>B</fnm></au><au><snm>Lange</snm><fnm>C</fnm></au><au><snm>Fezeu</snm><fnm>L</fnm></au><au><snm>Tichet</snm><fnm>J</fnm></au><au><snm>de Lauzon-Guillain</snm><fnm>B</fnm></au><au><snm>Czernichow</snm><fnm>S</fnm></au><au><snm>Fumeron</snm><fnm>F</fnm></au><au><snm>Froguel</snm><fnm>P</fnm></au><au><snm>Vaxillaire</snm><fnm>M</fnm></au><au><snm>Cauchi</snm><fnm>S</fnm></au><au><snm>Ducimeti&#232;re</snm><fnm>P</fnm></au><au><snm>Eschw&#232;ge</snm><fnm>E</fnm></au></aug><source>Diabetes Care</source><pubdate>2008</pubdate><volume>31</volume><fpage>2056</fpage><lpage>2061</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc08-0368</pubid><pubid idtype="pmcid">2551654</pubid><pubid idtype="pmpid" 
link="fulltext">18689695</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>A simple clinical score for type 2 diabetes mellitus screening in the Canary Islands</p></title><aug><au><snm>Cabrera de Le&#243;n</snm><fnm>A</fnm></au><au><snm>Coello</snm><fnm>SD</fnm></au><au><snm>del Cristo Rodr&#237;guez P&#233;rez</snm><fnm>M</fnm></au><au><snm>Medina</snm><fnm>MB</fnm></au><au><snm>Almeida Gonz&#225;lez</snm><fnm>D</fnm></au><au><snm>Diaz</snm><fnm>BB</fnm></au><au><snm>de Fuentes</snm><fnm>MM</fnm></au><au><snm>Aguirre-Jaime</snm><fnm>A</fnm></au></aug><source>Diabetes Res Clin Pract</source><pubdate>2008</pubdate><volume>80</volume><fpage>128</fpage><lpage>133</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.diabres.2007.10.022</pubid><pubid idtype="pmpid" link="fulltext">18082285</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Predicting risk of type 2 diabetes in England and Wales: prospective derivation and validation of QDScore</p></title><aug><au><snm>Hippisley-Cox</snm><fnm>J</fnm></au><au><snm>Coupland</snm><fnm>C</fnm></au><au><snm>Robson</snm><fnm>J</fnm></au><au><snm>Sheikh</snm><fnm>A</fnm></au><au><snm>Brindle</snm><fnm>P</fnm></au></aug><source>BMJ</source><pubdate>2009</pubdate><volume>338</volume><fpage>b880</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.b880</pubid><pubid idtype="pmcid">2659857</pubid><pubid idtype="pmpid" link="fulltext">19297312</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>A quick self-assessment tool to identify individuals at high risk of type 2 diabetes in the Chinese general population</p></title><aug><au><snm>Xie</snm><fnm>J</fnm></au><au><snm>Hu</snm><fnm>D</fnm></au><au><snm>Yu</snm><fnm>D</fnm></au><au><snm>Chen</snm><fnm>CS</fnm></au><au><snm>He</snm><fnm>J</fnm></au><au><snm>Gu</snm><fnm>D</fnm></au></aug><source>J Epidemiol Community Health</source><pubdate>2010</pubdate><volume>64</volume><fpage>236</fpage><lpage>242</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1136/jech.2009.087544</pubid><pubid idtype="pmpid" link="fulltext">19710044</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>A risk score for predicting incident diabetes in the Thai population</p></title><aug><au><snm>Aekplakorn</snm><fnm>W</fnm></au><au><snm>Bunnag</snm><fnm>P</fnm></au><au><snm>Woodward</snm><fnm>M</fnm></au><au><snm>Sritara</snm><fnm>P</fnm></au><au><snm>Cheepudomwit</snm><fnm>S</fnm></au><au><snm>Yamwong</snm><fnm>S</fnm></au><au><snm>Yipintsoi</snm><fnm>T</fnm></au><au><snm>Rajatanavin</snm><fnm>R</fnm></au></aug><source>Diabetes Care</source><pubdate>2006</pubdate><volume>29</volume><fpage>1872</fpage><lpage>1877</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc05-2141</pubid><pubid idtype="pmpid" link="fulltext">16873795</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>AUSDRISK: an Australian Type 2 Diabetes Risk Assessment Tool based on demographic, lifestyle and simple anthropometric measures</p></title><aug><au><snm>Chen</snm><fnm>L</fnm></au><au><snm>Magliano</snm><fnm>DJ</fnm></au><au><snm>Balkau</snm><fnm>B</fnm></au><au><snm>Colagiuri</snm><fnm>S</fnm></au><au><snm>Zimmet</snm><fnm>PZ</fnm></au><au><snm>Tonkin</snm><fnm>AM</fnm></au><au><snm>Mitchell</snm><fnm>P</fnm></au><au><snm>Phillips</snm><fnm>PJ</fnm></au><au><snm>Shaw</snm><fnm>JE</fnm></au></aug><source>Med J Aust</source><pubdate>2010</pubdate><volume>192</volume><fpage>197</fpage><lpage>202</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">20170456</pubid></xrefbib></bibl><bibl id="B29"><title><p>A prediction model for type 2 diabetes risk among Chinese 
people</p></title><aug><au><snm>Chien</snm><fnm>K</fnm></au><au><snm>Cai</snm><fnm>T</fnm></au><au><snm>Hsu</snm><fnm>H</fnm></au><au><snm>Su</snm><fnm>T</fnm></au><au><snm>Chang</snm><fnm>W</fnm></au><au><snm>Chen</snm><fnm>M</fnm></au><au><snm>Lee</snm><fnm>Y</fnm></au><au><snm>Hu</snm><fnm>FB</fnm></au></aug><source>Diabetologia</source><pubdate>2009</pubdate><volume>52</volume><fpage>443</fpage><lpage>450</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s00125-008-1232-4</pubid><pubid idtype="pmpid" link="fulltext">19057891</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Risk prediction models for the development of diabetes in Mauritian Indians</p></title><aug><au><snm>Gao</snm><fnm>WG</fnm></au><au><snm>Qiao</snm><fnm>Q</fnm></au><au><snm>Pitk&#228;niemi</snm><fnm>J</fnm></au><au><snm>Wild</snm><fnm>S</fnm></au><au><snm>Magliano</snm><fnm>D</fnm></au><au><snm>Shaw</snm><fnm>J</fnm></au><au><snm>S&#246;derberg</snm><fnm>S</fnm></au><au><snm>Zimmet</snm><fnm>P</fnm></au><au><snm>Chitson</snm><fnm>P</fnm></au><au><snm>Knowlessur</snm><fnm>S</fnm></au><au><snm>Alberti</snm><fnm>G</fnm></au><au><snm>Tuomilehto</snm><fnm>J</fnm></au></aug><source>Diabet Med</source><pubdate>2009</pubdate><volume>26</volume><fpage>996</fpage><lpage>1002</lpage></bibl><bibl id="B31"><title><p>Two risk-scoring systems for predicting incident diabetes mellitus in U.S. 
adults age 45 to 64 years</p></title><aug><au><snm>Kahn</snm><fnm>HS</fnm></au><au><snm>Cheng</snm><fnm>YJ</fnm></au><au><snm>Thompson</snm><fnm>TJ</fnm></au><au><snm>Imperatore</snm><fnm>G</fnm></au><au><snm>Gregg</snm><fnm>EW</fnm></au></aug><source>Ann Intern Med</source><pubdate>2009</pubdate><volume>150</volume><fpage>741</fpage><lpage>751</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">19487709</pubid></xrefbib></bibl><bibl id="B32"><title><p>Development of a type 2 diabetes risk model from a panel of serum biomarkers from the Inter99 cohort</p></title><aug><au><snm>Kolberg</snm><fnm>JA</fnm></au><au><snm>J&#248;rgensen</snm><fnm>T</fnm></au><au><snm>Gerwien</snm><fnm>RW</fnm></au><au><snm>Hamren</snm><fnm>S</fnm></au><au><snm>McKenna</snm><fnm>MP</fnm></au><au><snm>Moler</snm><fnm>E</fnm></au><au><snm>Rowe</snm><fnm>MW</fnm></au><au><snm>Urdea</snm><fnm>MS</fnm></au><au><snm>Xu</snm><fnm>XM</fnm></au><au><snm>Hansen</snm><fnm>T</fnm></au><au><snm>Pedersen</snm><fnm>O</fnm></au><au><snm>Borch-Johnsen</snm><fnm>K</fnm></au></aug><source>Diabetes Care</source><pubdate>2009</pubdate><volume>32</volume><fpage>1207</fpage><lpage>1212</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc08-1935</pubid><pubid idtype="pmcid">2699726</pubid><pubid idtype="pmpid" link="fulltext">19564473</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>The diabetes risk score: a practical tool to predict type 2 diabetes risk</p></title><aug><au><snm>Lindstr&#246;m</snm><fnm>J</fnm></au><au><snm>Tuomilehto</snm><fnm>J</fnm></au></aug><source>Diabetes Care</source><pubdate>2003</pubdate><volume>26</volume><fpage>725</fpage><lpage>731</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.26.3.725</pubid><pubid idtype="pmpid" link="fulltext">12610029</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Identifying individuals at high risk for diabetes: The Atherosclerosis Risk in Communities 
study</p></title><aug><au><snm>Schmidt</snm><fnm>MI</fnm></au><au><snm>Duncan</snm><fnm>BB</fnm></au><au><snm>Bang</snm><fnm>H</fnm></au><au><snm>Pankow</snm><fnm>JS</fnm></au><au><snm>Ballantyne</snm><fnm>CM</fnm></au><au><snm>Golden</snm><fnm>SH</fnm></au><au><snm>Folsom</snm><fnm>AR</fnm></au><au><snm>Chambless</snm><fnm>LE</fnm></au><au><cnm>Atherosclerosis Risk in Communities Investigators</cnm></au></aug><source>Diabetes Care</source><pubdate>2005</pubdate><volume>28</volume><fpage>2013</fpage><lpage>2018</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.28.8.2013</pubid><pubid idtype="pmpid" link="fulltext">16043747</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>An accurate risk score based on anthropometric, dietary, and lifestyle factors to predict the development of type 2 diabetes</p></title><aug><au><snm>Schulze</snm><fnm>MB</fnm></au><au><snm>Hoffmann</snm><fnm>K</fnm></au><au><snm>Boeing</snm><fnm>H</fnm></au><au><snm>Linseisen</snm><fnm>J</fnm></au><au><snm>Rohrmann</snm><fnm>S</fnm></au><au><snm>M&#246;hlig</snm><fnm>M</fnm></au><au><snm>Pfeiffer</snm><fnm>AF</fnm></au><au><snm>Spranger</snm><fnm>J</fnm></au><au><snm>Thamer</snm><fnm>C</fnm></au><au><snm>H&#228;ring</snm><fnm>HU</fnm></au><au><snm>Fritsche</snm><fnm>A</fnm></au><au><snm>Joost</snm><fnm>HG</fnm></au></aug><source>Diabetes Care</source><pubdate>2007</pubdate><volume>30</volume><fpage>510</fpage><lpage>515</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc06-2089</pubid><pubid idtype="pmpid" link="fulltext">17327313</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Identification of persons at high risk for type 2 diabetes mellitus: Do we need the oral glucose tolerance test?</p></title><aug><au><snm>Stern</snm><fnm>MP</fnm></au><au><snm>Williams</snm><fnm>K</fnm></au><au><snm>Haffner</snm><fnm>SM</fnm></au></aug><source>Ann Intern Med</source><pubdate>2002</pubdate><volume>136</volume><fpage>575</fpage><lpage>581</lpage><xrefbib><pubid 
idtype="pmpid" link="fulltext">11955025</pubid></xrefbib></bibl><bibl id="B37"><title><p>An accurate risk score for estimation 5-year risk of type 2 diabetes based on a health screening population in Taiwan</p></title><aug><au><snm>Sun</snm><fnm>F</fnm></au><au><snm>Tao</snm><fnm>Q</fnm></au><au><snm>Zhan</snm><fnm>S</fnm></au></aug><source>Diabetes Res Clin Pract</source><pubdate>2009</pubdate><volume>85</volume><fpage>228</fpage><lpage>234</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.diabres.2009.05.005</pubid><pubid idtype="pmpid" link="fulltext">19500871</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Development and validation of a risk-score model for subjects with impaired glucose tolerance for the assessment of the risk of type 2 diabetes mellitus: the STOP-NIDDM risk-score</p></title><aug><au><snm>Tuomilehto</snm><fnm>J</fnm></au><au><snm>Lindstr&#246;m</snm><fnm>J</fnm></au><au><snm>Hellmich</snm><fnm>M</fnm></au><au><snm>Lehmacher</snm><fnm>W</fnm></au><au><snm>Westermeier</snm><fnm>T</fnm></au><au><snm>Evers</snm><fnm>T</fnm></au><au><snm>Br&#252;ckner</snm><fnm>A</fnm></au><au><snm>Peltonen</snm><fnm>M</fnm></au><au><snm>Qiao</snm><fnm>Q</fnm></au><au><snm>Chiasson</snm><fnm>JL</fnm></au></aug><source>Diabetes Res Clin Pract</source><pubdate>2010</pubdate><volume>87</volume><fpage>267</fpage><lpage>274</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.diabres.2009.11.011</pubid><pubid idtype="pmpid" link="fulltext">20022651</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study</p></title><aug><au><snm>Wilson</snm><fnm>PWF</fnm></au><au><snm>Meigs</snm><fnm>JB</fnm></au><au><snm>Sullivan</snm><fnm>L</fnm></au><au><snm>Fox</snm><fnm>CS</fnm></au><au><snm>Nathan</snm><fnm>DM</fnm></au><au><snm>D&apos;Agostino</snm><fnm>RB</fnm><suf>Sr</suf></au></aug><source>Arch Intern 
Med</source><pubdate>2007</pubdate><volume>167</volume><fpage>1068</fpage><lpage>1074</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1001/archinte.167.10.1068</pubid><pubid idtype="pmpid" link="fulltext">17533210</pubid></pubidlist></xrefbib></bibl><bibl id="B40"><title><p>Determinants of new-onset diabetes among 19,257 hypertensive patients randomized in the Anglo-Scandinavian Cardiac Outcomes Trial-Blood Pressure Lowering Arm and the relative influence of antihypertensive medication</p></title><aug><au><snm>Gupta</snm><fnm>AK</fnm></au><au><snm>Dahlof</snm><fnm>B</fnm></au><au><snm>Dobson</snm><fnm>J</fnm></au><au><snm>Sever</snm><fnm>PS</fnm></au><au><snm>Wedel</snm><fnm>H</fnm></au><au><snm>Poulter</snm><fnm>NR</fnm></au><au><cnm>Anglo-Scandinavian Cardiac Outcomes Trial Investigators</cnm></au></aug><source>Diabetes Care</source><pubdate>2008</pubdate><volume>31</volume><fpage>982</fpage><lpage>988</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc07-1768</pubid><pubid idtype="pmpid" link="fulltext">18235048</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>Diabetes risk score in Oman: a tool to identify prevalent type 2 diabetes among Arabs of the Middle East</p></title><aug><au><snm>Al-Lawati</snm><fnm>JA</fnm></au><au><snm>Tuomilehto</snm><fnm>J</fnm></au></aug><source>Diabetes Res Clin Pract</source><pubdate>2007</pubdate><volume>77</volume><fpage>438</fpage><lpage>444</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.diabres.2007.01.013</pubid><pubid idtype="pmpid" link="fulltext">17306410</pubid></pubidlist></xrefbib></bibl><bibl id="B42"><title><p>Performance of a predictive model to identify undiagnosed diabetes in a health care 
setting</p></title><aug><au><snm>Baan</snm><fnm>CA</fnm></au><au><snm>Ruige</snm><fnm>JB</fnm></au><au><snm>Stolk</snm><fnm>RP</fnm></au><au><snm>Witteman</snm><fnm>JCM</fnm></au><au><snm>Dekker</snm><fnm>JM</fnm></au><au><snm>Heine</snm><fnm>RJ</fnm></au><au><snm>Feskens</snm><fnm>EJM</fnm></au></aug><source>Diabetes Care</source><pubdate>1999</pubdate><volume>22</volume><fpage>213</fpage><lpage>219</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.22.2.213</pubid><pubid idtype="pmpid" link="fulltext">10333936</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>Development and validation of a patient self-assessment score for diabetes risk</p></title><aug><au><snm>Bang</snm><fnm>H</fnm></au><au><snm>Edwards</snm><fnm>AM</fnm></au><au><snm>Bomback</snm><fnm>AS</fnm></au><au><snm>Ballantyne</snm><fnm>CM</fnm></au><au><snm>Brillon</snm><fnm>D</fnm></au><au><snm>Callahan</snm><fnm>MA</fnm></au><au><snm>Teutsch</snm><fnm>SM</fnm></au><au><snm>Mushlin</snm><fnm>AI</fnm></au><au><snm>Kern</snm><fnm>LM</fnm></au></aug><source>Ann Intern Med</source><pubdate>2009</pubdate><volume>151</volume><fpage>775</fpage><lpage>783</lpage><xrefbib><pubid idtype="pmpid">19949143</pubid></xrefbib></bibl><bibl id="B44"><title><p>Development of a clinical risk score in predicting undiagnosed diabetes in urban Asian Indian adults: a population-based study</p></title><aug><au><snm>Chaturvedi</snm><fnm>V</fnm></au><au><snm>Reddy</snm><fnm>KS</fnm></au><au><snm>Prabhakaran</snm><fnm>D</fnm></au><au><snm>Jeemon</snm><fnm>P</fnm></au><au><snm>Ramakrishnan</snm><fnm>L</fnm></au><au><snm>Shah</snm><fnm>P</fnm></au><au><snm>Shah</snm><fnm>B</fnm></au></aug><source>CVD Prev Control</source><pubdate>2008</pubdate><volume>3</volume><fpage>141</fpage><lpage>151</lpage><xrefbib><pubid idtype="doi">10.1016/j.cvdpc.2008.07.002</pubid></xrefbib></bibl><bibl id="B45"><title><p>A simple Chinese risk score for undiagnosed 
diabetes</p></title><aug><au><snm>Gao</snm><fnm>WG</fnm></au><au><snm>Dong</snm><fnm>YH</fnm></au><au><snm>Pang</snm><fnm>ZC</fnm></au><au><snm>Nan</snm><fnm>HR</fnm></au><au><snm>Wang</snm><fnm>SJ</fnm></au><au><snm>Ren</snm><fnm>J</fnm></au><au><snm>Zhang</snm><fnm>L</fnm></au><au><snm>Tuomilehto</snm><fnm>J</fnm></au><au><snm>Qiao</snm><fnm>Q</fnm></au></aug><source>Diabet Med</source><pubdate>2010</pubdate><volume>27</volume><fpage>274</fpage><lpage>281</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1464-5491.2010.02943.x</pubid><pubid idtype="pmpid" link="fulltext">20536489</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>A Danish diabetes risk score for targeted screening</p></title><aug><au><snm>Gl&#252;mer</snm><fnm>C</fnm></au><au><snm>Carstensen</snm><fnm>B</fnm></au><au><snm>Sandbaek</snm><fnm>A</fnm></au><au><snm>Lauritzen</snm><fnm>T</fnm></au><au><snm>J&#248;rgensen</snm><fnm>T</fnm></au><au><snm>Borch-Johnsen</snm><fnm>K</fnm></au></aug><source>Diabetes Care</source><pubdate>2004</pubdate><volume>27</volume><fpage>727</fpage><lpage>733</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.27.3.727</pubid><pubid idtype="pmpid" link="fulltext">14988293</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>The development and validation of a diabetes risk score for high-risk Thai adults</p></title><aug><au><snm>Keesukphan</snm><fnm>P</fnm></au><au><snm>Chanprasertyothin</snm><fnm>S</fnm></au><au><snm>Ongphiphadhanakul</snm><fnm>B</fnm></au><au><snm>Puavilai</snm><fnm>G</fnm></au></aug><source>J Med Assoc Thai</source><pubdate>2007</pubdate><volume>90</volume><fpage>149</fpage><lpage>154</lpage><xrefbib><pubid idtype="pmpid">17621746</pubid></xrefbib></bibl><bibl id="B48"><title><p>A simple risk score to identify Southern Chinese at high risk for 
diabetes</p></title><aug><au><snm>Ko</snm><fnm>G</fnm></au><au><snm>So</snm><fnm>W</fnm></au><au><snm>Tong</snm><fnm>P</fnm></au><au><snm>Ma</snm><fnm>R</fnm></au><au><snm>Kong</snm><fnm>A</fnm></au><au><snm>Ozaki</snm><fnm>R</fnm></au><au><snm>Chow</snm><fnm>C</fnm></au><au><snm>Cockram</snm><fnm>C</fnm></au><au><snm>Chan</snm><fnm>J</fnm></au></aug><source>Diabet Med</source><pubdate>2010</pubdate><volume>27</volume><fpage>644</fpage><lpage>649</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1464-5491.2010.02993.x</pubid><pubid idtype="pmpid" link="fulltext">20546281</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>A simplified Indian Diabetes Risk Score for screening for undiagnosed diabetic subjects</p></title><aug><au><snm>Mohan</snm><fnm>V</fnm></au><au><snm>Deepa</snm><fnm>R</fnm></au><au><snm>Deepa</snm><fnm>M</fnm></au><au><snm>Somannavar</snm><fnm>S</fnm></au><au><snm>Datta</snm><fnm>M</fnm></au></aug><source>J Assoc Physicians India</source><pubdate>2005</pubdate><volume>53</volume><fpage>759</fpage><lpage>763</lpage><xrefbib><pubid idtype="pmpid">16334618</pubid></xrefbib></bibl><bibl id="B50"><title><p>Derivation and external validation of a simple prediction model for the diagnosis of type 2 diabetes mellitus in the Brazilian urban population</p></title><aug><au><snm>Pires de Sousa</snm><fnm>AG</fnm></au><au><snm>Pereira</snm><fnm>AC</fnm></au><au><snm>Marquezine</snm><fnm>GF</fnm></au><au><snm>Marques do Nascimento-Neto</snm><fnm>R</fnm></au><au><snm>Freitas</snm><fnm>SN</fnm></au><au><snm>Nicolato</snm><fnm>RLdC</fnm></au><au><snm>Machado-Coelho</snm><fnm>GL</fnm></au><au><snm>Rodrigues</snm><fnm>SL</fnm></au><au><snm>Mill</snm><fnm>JG</fnm></au><au><snm>Krieger</snm><fnm>JE</fnm></au></aug><source>Eur J Epidemiol</source><pubdate>2009</pubdate><volume>24</volume><fpage>101</fpage><lpage>109</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s10654-009-9314-2</pubid><pubid idtype="pmpid" 
link="fulltext">19190989</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Derivation and validation of diabetes risk score for urban Asian Indians</p></title><aug><au><snm>Ramachandran</snm><fnm>A</fnm></au><au><snm>Snehalatha</snm><fnm>C</fnm></au><au><snm>Vijay</snm><fnm>C</fnm></au><au><snm>Wareham</snm><fnm>NJ</fnm></au><au><snm>Colagiuri</snm><fnm>S</fnm></au></aug><source>Diabetes Res Clin Pract</source><pubdate>2005</pubdate><volume>70</volume><fpage>63</fpage><lpage>70</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.diabres.2005.02.016</pubid><pubid idtype="pmpid" link="fulltext">16126124</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>Performance of an NIDDM screening questionnaire based on symptoms and risk factors</p></title><aug><au><snm>Ruige</snm><fnm>JB</fnm></au><au><snm>de Neeling</snm><fnm>JND</fnm></au><au><snm>Kostense</snm><fnm>PJ</fnm></au><au><snm>Bouter</snm><fnm>LM</fnm></au><au><snm>Heine</snm><fnm>RJ</fnm></au></aug><source>Diabetes Care</source><pubdate>1997</pubdate><volume>20</volume><fpage>491</fpage><lpage>496</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.20.4.491</pubid><pubid idtype="pmpid">9096967</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>A multivariate logistic regression equation to screen for diabetes: development and validation</p></title><aug><au><snm>Tabaei</snm><fnm>BP</fnm></au><au><snm>Herman</snm><fnm>WH</fnm></au></aug><source>Diabetes Care</source><pubdate>2002</pubdate><volume>25</volume><fpage>1999</fpage><lpage>2003</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.25.11.1999</pubid><pubid idtype="pmpid" link="fulltext">12401746</pubid></pubidlist></xrefbib></bibl><bibl id="B54"><title><p>Prevalence of diabetes mellitus and the performance of a risk score among Hindustani Surinamese, African Surinamese and ethnic Dutch: a cross-sectional population-based study</p></title><aug><au><snm>Bindraban</snm><fnm>NR</fnm></au><au><snm>van 
Valkengoed</snm><fnm>IGM</fnm></au><au><snm>Mairuhu</snm><fnm>G</fnm></au><au><snm>Holleman</snm><fnm>F</fnm></au><au><snm>Hoekstra</snm><fnm>JBL</fnm></au><au><snm>Michels</snm><fnm>BPJ</fnm></au><au><snm>Koopmans</snm><fnm>RP</fnm></au><au><snm>Stronks</snm><fnm>K</fnm></au></aug><source>BMC Public Health</source><pubdate>2008</pubdate><volume>8</volume><fpage>271</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2458-8-271</pubid><pubid idtype="pmcid">2533321</pubid><pubid idtype="pmpid" link="fulltext">18673544</pubid></pubidlist></xrefbib></bibl><bibl id="B55"><title><p>Diabetes risk score: towards earlier detection of type 2 diabetes in general practice</p></title><aug><au><snm>Griffin</snm><fnm>SJ</fnm></au><au><snm>Little</snm><fnm>PS</fnm></au><au><snm>Hales</snm><fnm>CN</fnm></au><au><snm>Kinmonth</snm><fnm>AL</fnm></au><au><snm>Wareham</snm><fnm>NJ</fnm></au></aug><source>Diabetes Metab Res Rev</source><pubdate>2000</pubdate><volume>16</volume><fpage>164</fpage><lpage>171</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/1520-7560(200005/06)16:3&lt;164::AID-DMRR103&gt;3.0.CO;2-R</pubid><pubid idtype="pmpid" link="fulltext">10867715</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>Diabetes Risk Calculator: a simple tool for detecting undiagnosed diabetes and pre-diabetes</p></title><aug><au><snm>Heikes</snm><fnm>KE</fnm></au><au><snm>Eddy</snm><fnm>DM</fnm></au><au><snm>Arondekar</snm><fnm>B</fnm></au><au><snm>Schlessinger</snm><fnm>L</fnm></au></aug><source>Diabetes Care</source><pubdate>2008</pubdate><volume>31</volume><fpage>1040</fpage><lpage>1045</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/dc07-1150</pubid><pubid idtype="pmpid" link="fulltext">18070993</pubid></pubidlist></xrefbib></bibl><bibl id="B57"><title><p>Predicting the development of diabetes in older adults: the derivation and validation of a prediction rule</p></title><aug><au><snm>Kanaya</snm><fnm>AM</fnm></au><au><snm>Wassel 
Fyr</snm><fnm>CL</fnm></au><au><snm>de Rekeneire</snm><fnm>N</fnm></au><au><snm>Schwartz</snm><fnm>AV</fnm></au><au><snm>Goodpaster</snm><fnm>BH</fnm></au><au><snm>Newman</snm><fnm>AB</fnm></au><au><snm>Harris</snm><fnm>T</fnm></au><au><snm>Barrett-Connor</snm><fnm>E</fnm></au></aug><source>Diabetes Care</source><pubdate>2005</pubdate><volume>28</volume><fpage>404</fpage><lpage>408</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2337/diacare.28.2.404</pubid><pubid idtype="pmpid" link="fulltext">15677800</pubid></pubidlist></xrefbib></bibl><bibl id="B58"><title><p>The Leicester Risk Assessment score for detecting undiagnosed type 2 diabetes and impaired glucose regulation for use in a multiethnic UK setting</p></title><aug><au><snm>Gray</snm><fnm>LJ</fnm></au><au><snm>Taub</snm><fnm>NA</fnm></au><au><snm>Khunti</snm><fnm>K</fnm></au><au><snm>Gardiner</snm><fnm>E</fnm></au><au><snm>Hiles</snm><fnm>S</fnm></au><au><snm>Webb</snm><fnm>DR</fnm></au><au><snm>Srinivasan</snm><fnm>BT</fnm></au><au><snm>Davies</snm><fnm>MJ</fnm></au></aug><source>Diabet Med</source><pubdate>2010</pubdate><volume>27</volume><fpage>887</fpage><lpage>895</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1464-5491.2010.03037.x</pubid><pubid idtype="pmpid" link="fulltext">20653746</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Diabetes in the dental office: using NHANES III to estimate the probability of undiagnosed disease</p></title><aug><au><snm>Borrell</snm><fnm>LN</fnm></au><au><snm>Kunzel</snm><fnm>C</fnm></au><au><snm>Lamster</snm><fnm>I</fnm></au><au><snm>Lalla</snm><fnm>E</fnm></au></aug><source>J Periodontal Res</source><pubdate>2007</pubdate><volume>42</volume><fpage>559</fpage><lpage>565</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1600-0765.2007.00983.x</pubid><pubid idtype="pmpid" link="fulltext">17956470</pubid></pubidlist></xrefbib></bibl><bibl id="B60"><title><p>Screening for diabetes in Kuwait and evaluation of risk 
scores</p></title><aug><au><snm>Al Khalaf</snm><fnm>MM</fnm></au><au><snm>Eid</snm><fnm>MM</fnm></au><au><snm>Najjar</snm><fnm>HA</fnm></au><au><snm>Alhajry</snm><fnm>KM</fnm></au><au><snm>Doi</snm><fnm>SA</fnm></au><au><snm>Thalib</snm><fnm>L</fnm></au></aug><source>East Mediterr Health J</source><pubdate>2010</pubdate><volume>16</volume><fpage>725</fpage><lpage>731</lpage><xrefbib><pubid idtype="pmpid">20799528</pubid></xrefbib></bibl><bibl id="B61"><title><p>A Chinese diabetes risk score for screening of undiagnosed diabetes and abnormal glucose tolerance</p></title><aug><au><snm>Liu</snm><fnm>M</fnm></au><au><snm>Pan</snm><fnm>C</fnm></au><au><snm>Jin</snm><fnm>M</fnm></au></aug><source>Diabetes Technol Ther</source><pubdate>2011</pubdate><volume>13</volume><fpage>501</fpage><lpage>507</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/dia.2010.0106</pubid><pubid idtype="pmpid" link="fulltext">21406016</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>Presentation of multivariate data for clinical use: the Framingham study risk score functions</p></title><aug><au><snm>Sullivan</snm><fnm>LM</fnm></au><au><snm>Massaro</snm><fnm>JM</fnm></au><au><snm>D&apos;Agostino</snm><fnm>RB</fnm><suf>Sr</suf></au></aug><source>Stat Med</source><pubdate>2004</pubdate><volume>23</volume><fpage>1631</fpage><lpage>1660</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/sim.1742</pubid><pubid idtype="pmpid" link="fulltext">15122742</pubid></pubidlist></xrefbib></bibl><bibl id="B63"><title><p>A simulation study of the number of events per variable in logistic regression analysis</p></title><aug><au><snm>Peduzzi</snm><fnm>P</fnm></au><au><snm>Concato</snm><fnm>J</fnm></au><au><snm>Kemper</snm><fnm>E</fnm></au><au><snm>Holford</snm><fnm>TR</fnm></au><au><snm>Feinstein</snm><fnm>AR</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>1996</pubdate><volume>49</volume><fpage>1373</fpage><lpage>1379</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1016/S0895-4356(96)00236-3</pubid><pubid idtype="pmpid" link="fulltext">8970487</pubid></pubidlist></xrefbib></bibl><bibl id="B64"><aug><au><snm>Feinstein</snm><fnm>AR</fnm></au></aug><source>Multivariable Analysis: An Introduction</source><publisher>New Haven: Yale University Press</publisher><pubdate>1996</pubdate></bibl><bibl id="B65"><title><p>What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models</p></title><aug><au><snm>Babyak</snm><fnm>MA</fnm></au></aug><source>Psychosom Med</source><pubdate>2004</pubdate><volume>66</volume><fpage>411</fpage><lpage>421</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/01.psy.0000127692.23278.a9</pubid><pubid idtype="pmpid" link="fulltext">15184705</pubid></pubidlist></xrefbib></bibl><bibl id="B66"><title><p>Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy</p></title><aug><au><snm>Concato</snm><fnm>J</fnm></au><au><snm>Peduzzi</snm><fnm>P</fnm></au><au><snm>Holford</snm><fnm>TR</fnm></au><au><snm>Feinstein</snm><fnm>AR</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>1995</pubdate><volume>48</volume><fpage>1495</fpage><lpage>1501</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0895-4356(95)00510-2</pubid><pubid idtype="pmpid" link="fulltext">8543963</pubid></pubidlist></xrefbib></bibl><bibl id="B67"><title><p>Dichotomizing continuous predictors in multiple regression: a bad idea</p></title><aug><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Sauerbrei</snm><fnm>W</fnm></au></aug><source>Stat Med</source><pubdate>2006</pubdate><volume>25</volume><fpage>127</fpage><lpage>141</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/sim.2331</pubid><pubid idtype="pmpid" link="fulltext">16217841</pubid></pubidlist></xrefbib></bibl><bibl id="B68"><title><p>Some prognostic models for traumatic brain injury were not 
valid</p></title><aug><au><snm>Hukkelhoven</snm><fnm>CWPM</fnm></au><au><snm>Rampen</snm><fnm>AJJ</fnm></au><au><snm>Maas</snm><fnm>AIR</fnm></au><au><snm>Farace</snm><fnm>E</fnm></au><au><snm>Habbema</snm><fnm>JDF</fnm></au><au><snm>Marmarou</snm><fnm>A</fnm></au><au><snm>Marshall</snm><fnm>LF</fnm></au><au><snm>Murray</snm><fnm>GD</fnm></au><au><snm>Steyerberg</snm><fnm>EW</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2006</pubdate><volume>59</volume><fpage>132</fpage><lpage>143</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2005.06.009</pubid><pubid idtype="pmpid" link="fulltext">16426948</pubid></pubidlist></xrefbib></bibl><bibl id="B69"><title><p>Prediction models in reproductive medicine</p></title><aug><au><snm>Leushuis</snm><fnm>E</fnm></au><au><snm>van der Steeg</snm><fnm>JW</fnm></au><au><snm>Steures</snm><fnm>P</fnm></au><au><snm>Bossuyt</snm><fnm>PMM</fnm></au><au><snm>Eijkemans</snm><fnm>MJC</fnm></au><au><snm>van der Veen</snm><fnm>F</fnm></au><au><snm>Mol</snm><fnm>BWJ</fnm></au><au><snm>Hompes</snm><fnm>PGA</fnm></au></aug><source>Hum Reprod Update</source><pubdate>2009</pubdate><volume>15</volume><fpage>537</fpage><lpage>552</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/humupd/dmp013</pubid><pubid idtype="pmpid" link="fulltext">19435779</pubid></pubidlist></xrefbib></bibl><bibl id="B70"><title><p>Reporting methods in studies developing prognostic models in cancer: a review</p></title><aug><au><snm>Mallett</snm><fnm>S</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Dutton</snm><fnm>S</fnm></au><au><snm>Waters</snm><fnm>R</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>BMC Med</source><pubdate>2010</pubdate><volume>8</volume><fpage>20</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1741-7015-8-20</pubid><pubid idtype="pmcid">2856521</pubid><pubid idtype="pmpid" link="fulltext">20353578</pubid></pubidlist></xrefbib></bibl><bibl id="B71"><title><p>Reporting performance of 
prognostic models in cancer: a review</p></title><aug><au><snm>Mallett</snm><fnm>S</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Waters</snm><fnm>R</fnm></au><au><snm>Dutton</snm><fnm>S</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>BMC Med</source><pubdate>2010</pubdate><volume>8</volume><fpage>21</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1741-7015-8-21</pubid><pubid idtype="pmcid">2857810</pubid><pubid idtype="pmpid" link="fulltext">20353579</pubid></pubidlist></xrefbib></bibl><bibl id="B72"><title><p>A systematic review finds methodological improvements necessary for prognostic models in determining traumatic brain injury outcomes</p></title><aug><au><snm>Mushkudiani</snm><fnm>NA</fnm></au><au><snm>Hukkelhoven</snm><fnm>CWPM</fnm></au><au><snm>Hernandez</snm><fnm>AV</fnm></au><au><snm>Murray</snm><fnm>GD</fnm></au><au><snm>Choi</snm><fnm>SC</fnm></au><au><snm>Maas</snm><fnm>AIR</fnm></au><au><snm>Steyerberg</snm><fnm>EW</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2008</pubdate><volume>61</volume><fpage>331</fpage><lpage>343</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2007.06.011</pubid><pubid idtype="pmpid" link="fulltext">18313557</pubid></pubidlist></xrefbib></bibl><bibl id="B73"><title><p>Systematic review of prognostic models in traumatic brain injury</p></title><aug><au><snm>Perel</snm><fnm>P</fnm></au><au><snm>Edwards</snm><fnm>P</fnm></au><au><snm>Wentz</snm><fnm>R</fnm></au><au><snm>Roberts</snm><fnm>I</fnm></au></aug><source>BMC Med Inform Decis Mak</source><pubdate>2006</pubdate><volume>6</volume><fpage>38</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1472-6947-6-38</pubid><pubid idtype="pmcid">1657003</pubid><pubid idtype="pmpid" link="fulltext">17105661</pubid></pubidlist></xrefbib></bibl><bibl id="B74"><title><p>Clinical prediction rules: applications and methodological 
standards</p></title><aug><au><snm>Wasson</snm><fnm>JH</fnm></au><au><snm>Sox</snm><fnm>HC</fnm></au><au><snm>Neff</snm><fnm>RK</fnm></au><au><snm>Goldman</snm><fnm>L</fnm></au></aug><source>N Engl J Med</source><pubdate>1985</pubdate><volume>313</volume><fpage>793</fpage><lpage>799</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1056/NEJM198509263131306</pubid><pubid idtype="pmpid" link="fulltext">3897864</pubid></pubidlist></xrefbib></bibl><bibl id="B75"><title><p>Effects of mismodelling and mismeasuring explanatory variables on tests of their association with a response variable</p></title><aug><au><snm>Lagakos</snm><fnm>SW</fnm></au></aug><source>Stat Med</source><pubdate>1988</pubdate><volume>7</volume><fpage>257</fpage><lpage>274</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/sim.4780070126</pubid><pubid idtype="pmpid">3353607</pubid></pubidlist></xrefbib></bibl><bibl id="B76"><aug><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Sauerbrei</snm><fnm>W</fnm></au></aug><source>Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables</source><publisher>Chichester: John Wiley &amp; Sons</publisher><pubdate>2008</pubdate></bibl><bibl id="B77"><title><p>Regression with missing X's: a review</p></title><aug><au><snm>Little</snm><fnm>RA</fnm></au></aug><source>J Am Stat Assoc</source><pubdate>1992</pubdate><volume>87</volume><fpage>1227</fpage><lpage>1237</lpage><xrefbib><pubid idtype="doi">10.2307/2290664</pubid></xrefbib></bibl><bibl id="B78"><title><p>Missing covariate data within cancer prognostic studies: a review of current reporting and proposed guidelines</p></title><aug><au><snm>Burton</snm><fnm>A</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>Br J Cancer</source><pubdate>2004</pubdate><volume>91</volume><fpage>4</fpage><lpage>8</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/sj.bjc.6601907</pubid><pubid idtype="pmcid">2364743</pubid><pubid 
idtype="pmpid" link="fulltext">15188004</pubid></pubidlist></xrefbib></bibl><bibl id="B79"><title><p>Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study</p></title><aug><au><snm>Marshall</snm><fnm>A</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Holder</snm><fnm>RL</fnm></au></aug><source>BMC Med Res Meth</source><pubdate>2010</pubdate><volume>10</volume><fpage>7</fpage><xrefbib><pubid idtype="doi">10.1186/1471-2288-10-7</pubid></xrefbib></bibl><bibl id="B80"><title><p>Development and validation of a prediction model with missing predictor data: a practical approach</p></title><aug><au><snm>Vergouwe</snm><fnm>Y</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Moons</snm><fnm>KGM</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2010</pubdate><volume>63</volume><fpage>205</fpage><lpage>214</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2009.03.017</pubid><pubid idtype="pmpid" link="fulltext">19596181</pubid></pubidlist></xrefbib></bibl><bibl id="B81"><title><p>Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis</p></title><aug><au><snm>Sun</snm><fnm>GW</fnm></au><au><snm>Shook</snm><fnm>TL</fnm></au><au><snm>Kay</snm><fnm>GL</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>1996</pubdate><volume>49</volume><fpage>907</fpage><lpage>916</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0895-4356(96)00025-X</pubid><pubid idtype="pmpid" link="fulltext">8699212</pubid></pubidlist></xrefbib></bibl><bibl id="B82"><title><p>Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality</p></title><aug><au><snm>Austin</snm><fnm>PC</fnm></au><au><snm>Tu</snm><fnm>JV</fnm></au></aug><source>J Clin 
Epidemiol</source><pubdate>2004</pubdate><volume>57</volume><fpage>1138</fpage><lpage>1146</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2004.04.003</pubid><pubid idtype="pmpid" link="fulltext">15567629</pubid></pubidlist></xrefbib></bibl><bibl id="B83"><title><p>Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis</p></title><aug><au><snm>Steyerberg</snm><fnm>EW</fnm></au><au><snm>Eijkemans</snm><fnm>MJ</fnm></au><au><snm>Habbema</snm><fnm>JD</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>1999</pubdate><volume>52</volume><fpage>935</fpage><lpage>942</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0895-4356(99)00103-1</pubid><pubid idtype="pmpid" link="fulltext">10513756</pubid></pubidlist></xrefbib></bibl><bibl id="B84"><title><p>Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets</p></title><aug><au><snm>Steyerberg</snm><fnm>EW</fnm></au><au><snm>Eijkemans</snm><fnm>MJC</fnm></au><au><snm>Harrell</snm><fnm>FE</fnm><suf>Jr</suf></au><au><snm>Habbema</snm><fnm>JDF</fnm></au></aug><source>Stat Med</source><pubdate>2000</pubdate><volume>19</volume><fpage>1059</fpage><lpage>1079</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/(SICI)1097-0258(20000430)19:8&lt;1059::AID-SIM412&gt;3.0.CO;2-0</pubid><pubid idtype="pmpid" link="fulltext">10790680</pubid></pubidlist></xrefbib></bibl><bibl id="B85"><title><p>What do we mean by validating a prognostic model?</p></title><aug><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au></aug><source>Stat Med</source><pubdate>2000</pubdate><volume>19</volume><fpage>453</fpage><lpage>473</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/(SICI)1097-0258(20000229)19:4&lt;453::AID-SIM350&gt;3.0.CO;2-5</pubid><pubid idtype="pmpid" link="fulltext">10694730</pubid></pubidlist></xrefbib></bibl><bibl id="B86"><title><p>Prognosis and prognostic research: 
validating a prognostic model</p></title><aug><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Vergouwe</snm><fnm>Y</fnm></au><au><snm>Royston</snm><fnm>P</fnm></au><au><snm>Moons</snm><fnm>KGM</fnm></au></aug><source>BMJ</source><pubdate>2009</pubdate><volume>338</volume><fpage>b605</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.b605</pubid><pubid idtype="pmpid" link="fulltext">19477892</pubid></pubidlist></xrefbib></bibl><bibl id="B87"><title><p>Clinical prediction rules: a review and suggested modifications of methodological standards</p></title><aug><au><snm>Laupacis</snm><fnm>A</fnm></au><au><snm>Sekar</snm><fnm>N</fnm></au><au><snm>Stiell</snm><fnm>IG</fnm></au></aug><source>JAMA</source><pubdate>1997</pubdate><volume>277</volume><fpage>488</fpage><lpage>494</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1001/jama.277.6.488</pubid><pubid idtype="pmpid">9020274</pubid></pubidlist></xrefbib></bibl><bibl id="B88"><title><p>Deriving clinical prediction rules from stroke outcome research</p></title><aug><au><snm>Hier</snm><fnm>DB</fnm></au><au><snm>Edelstein</snm><fnm>G</fnm></au></aug><source>Stroke</source><pubdate>1991</pubdate><volume>22</volume><fpage>1431</fpage><lpage>1436</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1161/01.STR.22.11.1431</pubid><pubid idtype="pmpid" link="fulltext">1750053</pubid></pubidlist></xrefbib></bibl><bibl id="B89"><title><p>Root caries risk indicators: a systematic review of risk models</p></title><aug><au><snm>Ritter</snm><fnm>AV</fnm></au><au><snm>Shugars</snm><fnm>DA</fnm></au><au><snm>Bader</snm><fnm>JD</fnm></au></aug><source>Community Dent Oral Epidemiol</source><pubdate>2010</pubdate><volume>38</volume><fpage>383</fpage><lpage>397</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1600-0528.2010.00551.x</pubid><pubid idtype="pmpid" link="fulltext">20545716</pubid></pubidlist></xrefbib></bibl><bibl id="B90"><title><p>CONSORT 2010 statement: updated guidelines for reporting parallel group randomized 
trials</p></title><aug><au><snm>Schulz</snm><fnm>KF</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Moher</snm><fnm>D</fnm></au><au><cnm>CONSORT Group</cnm></au></aug><source>Ann Intern Med</source><pubdate>2010</pubdate><volume>152</volume><fpage>726</fpage><lpage>732</lpage><xrefbib><pubid idtype="pmpid">20335313</pubid></xrefbib></bibl><bibl id="B91"><title><p>STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement</p></title><aug><au><snm>Little</snm><fnm>J</fnm></au><au><snm>Higgins</snm><fnm>JP</fnm></au><au><snm>Ioannidis</snm><fnm>JP</fnm></au><au><snm>Moher</snm><fnm>D</fnm></au><au><snm>Gagnon</snm><fnm>F</fnm></au><au><snm>von Elm</snm><fnm>E</fnm></au><au><snm>Khoury</snm><fnm>MJ</fnm></au><au><snm>Cohen</snm><fnm>B</fnm></au><au><snm>Davey-Smith</snm><fnm>G</fnm></au><au><snm>Grimshaw</snm><fnm>J</fnm></au><au><snm>Scheet</snm><fnm>P</fnm></au><au><snm>Gwinn</snm><fnm>M</fnm></au><au><snm>Williamson</snm><fnm>RE</fnm></au><au><snm>Zou</snm><fnm>GY</fnm></au><au><snm>Hutchings</snm><fnm>K</fnm></au><au><snm>Johnson</snm><fnm>CY</fnm></au><au><snm>Tait</snm><fnm>V</fnm></au><au><snm>Wiens</snm><fnm>M</fnm></au><au><snm>Golding</snm><fnm>J</fnm></au><au><snm>van Duijn</snm><fnm>C</fnm></au><au><snm>McLaughlin</snm><fnm>J</fnm></au><au><snm>Paterson</snm><fnm>A</fnm></au><au><snm>Wells</snm><fnm>G</fnm></au><au><snm>Fortier</snm><fnm>I</fnm></au><au><snm>Freedman</snm><fnm>M</fnm></au><au><snm>Zecevic</snm><fnm>M</fnm></au><au><snm>King</snm><fnm>R</fnm></au><au><snm>Infante-Rivard</snm><fnm>C</fnm></au><au><snm>Stewart</snm><fnm>A</fnm></au><au><snm>Birkett</snm><fnm>N</fnm></au><au><cnm>STrengthening the REporting of Genetic Association Studies</cnm></au></aug><source>PLoS Med</source><pubdate>2009</pubdate><volume>6</volume><fpage>e22</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pmed.1000022</pubid><pubid idtype="pmcid">2634792</pubid><pubid idtype="pmpid" 
link="fulltext">19192942</pubid></pubidlist></xrefbib></bibl><bibl id="B92"><title><p>REporting recommendations for tumor MARKer prognostic studies (REMARK)</p></title><aug><au><snm>McShane</snm><fnm>LM</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Sauerbrei</snm><fnm>W</fnm></au><au><snm>Taube</snm><fnm>SE</fnm></au><au><snm>Gion</snm><fnm>M</fnm></au><au><snm>Clark</snm><fnm>GM</fnm></au><au><cnm>Statistics Subcommittee of NCI-EORTC Working Group on Cancer Diagnostics</cnm></au></aug><source>Breast Cancer Res Treat</source><pubdate>2006</pubdate><volume>100</volume><fpage>229</fpage><lpage>235</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s10549-006-9242-8</pubid><pubid idtype="pmpid" link="fulltext">16932852</pubid></pubidlist></xrefbib></bibl><bibl id="B93"><title><p>The quality of reports of randomised trials in 2000 and 2006: comparative study of articles indexed in PubMed</p></title><aug><au><snm>Hopewell</snm><fnm>S</fnm></au><au><snm>Dutton</snm><fnm>S</fnm></au><au><snm>Yu</snm><fnm>LM</fnm></au><au><snm>Chan</snm><fnm>AW</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au></aug><source>BMJ</source><pubdate>2010</pubdate><volume>340</volume><fpage>c723</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/bmj.c723</pubid><pubid idtype="pmcid">2844941</pubid><pubid idtype="pmpid" link="fulltext">20332510</pubid></pubidlist></xrefbib></bibl><bibl id="B94"><title><p>Does the CONSORT checklist improve the quality of reports of randomised controlled trials? 
A systematic review</p></title><aug><au><snm>Plint</snm><fnm>AC</fnm></au><au><snm>Moher</snm><fnm>D</fnm></au><au><snm>Morrison</snm><fnm>A</fnm></au><au><snm>Schulz</snm><fnm>K</fnm></au><au><snm>Altman</snm><fnm>DG</fnm></au><au><snm>Hill</snm><fnm>C</fnm></au><au><snm>Gaboury</snm><fnm>I</fnm></au></aug><source>Med J Aust</source><pubdate>2006</pubdate><volume>185</volume><fpage>263</fpage><lpage>267</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">16948622</pubid></xrefbib></bibl><bibl id="B95"><title><p>Recommended guidelines for the conduct and evaluation of prognostic studies in veterinary oncology</p></title><aug><au><snm>Webster</snm><fnm>JD</fnm></au><au><snm>Dennis</snm><fnm>MM</fnm></au><au><snm>Dervisis</snm><fnm>N</fnm></au><au><snm>Heller</snm><fnm>J</fnm></au><au><snm>Bacon</snm><fnm>NJ</fnm></au><au><snm>Bergman</snm><fnm>PJ</fnm></au><au><snm>Bienzle</snm><fnm>D</fnm></au><au><snm>Cassali</snm><fnm>G</fnm></au><au><snm>Castagnaro</snm><fnm>M</fnm></au><au><snm>Cullen</snm><fnm>J</fnm></au><au><snm>Esplin</snm><fnm>DG</fnm></au><au><snm>Pe&#241;a</snm><fnm>L</fnm></au><au><snm>Goldschmidt</snm><fnm>MH</fnm></au><au><snm>Hahn</snm><fnm>KA</fnm></au><au><snm>Henry</snm><fnm>CJ</fnm></au><au><snm>Hellm&#233;n</snm><fnm>E</fnm></au><au><snm>Kamstock</snm><fnm>D</fnm></au><au><snm>Kirpensteijn</snm><fnm>J</fnm></au><au><snm>Kitchell</snm><fnm>BE</fnm></au><au><snm>Amorim</snm><fnm>RL</fnm></au><au><snm>Lenz</snm><fnm>SD</fnm></au><au><snm>Lipscomb</snm><fnm>TP</fnm></au><au><snm>McEntee</snm><fnm>M</fnm></au><au><snm>McGill</snm><fnm>LD</fnm></au><au><snm>McKnight</snm><fnm>CA</fnm></au><au><snm>McManus</snm><fnm>PM</fnm></au><au><snm>Moore</snm><fnm>AS</fnm></au><au><snm>Moore</snm><fnm>PF</fnm></au><au><snm>Moroff</snm><fnm>SD</fnm></au><au><snm>Nakayama</snm><fnm>H</fnm></au><au><cnm>American College of Veterinary Pathologists' Oncology Committee</cnm></au><etal/></aug><source>Vet 
Pathol</source><pubdate>2011</pubdate><volume>48</volume><fpage>7</fpage><lpage>18</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1177/0300985810377187</pubid><pubid idtype="pmpid" link="fulltext">20664014</pubid></pubidlist></xrefbib></bibl></refgrp>
<sec>
<st>
<p>Pre-publication history</p>
</st>
<p>The pre-publication history for this paper can be accessed here:</p>
<p>
<url>http://www.biomedcentral.com/1741-7015/9/103/prepub</url>
</p>
</sec>
</bm></art>