Identifying incident colorectal and lung cancer cases in health service utilisation databases in Australia: a validation study.
Goldsbury D, Weber M, Yap S, Banks E, O'Connell DL, Canfell K.
Data from centralised, population-based statutory cancer registries are generally considered the 'gold standard' for confirming incident cases of cancer. When these are not available, or more current information is needed, hospital or other routinely collected population-level data may be feasible alternative sources. We aimed to determine the validity of various methods using routinely collected administrative health data for ascertaining incident cases of colorectal or lung cancer in participants from the 45 and Up Study in New South Wales (NSW), Australia.
For 266,844 participants in the 45 and Up Study (recruited 2006-2009), ascertainment of incident colorectal or lung cancers was assessed using diagnosis and treatment records in linked administrative health datasets (hospital, emergency department, Medicare and pharmaceutical claims, death records). This was compared with ascertainment via the NSW Cancer Registry (NSWCR, the 'gold standard') over the period for which both data sources were available for participants.
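The comparison against the registry can be sketched as a cross-tabulation over participant identifiers. The following is a minimal illustration only, not the authors' code: the function and class names, the toy identifiers and the idea of combining two indicators with a set union are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Validation:
    """2x2 counts for an indicator compared with the registry (reference standard)."""
    tp: int  # flagged by indicator and in registry
    fp: int  # flagged by indicator, not in registry
    fn: int  # in registry, not flagged by indicator
    tn: int  # neither flagged nor in registry

    @staticmethod
    def _ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    @property
    def sensitivity(self) -> float:
        return self._ratio(self.tp, self.tp + self.fn)

    @property
    def ppv(self) -> float:
        return self._ratio(self.tp, self.tp + self.fp)

    @property
    def specificity(self) -> float:
        return self._ratio(self.tn, self.tn + self.fp)


def validate(cohort, registry_cases, flagged) -> Validation:
    """Cross-tabulate indicator-flagged participants against registry cases."""
    cohort = set(cohort)
    registry = set(registry_cases) & cohort  # ignore IDs outside the cohort
    flagged = set(flagged) & cohort
    tp = len(registry & flagged)
    fp = len(flagged - registry)
    fn = len(registry - flagged)
    tn = len(cohort) - tp - fp - fn
    return Validation(tp, fp, fn, tn)


# Hypothetical toy data: ten participants, four registry-confirmed cases.
cohort = range(1, 11)
registry = {1, 2, 3, 4}
hospital_dx = {1, 2, 3, 5}   # hypothetical hospital diagnosis indicator
death_record = {4, 6}        # hypothetical death-record indicator

hospital_only = validate(cohort, registry, hospital_dx)
combined = validate(cohort, registry, hospital_dx | death_record)
print(hospital_only, hospital_only.sensitivity, hospital_only.ppv)
print(combined, combined.sensitivity, combined.ppv)
```

In this toy data, adding the second indicator raises sensitivity but lowers PPV, which is the kind of trade-off such a validation is designed to expose.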
A total of 2253 colorectal and 1019 lung cancers were recorded for study participants in the NSWCR over the period 2006-2010. A diagnosis of primary cancer recorded in the statewide Admitted Patient Data Collection identified the majority of NSWCR colorectal and lung cancers, with sensitivities and positive predictive values (PPV) of 95% and 91% for colorectal cancer and 81% and 85% for lung cancer, respectively. Using additional information on lung cancer deaths from death records increased sensitivity to 84% (PPV 83%) for lung cancer, but did not improve ascertainment of colorectal cancers. Hospital procedure codes for colorectal cancer surgery identified cases with sensitivity 81% and PPV 54%. No other individual indicator had sensitivity >50% or PPV >65% for either cancer type and no combination of indicators increased both the sensitivity and PPV above that achieved using the hospital cancer diagnosis data. All specificities were close to 100%; 95% confidence intervals for sensitivity and PPV were generally ±2%.
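As a rough plausibility check on the reported interval widths, the half-width of a 95% confidence interval for sensitivity can be approximated from the registry case counts and sensitivities given above. The sketch below assumes a normal-approximation (Wald) interval; the abstract does not state which interval method the authors used, and the function name is hypothetical. PPV is omitted because its denominator (the number of participants flagged by the indicator) is not reported here.

```python
import math


def wald_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation (Wald) CI for a proportion.

    Assumption: the Wald interval is used purely for illustration.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return z * math.sqrt(p * (1.0 - p) / n)


# Sensitivity of the hospital diagnosis indicator and the number of
# registry cases (the denominator for sensitivity), as reported above.
reported = {
    "colorectal": (0.95, 2253),
    "lung": (0.81, 1019),
}

half_widths = {
    cancer: wald_half_width(sens, n_cases)
    for cancer, (sens, n_cases) in reported.items()
}

for cancer, (sens, _) in reported.items():
    print(f"{cancer}: sensitivity {sens:.0%} +/- {half_widths[cancer]:.1%}")
```

Under this assumption the half-widths come out at roughly 0.9 percentage points for colorectal cancer and 2.4 for lung cancer, in line with the reported intervals of about ±2%.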
In NSW, identifying new cases of colorectal and lung cancer from administrative health datasets, such as hospital records, is a feasible alternative when cancer registry data are not available. However, the strengths and limitations of the different data sources should be borne in mind.