
4. METHODS USED IN THE STUDY

4.2 Statistical methods used in analysis

To assess the test accuracies of the cervical cancer screening tests (Papers I and II)

The accuracy of VIA, VILI, VIAM, cytology and HPV testing was assessed by estimating the following parameters: sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). The screening test results (categorized as positive or negative) were first cross-tabulated against the true disease status (categorized as diseased or not diseased) (Table 4.2).

Table 4.2 Cross-tabulation of screening test and reference standard results

                 True disease status
Test result      Diseased      Not diseased
Test+            a             b
Test-            c             d

a = true positives; b = false positives; c = false negatives; d = true negatives

Estimates of the parameters were then obtained using the formulas below.

$$\text{Sensitivity} = \frac{a}{a + c}, \qquad \text{Specificity} = \frac{d}{b + d}$$

$$\text{PPV} = \frac{a}{a + b}, \qquad \text{NPV} = \frac{d}{c + d}$$
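As a minimal sketch, the four accuracy measures can be computed directly from the cells of Table 4.2. The function name and the counts are hypothetical and serve only as an illustration; they are not study data.

```python
def screening_accuracy(a, b, c, d):
    """Accuracy measures from the 2x2 table of Table 4.2.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Hypothetical counts, for illustration only.
accuracy = screening_accuracy(a=80, b=150, c=20, d=750)
for name, value in accuracy.items():
    print(f"{name}: {value:.3f}")
```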

Using meta-analytical methods, the sensitivity and specificity of the five tests, and the ratios of the sensitivity and specificity of one test to those of each of the other tests, were assessed for each category of CIN using random-effects models, allowing for inter-setting heterogeneity. [Sharp et al., 1997; Sutton et al., 1998]
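The text does not specify which random-effects estimator was used. The sketch below therefore assumes one common choice, DerSimonian-Laird pooling of proportions on the logit scale, purely to illustrate how inter-setting heterogeneity enters the pooled estimate. The function name and the inputs are hypothetical.

```python
import math


def pool_logit_random_effects(events, totals):
    """DerSimonian-Laird pooling of proportions on the logit scale.

    events[i] / totals[i] is, for example, the sensitivity observed in
    setting i. Assumes no zero cells (0 < events[i] < totals[i]).
    Returns the pooled proportion and the between-setting variance tau2.
    """
    y = [math.log(x / (n - x)) for x, n in zip(events, totals)]
    v = [1 / x + 1 / (n - x) for x, n in zip(events, totals)]
    w = [1 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q measures the dispersion of settings around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-setting variance.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled_logit = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return 1 / (1 + math.exp(-pooled_logit)), tau2


# Hypothetical true-positive counts and numbers of diseased women per setting.
pooled, tau2 = pool_logit_random_effects([50, 95], [100, 100])
print(f"pooled sensitivity = {pooled:.3f}, tau^2 = {tau2:.3f}")
```

When all settings show the same proportion, tau2 is zero and the result coincides with the fixed-effect estimate.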

To explore the sources of heterogeneity (Paper I)

Sources of heterogeneity were explored by assessing the association between test accuracies and individual and study characteristics. The influence of age, study centre, and time period on study outcomes was assessed using logistic regression and summary receiver operating characteristic (SROC) regression. [Moses et al., 1993]

Age was aggregated into 5-year groups (restricted to women between 30 and 64), and study period by tertiles, using date of screening or chronological rank ID. Study period was considered as a proxy for accumulated experience of the test providers.

Logistic regression was used to assess the influence of study characteristics on each dichotomous diagnostic parameter (sensitivity and specificity) separately. SROC regression was used to evaluate the impact of these covariates simultaneously on sensitivity, specificity and the diagnostic odds ratio (DOR, an overall accuracy measure that integrates sensitivity and specificity). The DOR, given by

$$\text{DOR} = \frac{\text{odds}(\text{sensitivity})}{\text{odds}(1 - \text{specificity})} = \frac{\text{sensitivity}/(1 - \text{sensitivity})}{(1 - \text{specificity})/\text{specificity}},$$

is the ratio of the odds of a positive test result among women with, for instance, CIN2+ to the odds of a positive test result among women without CIN2+.

The coefficients of the linear SROC regression equation,

$$D = \beta_0 + \beta_1 S,$$

describe the relation between D and S, which are, respectively, the difference and the sum of the logits of the true and false positivity rates, where

$$D = \ln\left(\frac{\text{Sensitivity}}{1 - \text{Sensitivity}}\right) - \ln\left(\frac{1 - \text{Specificity}}{\text{Specificity}}\right) = \ln(\text{DOR})$$

and

$$S = \ln\left(\frac{\text{Sensitivity}}{1 - \text{Sensitivity}}\right) + \ln\left(\frac{1 - \text{Specificity}}{\text{Specificity}}\right).$$

When the coefficient of the S term (β1) in the SROC regression is significantly different from zero, it indicates that accuracy changes with the degree of positivity of the screening test.
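The linear SROC fit of D on S can be sketched as follows. This is an unweighted least-squares version without covariates, which is an assumption made here for brevity; the function names and the (sensitivity, 1 - specificity) pairs are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def d_and_s(sensitivity, false_positive_rate):
    """D (= ln DOR) and S from the true and false positivity rates."""
    d = logit(sensitivity) - logit(false_positive_rate)
    s = logit(sensitivity) + logit(false_positive_rate)
    return d, s


def sroc_fit(pairs):
    """Unweighted least-squares fit of D = b0 + b1 * S.

    pairs: list of (sensitivity, 1 - specificity), one per stratum.
    """
    ds = [d_and_s(tpr, fpr) for tpr, fpr in pairs]
    n = len(ds)
    mean_d = sum(d for d, _ in ds) / n
    mean_s = sum(s for _, s in ds) / n
    sxx = sum((s - mean_s) ** 2 for _, s in ds)
    sxy = sum((s - mean_s) * (d - mean_d) for d, s in ds)
    b1 = sxy / sxx
    b0 = mean_d - b1 * mean_s
    return b0, b1


# Hypothetical strata, for illustration only.
b0, b1 = sroc_fit([(0.80, 0.20), (0.90, 0.30), (0.70, 0.10)])
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

A slope b1 close to zero corresponds to a DOR that does not depend on the positivity level, in which case exp(b0) summarizes the DOR.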

The three covariates were added to the linear model as indicated in the formula below, allowing for an explanation of the variation of sensitivity and specificity by study characteristics.

$$D = \beta_0 + \beta_1 S + \beta_2\,\text{Age} + \beta_3\,\text{Period} + \beta_4\,\text{Site}$$

To assess gain in test performance (Paper II)

The combined test was defined as testing with a single conventional VIA test [or VILI test], plus VILI [or VIA] used as an additional test. The aim was to assess the value of VILI [VIA] as an additional test, beyond the value of VIA alone [VILI alone]. The combined test was termed positive if either VIA or VILI had a positive result.

In addition to the estimation of sensitivity, specificity, PPV and NPV, the accuracy of the combined test compared to VIA alone or VILI alone was evaluated using the positive likelihood ratio (LR+) and negative likelihood ratio (LR-) and their 95% CIs. [Macaskill et al., 2002] The formulae for these additional parameters are given below.

$$\text{LR+} = \frac{\text{Sensitivity}}{1 - \text{Specificity}}, \qquad \text{LR-} = \frac{1 - \text{Sensitivity}}{\text{Specificity}}$$

The odds of disease following a positive test are obtained by multiplying the prior odds of disease (λ) by LR+. Thus, the PPV can be obtained by

$$\text{PPV} = \frac{\lambda\,\text{LR+}}{1 + \lambda\,\text{LR+}}$$

Similarly, the NPV can be expressed as

$$\text{NPV} = \frac{1}{1 + \lambda\,\text{LR-}}$$

When LR+ = 1, the PPV equals the pre-test probability of disease and a positive test result has no diagnostic value. Likewise, when LR- = 1, the NPV equals the pre-test probability of non-disease. The PPV increases as LR+ increases above 1, whereas the NPV increases as LR- decreases below 1.
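The relation between the likelihood ratios, the prior odds of disease and the predictive values can be checked numerically. The sketch below uses hypothetical values for sensitivity, specificity and prevalence; the function names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-)."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def predictive_values(prevalence, lr_pos, lr_neg):
    """PPV and NPV from the prior odds of disease and the likelihood ratios."""
    prior_odds = prevalence / (1 - prevalence)  # lambda in the text
    ppv = prior_odds * lr_pos / (1 + prior_odds * lr_pos)
    npv = 1 / (1 + prior_odds * lr_neg)
    return ppv, npv


# Hypothetical test characteristics and prevalence, for illustration only.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.8, specificity=0.9)
ppv, npv = predictive_values(prevalence=0.1, lr_pos=lr_pos, lr_neg=lr_neg)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}, PPV = {ppv:.3f}, NPV = {npv:.3f}")

# With both likelihood ratios equal to 1 the test carries no information.
print(predictive_values(prevalence=0.1, lr_pos=1.0, lr_neg=1.0))
```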

If LR+ of the combined test (LR+comb) is greater than LR+ of the single test (LR+sing) and the 95% CI of (LR+comb) / (LR+sing) does not include 1, the combined test would be preferred. In this case, the use of the combined test significantly improves LR+, which in turn means a significant increase in the PPV. Alternatively, if LR- of the combined test (LR-comb) is greater than LR- of the single test (LR-sing) and the 95% CI of (LR-comb) / (LR-sing) does not include 1, the single test would be preferred. In such a case, the NPV is significantly increased because LR- is significantly improved (decreased) when the single test is used.

If one or both of LR+ and LR- do not improve significantly, there is no clear choice between the single test and the combined test. In this situation, the decision to use the combined test [or not] will be influenced by the trade-off in the expected number of additional false positive (FP) results one is prepared to accept for each additional true positive (TP) detected, which in turn depends on the prevalence of disease in the study population. The formulae used for the calculation of the trade-off (T) per person tested and of the ratio (R, always > 0) of the number of additional false positives per additional true positive found, with its 95% CI, as demonstrated by Macaskill [Macaskill et al., 2002], are given below.

Among the diseased, the probability of each possible pair of test results is given by $p^{+}_{jk} = \Pr(\text{single test} = j;\ \text{combined test} = k \mid D = +)$, where j and k represent the results of the single test and the combined test, respectively (j, k = - or +, with - = negative and + = positive). The probabilities corresponding to each pair of test results among the non-diseased are represented by $p^{-}_{jk} = \Pr(\text{single test} = j;\ \text{combined test} = k \mid D = -)$.

These two sets of probabilities are estimated from the cross-tabulation of the single and combined test results among the diseased and non-diseased. From these estimates, the joint probabilities $\pi^{+}_{jk}$ and $\pi^{-}_{jk}$ in Table 4.3 are then calculated.

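Table 4.3 Joint probabilities of pairs of test results for the single and the combined test among the diseased and non-diseased

Diseased (D+)
Single test      Combined test +                                Combined test -
+                $\pi^{+}_{++}$ [= $p^{+}_{++} + p^{+}_{+-}$]   0
-                $\pi^{+}_{-+}$ [= $p^{+}_{-+}$]                $\pi^{+}_{--}$ [= $p^{+}_{--}$]

Non-diseased (D-)
Single test      Combined test +                                Combined test -
+                $\pi^{-}_{++}$ [= $p^{-}_{++} + p^{-}_{+-}$]   0
-                $\pi^{-}_{-+}$ [= $p^{-}_{-+}$]                $\pi^{-}_{--}$ [= $p^{-}_{--}$]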

The trade-off, T, is given by

$$T = R\,\theta\,\pi^{+}_{-+} - (1 - \theta)\,\pi^{-}_{-+},$$

where θ is the prevalence of the disease in the population.

T = 0 indicates equivalence of the two tests; T > 0 implies that the combined test would be preferred; and T < 0 would lead to preference of the single test.

The critical value of R (R*), at which T = 0, is estimated as

$$R^{*} = \frac{(1 - \theta)\,\pi^{-}_{-+}}{\theta\,\pi^{+}_{-+}},$$

with the corresponding asymptotic standard error of ln(R*), SE(ln R*), obtained using the delta method and given by

$$\text{SE}(\ln R^{*}) = \left(\frac{1 - \pi^{-}_{-+}}{n_{D-}\,\pi^{-}_{-+}} + \frac{1 - \pi^{+}_{-+}}{n_{D+}\,\pi^{+}_{-+}}\right)^{1/2},$$

where $n_{D-}$ is the number of non-diseased and $n_{D+}$ is the number of diseased individuals.

By varying prevalence across a range of plausible values of the test accuracy parameters, one can assess whether the corresponding value of R* lies in an acceptable range. The choice of R* will depend on the added cost of the adjunct test and the utilities for treating a person with disease and treating a person without disease.

The gain in test performance was also evaluated using a simple graphical method (also using likelihood ratios, see Figure 4.1). [Biggerstaff, 2000] Figure 4.1a (4.1b) shows the accuracy of VIA (VILI) alone in Receiver Operating Characteristic (ROC) space, an alternative way to describe test accuracy. The rectangle in the upper right corner represents the area in which the sensitivity and specificity of a combined test must lie. The slope of the line from (0, 0) that passes through (1 - specificity, sensitivity) gives LR+ for the single test. Similarly, the slope of the line from (1, 1) that passes through (1 - specificity, sensitivity) gives LR- for the single test. These two lines divide the rectangle into three regions. The combined test would be preferred if its point (1 - specificity, sensitivity) falls in region c, and the single test would be preferred if the point falls in region s. In region t, a trade-off occurs and no clear choice can be made between the tests based purely on the likelihood ratios.
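The trade-off T, the critical ratio R* with the standard error of ln(R*), and the three regions of the graphical method can be sketched as below. Several points are assumptions of this sketch: the 95% CI is formed as a Wald interval on the log scale, the region rule uses point estimates only (the text additionally requires the CI of the likelihood ratio comparison to exclude 1), and all function names and input values are hypothetical.

```python
import math


def trade_off(r, theta, pi_pos, pi_neg):
    """T per person tested.

    pi_pos: joint probability (single -, combined +) among the diseased.
    pi_neg: joint probability (single -, combined +) among the non-diseased.
    """
    return r * theta * pi_pos - (1 - theta) * pi_neg


def critical_ratio(theta, pi_pos, pi_neg, n_diseased, n_nondiseased, z=1.96):
    """R* (the value of R at which T = 0), SE(ln R*) and a log-scale CI."""
    r_star = (1 - theta) * pi_neg / (theta * pi_pos)
    se = math.sqrt((1 - pi_neg) / (n_nondiseased * pi_neg)
                   + (1 - pi_pos) / (n_diseased * pi_pos))
    ci = (r_star * math.exp(-z * se), r_star * math.exp(z * se))
    return r_star, se, ci


def roc_region(sens_single, spec_single, sens_comb, spec_comb):
    """Region of the combined test's point: 'c', 's' or 't'."""
    lr_pos_single = sens_single / (1 - spec_single)
    lr_neg_single = (1 - sens_single) / spec_single
    lr_pos_comb = sens_comb / (1 - spec_comb)
    lr_neg_comb = (1 - sens_comb) / spec_comb
    if lr_pos_comb > lr_pos_single:
        return "c"  # above the line from (0, 0): combined test preferred
    if lr_neg_comb > lr_neg_single:
        return "s"  # below the line from (1, 1): single test preferred
    return "t"      # between the two lines: trade-off


# Hypothetical inputs, for illustration only.
r_star, se, ci = critical_ratio(theta=0.02, pi_pos=0.10, pi_neg=0.05,
                                n_diseased=200, n_nondiseased=9800)
print(f"R* = {r_star:.1f}, 95% CI {ci[0]:.1f} to {ci[1]:.1f}")
print(roc_region(0.70, 0.90, 0.85, 0.85))
```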

To evaluate the effect of visual inspection screening of the cervix on both cervical cancer incidence and mortality (Paper III)

Intention-to-treat analysis was used in which all eligible women in the clusters randomized were considered irrespective of their participation in the interview or screening. Multivariate analysis of cancer incidence and mortality endpoints was carried out using Cox proportional hazards regression, taking into account cluster design and adjusting for age, education, marital status and parity.

Figure 4.4 Sensitivity and specificity for a) VIA alone and combined test and b) VILI alone and combined test when disease outcome = CIN 2-3+.

Key: The slope of the line passing through coordinate (0,0) is equal to the positive likelihood ratio of the single test. Likewise, the slope of the line passing through coordinate (1,1) is equal to the negative likelihood ratio of the single test. VIA: Visual inspection with acetic acid. VILI: Visual inspection with Lugol's iodine. CIN 2-3+: Cervical intraepithelial neoplasia grades 2 and 3 and cancer.

Participation in screening and treatment, screen-positivity and stage distribution were calculated as proportions. For the calculation of incidence rates, the person-years of follow-up in both groups were calculated from the date of study entry of the woman to the date of diagnosis, death, migration or last follow-up visit, whichever came first. For mortality rates, the person-years of follow-up were calculated from the date of study entry of the woman to the date of death, migration or last follow-up visit. The earliest date of entry was January 2000 and the latest date of exit was December 2006.
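The person-years rule described above can be sketched as follows. The 365.25-day year and the expression of rates per 100,000 person-years are assumptions of this sketch, and the dates and counts are hypothetical.

```python
from datetime import date


def person_years(entry, diagnosis=None, death=None, migration=None,
                 last_visit=None):
    """Years from study entry to the earliest recorded exit event.

    For mortality rates, leave `diagnosis` as None so that follow-up
    continues until death, migration or the last follow-up visit.
    """
    exits = [d for d in (diagnosis, death, migration, last_visit)
             if d is not None]
    return (min(exits) - entry).days / 365.25


def rate_per_100k(events, total_person_years):
    """Events per 100,000 person-years (the scale is an assumption)."""
    return 100_000 * events / total_person_years


# Hypothetical woman: entered January 2000, diagnosed January 2003.
py_incidence = person_years(date(2000, 1, 1), diagnosis=date(2003, 1, 1),
                            last_visit=date(2006, 12, 31))
py_mortality = person_years(date(2000, 1, 1), last_visit=date(2006, 12, 31))
print(f"{py_incidence:.2f} person-years for incidence, "
      f"{py_mortality:.2f} for mortality")
```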

To evaluate the effect of visual inspection screening of the oral cavity on oral cancer mortality (Paper IV)

Intention-to-treat analysis was employed, and the analysis was carried out using the cluster as the unit of analysis to account for clustering. The comparison of rate ratios was performed using the heuristic 95% confidence interval (CI) of the rate ratios. [Bennett et al., 2002]

Participation in screening, screen positivity, compliance for referral, stage distribution and case fatality were calculated as proportions and survival was computed by Kaplan-Meier analysis. [Kaplan et al., 1958] For the calculation of incidence and mortality rates among all eligible women, the number of person-years in the intervention and control groups was calculated from the date of study entry of the individual to 31 December 2004 or death.
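For readers unfamiliar with the Kaplan-Meier product-limit estimator, a minimal sketch follows. The follow-up times and event indicators are hypothetical, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up durations; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in years; the second subject is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```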

To assess the effect of the major risk factors of cancer of the oral cavity (Paper V)

The effects of paan chewing, tobacco smoking or alcohol drinking on the risk of oral cancer were estimated with odds ratios (ORs) and their 95% confidence intervals (CIs), derived from conditional logistic regression analysis with adjustment for education, religion and the other two habits. Continuous variables such as years of chewing, smoking or drinking, and frequency of use were categorized by dividing the distributions among exposed controls into approximate tertiles. Trend tests for ordered variables were performed by assigning the score j to the jth exposure level of a categorical variable (where j = 1, 2, …) and treating it as a continuous predictor in conditional logistic regression. For the calculation of pack-years, the amount of tobacco was estimated as 1 gram per cigarette, 0.5 grams per bidi and 2 grams per other types. [Balaram et al., 2002; IARC, 1986]
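The tobacco weights, the tertile categorization among exposed controls and the trend scores can be sketched as below. The interpolation rule used for the cut points (the default of Python's `statistics.quantiles`) is an assumption, since the text only says "approximate tertiles"; the names and data are hypothetical.

```python
import statistics

# Grams of tobacco per unit, as stated in the text.
TOBACCO_GRAMS = {"cigarette": 1.0, "bidi": 0.5, "other": 2.0}


def daily_tobacco_grams(units_per_day):
    """Total grams of tobacco per day from counts of each product type."""
    return sum(TOBACCO_GRAMS[kind] * n for kind, n in units_per_day.items())


def tertile_cut_points(exposed_control_values):
    """Two cut points splitting exposed controls into approximate tertiles."""
    return statistics.quantiles(exposed_control_values, n=3)


def trend_score(value, cut_points):
    """Score j = 1, 2, 3 for the j-th exposure level."""
    return 1 + sum(value > cut for cut in cut_points)


# Hypothetical years of smoking among exposed controls.
cuts = tertile_cut_points([5, 8, 10, 12, 15, 20, 25, 30, 40])
print(cuts, trend_score(18, cuts))
print(daily_tobacco_grams({"cigarette": 10, "bidi": 4, "other": 1}))
```

The resulting score would then enter the conditional logistic regression as a continuous predictor, which is outside the scope of this sketch.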

Attributable fractions for each habit [Miettinen, 1974] and for a combination of habits [Bruzzi et al., 1985] were obtained using OR estimates from the conditional logistic regression models. OR estimates for a combination of two habits were obtained after adjusting for the third habit.
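The two attributable fraction formulas can be sketched as follows, using their standard forms: the case-based formula for a single exposure, and the sum over exposure levels for combined exposures. The function names, case proportions and ORs are hypothetical.

```python
def attributable_fraction(prop_cases_exposed, odds_ratio):
    """Case-based formula for one exposure: p_c * (OR - 1) / OR."""
    return prop_cases_exposed * (odds_ratio - 1) / odds_ratio


def attributable_fraction_levels(case_proportions, odds_ratios):
    """Formula for several exposure levels: 1 - sum(rho_j / OR_j).

    case_proportions: proportion of cases in each level, summing to 1
    and including the reference level, whose OR is 1.
    """
    return 1 - sum(rho / oratio
                   for rho, oratio in zip(case_proportions, odds_ratios))


# Hypothetical values: 60% of cases exposed, adjusted OR = 3.
print(attributable_fraction(0.6, 3.0))

# Hypothetical levels: neither habit, one habit, both habits.
print(attributable_fraction_levels([0.2, 0.5, 0.3], [1.0, 2.5, 6.0]))
```

With a single binary exposure the two formulas give the same value.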