Statistics
Receiver Operating Characteristic (ROC) curve
As discussed in section 3.9 (reproduced as additional information at the end of this document), the uncertainty demonstrated by observers in assessing XRays for dental decay can be used as the basis for constructing a Receiver Operating Characteristic (ROC) curve. An observer is asked to quantify his diagnosis of 30 extracted teeth for the presence of decay using the following five-point scale:
1. Decay definitely present
2. Decay probably present
3. Unsure
4. Decay probably absent
5. Decay definitely absent
Having been assessed, the teeth were sectioned and viewed microscopically to establish the gold standard. These are the results:
observer assessment gold standard
1 decay
4 no decay
5 no decay
2 decay
5 no decay
3 decay
1 decay
1 decay
3 decay
5 decay
4 no decay
2 decay
3 no decay
3 decay
1 decay
1 decay
5 no decay
5 no decay
2 decay
1 decay
5 no decay
5 no decay
4 no decay
5 decay
2 no decay
3 no decay
1 decay
1 decay
1 decay
5 no decay
Assignment questions / requirements
a. Draw an ROC curve for this data.
b. Calculate sensitivity, specificity, positive predictive value and negative predictive value using observer assessments of 1 and 2 to indicate decay and assessments of 3, 4 and 5 to indicate no decay.
c. Calculate sensitivity, specificity, positive predictive value and negative predictive value using observer assessments of 1, 2 and 3 to indicate decay and assessments of 4 and 5 to indicate no decay.
d. What do the results tell you?
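For questions b and c, the four measures all come from a 2 x 2 table of observer call against gold standard. The following is a minimal Python sketch of that arithmetic, not a model answer: the names `ratings`, `truth` and `diagnostic_measures` are illustrative assumptions, and the data are transcribed from the 30 results listed above ("D" = decay, "N" = no decay).

```python
# Observer ratings (1-5) and gold standard for the 30 teeth, in listed order.
ratings = [1, 4, 5, 2, 5, 3, 1, 1, 3, 5,
           4, 2, 3, 3, 1, 1, 5, 5, 2, 1,
           5, 5, 4, 5, 2, 3, 1, 1, 1, 5]
truth = "DNNDNDDDDD" + "NDNDDDNNDD" + "NNNDNNDDDN"


def diagnostic_measures(max_positive_rating):
    """Treat ratings <= max_positive_rating as a call of 'decay'."""
    tp = fp = fn = tn = 0
    for rating, status in zip(ratings, truth):
        called_decay = rating <= max_positive_rating
        has_decay = status == "D"
        if called_decay and has_decay:
            tp += 1          # decay called, decay present
        elif called_decay:
            fp += 1          # decay called, tooth sound
        elif has_decay:
            fn += 1          # decay missed
        else:
            tn += 1          # sound tooth correctly cleared
    return {
        "counts": (tp, fp, fn, tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
    }


for cutoff in (2, 3):  # question b uses ratings 1-2, question c uses 1-3
    print(cutoff, diagnostic_measures(cutoff))
```

Comparing the two cut-offs shows how moving the "Unsure" category from one side to the other trades sensitivity against specificity.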
ADDITIONAL SUPPORTIVE INFORMATION
3.9 RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE ANALYSIS
We have so far assumed that a diagnostic method yields a simple positive or negative result. However, this is frequently not the case. An observer assessing a radiograph for approximal enamel decay may not be completely certain that a radiolucency is detectable. This is true of many common diseases: a single test does not yield a clear diagnosis. Consider, for instance, the small artificial example in Table 7 (plotted in figure 1).
Group Measurement
1 7
1 10
1 13
1 14
1 16
1 17
1 18
1 20
1 21
1 25
2 17
2 19
2 20
2 22
2 23
2 24
2 25
2 27
2 29
2 33
Table 7. Example data to illustrate the ROC curve
In this example data we take group 2 as diseased cases and group 1 as normals. We want to use a measurement, called “measurement”, as our test variable to provide a diagnosis. We may select any value of measurement as our cut-point for distinguishing between diseased and not diseased; each choice of cut-point has its own sensitivity and specificity. For example, if we choose 20 for our test value and a measurement of greater than 20 is taken as indicating disease, we may summarise the results as in Table 8.
                      Disease
                      +ve   -ve
Our Diagnosis   +ve     7     2
                -ve     3     8
Table 8. Example data: disease v. diagnosis using cut-point of 20.
In this case the sensitivity of our test would be 7/10 = 0.7 and the specificity 8/10 = 0.8.
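The counts in Table 8 can be checked directly against the Table 7 data. The sketch below is a minimal Python illustration rather than part of the original material; the function name `two_by_two` and its `inclusive` flag are assumptions introduced here. Table 8's counts are obtained when a measurement strictly greater than 20 is counted as positive; if 20 itself is counted as positive, one further diseased case and one further normal case move into the +ve row.

```python
# Example data from Table 7 (group 1 = normal, group 2 = diseased).
normal = [7, 10, 13, 14, 16, 17, 18, 20, 21, 25]
diseased = [17, 19, 20, 22, 23, 24, 25, 27, 29, 33]


def two_by_two(cut, inclusive=False):
    """Return (TP, FP, FN, TN) when high measurements indicate disease.

    inclusive=False: positive if measurement > cut
    inclusive=True:  positive if measurement >= cut
    """
    def positive(x):
        return x >= cut if inclusive else x > cut

    tp = sum(positive(x) for x in diseased)   # diseased, called diseased
    fp = sum(positive(x) for x in normal)     # normal, called diseased
    fn = len(diseased) - tp                   # diseased, missed
    tn = len(normal) - fp                     # normal, correctly cleared
    return tp, fp, fn, tn


tp, fp, fn, tn = two_by_two(20)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(tp, fp, fn, tn, sensitivity, specificity)
```

Calling `two_by_two(17, inclusive=True)` gives the counts for a rule in which a measurement of 17 itself is classed as positive.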
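Figure 1 shows the overlap of the individual data points. Clearly, for our test to be 100% sensitive (detecting every occurrence of disease), the cut-point would need to be set so that a measurement of 17, the lowest value among the diseased cases, is still classed as positive. Table 9 summarises the results when a measurement of 17 or more is taken as indicating disease.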
                      Disease
                      +ve   -ve
Our Diagnosis   +ve    10     5
                -ve     0     5
Table 9. Example data: disease v. diagnosis using cut-point of 17.
The specificity in this case would be only 0.5; we would accept many cases as diseased that should be rejected.
Figure 2 shows the Receiver Operating Characteristic (ROC) curve for this dataset. It graphs sensitivity against 1 - specificity. You could use such a curve to determine the most suitable cut-point for your diagnostic test. In making the decision you would need to weigh the ‘costs’ of failing to diagnose a diseased case against those of diagnosing a disease-free case as diseased.
Test Result Variable(s): MEASUREMENT
Positive if Greater Than or Equal To   Sensitivity   1 – Specificity
6.00 1.000 1.000
8.50 1.000 .900
11.50 1.000 .800
13.50 1.000 .700
15.00 1.000 .600
16.50 1.000 .500
17.50 .900 .400
18.50 .900 .300
19.50 .800 .300
20.50 .700 .200
21.50 .700 .100
22.50 .600 .100
23.50 .500 .100
24.50 .400 .100
26.00 .300 .000
28.00 .200 .000
31.00 .100 .000
34.00 .000 .000
Table 10. Example data: Coordinates of ROC Curve.
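The coordinates in Table 10 can be regenerated from the Table 7 data. The following is a minimal Python sketch under one stated assumption: cut-points are placed midway between consecutive distinct observed values, with one extra cut-point 1 below the minimum and one 1 above the maximum. That choice reproduces the first column of Table 10. The function name `roc_coordinates` is illustrative.

```python
# Example data from Table 7 (group 1 = normal, group 2 = diseased).
normal = [7, 10, 13, 14, 16, 17, 18, 20, 21, 25]
diseased = [17, 19, 20, 22, 23, 24, 25, 27, 29, 33]


def roc_coordinates(diseased, normal):
    """Return rows of (cut-point, sensitivity, 1 - specificity).

    A case is positive if its measurement is >= the cut-point.
    """
    values = sorted(set(diseased) | set(normal))
    cuts = [values[0] - 1]                                   # below all data
    cuts += [(a + b) / 2 for a, b in zip(values, values[1:])]  # midpoints
    cuts.append(values[-1] + 1)                              # above all data

    rows = []
    for cut in cuts:
        sens = sum(x >= cut for x in diseased) / len(diseased)
        fpr = sum(x >= cut for x in normal) / len(normal)    # 1 - specificity
        rows.append((cut, sens, fpr))
    return rows


for cut, sens, fpr in roc_coordinates(diseased, normal):
    print(f"{cut:6.2f}  {sens:.3f}  {fpr:.3f}")
```

Plotting the third column on the horizontal axis against the second on the vertical axis gives the zigzag ROC curve for this dataset.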
The term Receiver Operating Characteristic curve comes from signal detection theory and was developed from work on radar. In addition to its use, as discussed here, as a measure of diagnostic performance, it is particularly useful for evaluating detectability in imaging systems, since it takes observer performance into account. The curve describes the compromise that must be made between the frequency of true positive (TP) and false positive (FP) diagnoses in determining the decision criterion or, to put it another way, what false positive rate has to be accepted in order to obtain the desired true positive rate. The chosen cut-off point is called the “operating position”. In figure 2 the bottom left-hand corner of the curve corresponds to a high threshold value based on a stringent decision criterion: relatively few TP diagnoses are made, but there are no FP ones. The top right-hand end of the curve, on the other hand, shows the result of a low threshold value with a more relaxed decision criterion: the number of TP diagnoses approaches perfection, but at the expense of a very large number of FP ones.
ROC curves are widely used to measure diagnostic accuracy. This is expressed as Az, the area under the curve. In figure 2 the green diagonal denotes random chance: Az is 0.5 and the observers are guessing. In contrast, the red zigzag line of our data indicates better accuracy; in fact, the Az value is 0.865.
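The Az value of 0.865 can be reproduced from the Table 7 data. The sketch below is a minimal Python illustration and uses a pairwise-comparison method, which is an assumption about how the figure was obtained rather than something stated in the notes: every diseased case is compared with every normal case, scoring 1 if the diseased measurement is higher and 0.5 if the two are tied. The mean score equals the trapezoidal area under the empirical ROC curve. The function name `auc_by_pairs` is illustrative.

```python
# Example data from Table 7 (group 1 = normal, group 2 = diseased).
normal = [7, 10, 13, 14, 16, 17, 18, 20, 21, 25]
diseased = [17, 19, 20, 22, 23, 24, 25, 27, 29, 33]


def auc_by_pairs(diseased, normal):
    """Area under the empirical ROC curve by pairwise comparison."""
    score = 0.0
    for d in diseased:
        for n in normal:
            if d > n:
                score += 1.0      # diseased case correctly ranked higher
            elif d == n:
                score += 0.5      # tie counts as half
    return score / (len(diseased) * len(normal))


print(auc_by_pairs(diseased, normal))  # prints 0.865
```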
ROC curves are used in practice to assess the accuracy of XRays for diagnosing decay. When XRays are used to assess approximal enamel lesions, the ROC curve lies just above the diagonal, indicating that observers are so inaccurate as to make the XRays worthless for this purpose. In contrast, accuracy for occlusal dentinal decay is much better (Az = 0.8), indicating that XRays are reasonably efficacious (Hintze et al. 1994).
The uncertainty demonstrated by observers in assessing XRays for dental decay can be exploited as the basis for constructing an ROC curve. The observer is asked to quantify his diagnosis on a five-point scale:
1. Decay definitely present.
2. Decay probably present.
3. Unsure.
4. Decay probably absent.
5. Decay definitely absent.
Alternatively, a six-point scale can be used, sub-dividing category 3 above into two: decay possibly present and decay possibly absent. The sensitivity and specificity are then calculated at each point on the scale, and the ROC curve may be plotted from these values.
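The calculation "at each point" amounts to moving the cut-off along the scale and cumulating counts. The following is a minimal Python sketch of that step. The tallies are counted from the 30-tooth data set at the start of this document, and the names `decay_counts`, `sound_counts` and `rating_roc_points` are illustrative assumptions.

```python
# Number of teeth in each rating category 1..5, split by gold standard.
decay_counts = [9, 3, 3, 0, 2]   # teeth with decay on sectioning (17)
sound_counts = [0, 1, 2, 3, 7]   # teeth without decay (13)


def rating_roc_points(pos_counts, neg_counts):
    """Return (1 - specificity, sensitivity) pairs for a rating scale.

    Categories are cumulated from the most confident 'decay present'
    rating downwards, so each point treats ratings up to and including
    that category as a positive call.
    """
    n_pos, n_neg = sum(pos_counts), sum(neg_counts)
    points = [(0.0, 0.0)]        # strictest criterion: nothing called positive
    tp = fp = 0
    for p, n in zip(pos_counts, neg_counts):
        tp += p
        fp += n
        points.append((fp / n_neg, tp / n_pos))
    return points


for fpr, sens in rating_roc_points(decay_counts, sound_counts):
    print(f"1 - specificity = {fpr:.3f}, sensitivity = {sens:.3f}")
```

Joining the points in order, from (0, 0) to (1, 1), gives the ROC curve for the observer. A six-point scale would simply add one more category to each list.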