1 General Purpose
Clinical trials of disease management
require accurate tests for making a diagnosis and for patient follow-up.
Whatever the test, whether screening, laboratory, or physical, the
investigators involved need to know how good it is. The quality of a
diagnostic test is a complex question that is usually assessed according
to three criteria: (1) its reproducibility, (2) its precision, and (3) its
validity. Reproducibility is synonymous with reliability, and is
generally assessed by the size of the differences between duplicate
measures. Precision of a test is synonymous with the spread in the
test results, and can be estimated, e.g., by standard deviations or
standard errors. Validity is synonymous with accuracy, and can be
defined as a test’s ability to show which individuals have the
disease in question and which do not. Unlike the first two
criteria, the third is hard to quantify, because it is
generally assessed by two estimators rather than one, namely
sensitivity and specificity, defined as the chance of a positive
test in patients with the disease (true positive) and the chance of a
negative test in patients without it (true negative), respectively.
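The first two criteria can be illustrated with a minimal sketch. The duplicate measurements below are hypothetical values chosen for illustration, not data from this chapter, and the mean absolute difference is only one of several possible summaries of reproducibility.

```python
import math
import statistics

# Hypothetical duplicate measurements of one test in five patients
# (illustrative values only, not taken from the chapter's data file).
first_measure = [10.0, 12.0, 15.0, 11.0, 14.0]
second_measure = [11.0, 12.0, 14.0, 13.0, 14.0]

# Reproducibility: size of the differences between duplicate measures.
differences = [a - b for a, b in zip(first_measure, second_measure)]
mean_abs_difference = statistics.mean(abs(d) for d in differences)

# Precision: spread in the test results.
sd = statistics.stdev(first_measure)      # sample standard deviation
se = sd / math.sqrt(len(first_measure))   # standard error of the mean

print(f"mean absolute difference between duplicates: {mean_abs_difference:.2f}")
print(f"standard deviation: {sd:.2f}, standard error: {se:.2f}")
```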
2 Schematic Overview of Type of Data File

3 Primary Scientific Question
Is some lab score an accurate predictor
of the presence of a disease?
4 Data Example
The primary scientific question of the
data file was: is the vascular lab score test shown below accurate
for demonstrating the presence of peripheral vascular disease, and
which cutoff score provides the best sensitivity and specificity?
presence peripheral vascular disease (0 = no, 1 = yes) | vascular lab score
---|---
,00 | 1,00
,00 | 2,00
,00 | 2,00
,00 | 3,00
,00 | 3,00
,00 | 3,00
,00 | 4,00
,00 | 4,00
,00 | 4,00
,00 | 4,00
The entire data file is available at
extras.springer.com, and is entitled “chapter53validatingqualit”.
First, we will make a graph of the data.
5 Drawing Histograms
Command:
- Analyze....Graphs....Legacy Dialogs....Histogram....Variable: score....Rows: disease....click OK.
The histograms summarize the
data. The upper graph shows the frequencies of the various scores of
all patients with vascular disease as confirmed by angiograms, the
lower graph those of the patients without. The scores of the diseased
patients are generally much larger, but there is also a
considerable overlap. The overlap can be expressed by sensitivity
(number of true positive patients / number of all patients with the
disease, i.e., true positives + false negatives) and
specificity (number of true negative patients / number of all patients
without the disease, i.e., true negatives + false positives). The
magnitude of the sensitivity and
specificity depends on the cutoff level used for defining patients as
positive or negative. Sensitivities and specificities continually
change as we move the cutoff level along the x-axis. A ROC
(receiver operating characteristic) curve summarizes all
sensitivities and specificities obtained by this action. With the help
of the ROC curve the best cutoff for optimal diagnostic accuracy of
the test is found.
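How sensitivity and specificity shift with the cutoff can be sketched in a few lines of Python. The scores and disease labels below are hypothetical, and the function name `sensitivity_specificity` is an illustrative choice; a score at or above the cutoff is assumed to count as a positive test.

```python
def sensitivity_specificity(scores, diseased, cutoff):
    """Return (sensitivity, specificity); score >= cutoff counts as positive."""
    pairs = list(zip(scores, diseased))
    tp = sum(1 for s, d in pairs if d == 1 and s >= cutoff)  # true positives
    fn = sum(1 for s, d in pairs if d == 1 and s < cutoff)   # false negatives
    tn = sum(1 for s, d in pairs if d == 0 and s < cutoff)   # true negatives
    fp = sum(1 for s, d in pairs if d == 0 and s >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 0 = no disease, 1 = disease (not the chapter's data file).
scores = [3, 8, 12, 15, 17, 14, 18, 22, 25, 30]
diseased = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Moving the cutoff upward trades sensitivity for specificity.
for cutoff in (10, 15, 20):
    sens, spec = sensitivity_specificity(scores, diseased, cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data set a low cutoff catches every diseased patient but mislabels most healthy ones, and a high cutoff does the reverse.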
6 Validating the Qualitative Diagnostic Test
For analysis the SPSS module ROC Curve
is required.
Command:
- Graphs....ROC Curve....Test Variable: score....State Variable: disease....Value of State Variable: 1....mark: ROC Curve....mark: With diagonal reference line....mark: Coordinate points of ROC Curve....click OK.
The best cutoff value is the place on the curve with the
shortest distance to the top of the y-axis, where sensitivity equals 1
(100 %) and 1-specificity equals 0, i.e., where both sensitivity and
specificity equal 1 (100 %). The place is found by adding up
sensitivities and specificities as summarized in the table
below.
Coordinates of the curve
Test result variable(s): score
Positive if greater than or equal to | Sensitivity | 1-Specificity
---|---|---
,0000 | 1,000 | 1,000
1,5000 | 1,000 | ,996
2,5000 | 1,000 | ,989
3,5000 | 1,000 | ,978
4,5000 | 1,000 | ,959
5,5000 | 1,000 | ,929
6,5000 | 1,000 | ,884
7,5000 | 1,000 | ,835
8,5000 | 1,000 | ,768
9,5000 | 1,000 | ,697
10,5000 | 1,000 | ,622
11,5000 | 1,000 | ,543
12,5000 | 1,000 | ,464
13,5000 | 1,000 | ,382
14,5000 | 1,000 | ,307
15,5000 | ,994 | ,240
16,5000 | ,984 | ,172
17,5000 | ,971 | ,116
18,5000 | ,951 | ,071
19,5000 | ,925 | ,049
20,5000 | ,893 | ,030
21,5000 | ,847 | ,019
22,5000 | ,789 | ,007
23,5000 | ,724 | ,000
24,5000 | ,649 | ,000
25,5000 | ,578 | ,000
26,5000 | ,500 | ,000
27,5000 | ,429 | ,000
28,5000 | ,354 | ,000
29,5000 | ,282 | ,000
30,5000 | ,214 | ,000
31,5000 | ,153 | ,000
32,5000 | ,101 | ,000
33,5000 | ,062 | ,000
34,5000 | ,036 | ,000
35,5000 | ,019 | ,000
36,5000 | ,010 | ,000
37,5000 | ,003 | ,000
39,0000 | ,000 | ,000
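Coordinates of this kind could be computed outside SPSS along the following lines. This is a sketch under stated assumptions, not SPSS's own code: the cutoffs are taken as the midpoints between consecutive observed scores, with the smallest observed score minus 1 and the largest plus 1 at the two ends, which matches the pattern of the cutoffs listed above. The function name `roc_coordinates` and the small data set are hypothetical.

```python
def roc_coordinates(scores, diseased):
    """Return a list of (cutoff, sensitivity, 1 - specificity) rows."""
    values = sorted(set(scores))
    # Assumed cutoff rule: min - 1, midpoints between observed values, max + 1.
    cutoffs = [values[0] - 1]
    cutoffs += [(a + b) / 2 for a, b in zip(values, values[1:])]
    cutoffs.append(values[-1] + 1)

    n_diseased = sum(diseased)
    n_healthy = len(diseased) - n_diseased
    rows = []
    for c in cutoffs:
        # A score greater than or equal to the cutoff counts as positive.
        tp = sum(1 for s, d in zip(scores, diseased) if d == 1 and s >= c)
        fp = sum(1 for s, d in zip(scores, diseased) if d == 0 and s >= c)
        rows.append((c, tp / n_diseased, fp / n_healthy))
    return rows


# Hypothetical data: 0 = no disease, 1 = disease (not the chapter's data file).
scores = [1, 2, 2, 3, 4, 3, 4, 5, 5, 6]
diseased = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

for cutoff, sens, one_minus_spec in roc_coordinates(scores, diseased):
    print(f"{cutoff:5.1f}  {sens:.3f}  {one_minus_spec:.3f}")
```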
The best cutoff value is the place on the curve with the
shortest distance to the top of the y-axis, where both sensitivity and
specificity equal 1 (100 %). It is found by adding up
sensitivities and specificities; the rows of the coordinates table
around the largest sum are summarized in the table
below.
Sensitivity | 1-specificity | sensitivity − (1-specificity) (= sensitivity + specificity − 1)
---|---|---
0.971 | 0.116 | 0.855
0.951 | 0.071 | 0.880
0.925 | 0.049 | 0.876
At a sensitivity of 0.951 and a
“1-specificity” (= proportion of false positives) of 0.071 the best
add-up sum is found: sensitivity + specificity = 0.951 + 0.929 = 1.880,
i.e., 0.880 in the last column of the table. Looking back at the first
column of the coordinates table, this corresponds to the cutoff 18.5,
which means that counting a score of 19 or more as positive produces
the smallest sum of false positive and false negative rates.
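The same selection can be checked with a short sketch. The three rows are copied from the tables above; the helper name `add_up_score` is an illustrative choice, and the quantity it returns (sensitivity + specificity − 1) is commonly known as the Youden index.

```python
# (cutoff, sensitivity, 1 - specificity), copied from the coordinates table.
rows = [
    (17.5, 0.971, 0.116),
    (18.5, 0.951, 0.071),
    (19.5, 0.925, 0.049),
]


def add_up_score(row):
    """sensitivity - (1 - specificity), i.e. sensitivity + specificity - 1."""
    _, sensitivity, one_minus_specificity = row
    return sensitivity - one_minus_specificity


# The best cutoff is the row with the largest add-up score.
best = max(rows, key=add_up_score)
print(f"best cutoff: {best[0]}, sensitivity + specificity - 1 = {add_up_score(best):.3f}")
```

Applied to the full coordinates table instead of three rows, the same `max` call would search every cutoff at once.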
7 Conclusion
Clinical trials of disease management
require accurate tests for making a diagnosis and for patient
follow-up. Accuracy of qualitative diagnostic tests is assessed
with two estimators, sensitivity and specificity. ROC curves are
convenient for summarizing the data, and for finding the best fit
cutoff values for your data. A problem is that sensitivity and
specificity are strongly dependent on one another: both change with
the cutoff, and if one is high, the other is, as a rule, low.
8 Note
More background, theoretical, and
mathematical information on the validation of qualitative data is given
in Statistics Applied to Clinical Studies, 5th edition, Chaps. 50
and 51, Springer Heidelberg Germany, 2012, from the same
authors.