
Ton J. Cleophas and Aeilko H. Zwinderman, SPSS for Starters and 2nd Levelers, 10.1007/978-3-319-20600-4_53

53. Validating Qualitative Diagnostic Tests (575 Patients)

Ton J. Cleophas^{1, 2} and Aeilko H. Zwinderman^{2, 3}

(1) Department Medicine, Albert Schweitzer Hospital, Dordrecht, The Netherlands

(2) European College Pharmaceutical Medicine, Lyon, France

(3) Department Biostatistics, Academic Medical Center, Amsterdam, The Netherlands

1 General Purpose

Clinical trials of disease management require accurate tests for making a diagnosis and for patient follow-up. Whatever the test (screening, laboratory, or physical), the investigators involved need to know how good it is. The goodness of a diagnostic test is a complex question that is usually assessed according to three criteria: (1) its reproducibility, (2) its precision, and (3) its validity. Reproducibility is synonymous with reliability and is generally assessed by the size of the differences between duplicate measures. Precision is synonymous with the spread in the test results and can be estimated, e.g., by standard deviations or standard errors. Validity is synonymous with accuracy and can be defined as a test’s ability to show which individuals have the disease in question and which do not. Unlike the first two criteria, the third is hard to quantify, because it is generally assessed by two estimators rather than one, namely sensitivity and specificity, defined as the chance of a true positive test in patients with the disease and the chance of a true negative test in patients without it, respectively.

2 Schematic Overview of Type of Data File

3 Primary Scientific Question

Is some lab score an accurate predictor of the presence of a disease?

4 Data Example

The primary scientific question of the data file was: is the vascular lab score test shown below accurate for demonstrating the presence of peripheral vascular disease? Which cutoff score provides the best sensitivity/specificity? The first ten patients of the data file are given underneath.

presence peripheral vascular disease (0 = no, 1 = yes)	vascular lab score
,00	1,00
,00	2,00
,00	2,00
,00	3,00
,00	3,00
,00	3,00
,00	4,00
,00	4,00
,00	4,00
,00	4,00

The entire data file is in extras.springer.com, and is entitled “chapter53validatingqualit”. First, we will try and make a graph of the data.

5 Drawing Histograms

Command:

Graphs....Legacy Dialogs....Histogram....Variable: score....Rows: disease....click OK.

The above histograms summarize the data. The upper graph shows the frequencies of the various scores of all patients with vascular disease as confirmed by angiograms, the lower graph those of the patients without. The scores of the diseased patients are generally much larger, but there is also a considerable overlap. The overlap can be expressed by sensitivity (number of true positive patients / number of all patients with the disease) and specificity (number of true negative patients / number of all patients without the disease). The magnitude of the sensitivity and specificity depends on the cutoff level used for defining patients as positive or negative. Sensitivities and specificities continually change as we move the cutoff level along the x-axis. A ROC (receiver operating characteristic) curve summarizes all sensitivities and specificities obtained by this action. With the help of the ROC curve, the best cutoff for optimal diagnostic accuracy of the test is found.
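The dependence of sensitivity and specificity on the cutoff can be illustrated outside SPSS with a minimal Python sketch. The function name `sensitivity_specificity` and the eight toy patients are assumptions made for illustration only; they are not the chapter's 575 patients. A score at or above the cutoff is counted as a positive test, in line with the "positive if greater than or equal to" convention of the SPSS output.

```python
def sensitivity_specificity(disease, score, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive test."""
    tp = fn = tn = fp = 0
    for d, s in zip(disease, score):
        positive = s >= cutoff
        if d == 1:          # patient has the disease
            if positive:
                tp += 1     # true positive
            else:
                fn += 1     # false negative
        else:               # patient does not have the disease
            if positive:
                fp += 1     # false positive
            else:
                tn += 1     # true negative
    sensitivity = tp / (tp + fn)   # true positives / all diseased
    specificity = tn / (tn + fp)   # true negatives / all non-diseased
    return sensitivity, specificity


# Hypothetical toy data (0 = no disease, 1 = disease), not the chapter's file.
disease = [0, 0, 0, 0, 1, 1, 1, 1]
score = [3, 8, 12, 20, 15, 19, 24, 30]

sens, spec = sensitivity_specificity(disease, score, cutoff=18.5)
print(sens, spec)
```

Raising the cutoff in this sketch lowers the sensitivity and raises the specificity, and lowering it does the opposite, which is the trade-off the ROC curve displays.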

6 Validating the Qualitative Diagnostic Test

For analysis the SPSS module ROC Curve is required.

Command:

Analyze....ROC Curve....Test Variable: score....State Variable: disease....Value of State Variable: 1....mark: ROC Curve....mark: With diagonal reference line....mark: Coordinate points of the ROC Curve....click OK.

The best cutoff value is the place on the curve with the shortest distance to the top of the y-axis, where both sensitivity and specificity equal 1 (100 %), that is, where 1-specificity equals 0. The place is found by adding up sensitivities and specificities. The coordinates of the curve are given in the table below.

Coordinates of the curve

Test result variable(s): score

Positive if greater than or equal to^a	Sensitivity	1-Specificity
,0000	1,000	1,000
1,5000	1,000	,996
2,5000	1,000	,989
3,5000	1,000	,978
4,5000	1,000	,959
5,5000	1,000	,929
6,5000	1,000	,884
7,5000	1,000	,835
8,5000	1,000	,768
9,5000	1,000	,697
10,5000	1,000	,622
11,5000	1,000	,543
12,5000	1,000	,464
13,5000	1,000	,382
14,5000	1,000	,307
15,5000	,994	,240
16,5000	,984	,172
17,5000	,971	,116
18,5000	,951	,071
19,5000	,925	,049
20,5000	,893	,030
21,5000	,847	,019
22,5000	,789	,007
23,5000	,724	,000
24,5000	,649	,000
25,5000	,578	,000
26,5000	,500	,000
27,5000	,429	,000
28,5000	,354	,000
29,5000	,282	,000
30,5000	,214	,000
31,5000	,153	,000
32,5000	,101	,000
33,5000	,062	,000
34,5000	,036	,000
35,5000	,019	,000
36,5000	,010	,000
37,5000	,003	,000
39,0000	,000	,000

The test result variable(s): score has at least one tie between the positive actual state group and the negative actual state group.

^aThe smallest cutoff value is the minimum observed test value minus 1, and the largest cutoff value is the maximum observed test value plus 1. All the other cutoff values are the averages of two consecutive ordered observed test values
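The cutoff convention described in the footnote (minimum minus 1, averages of consecutive observed values, maximum plus 1) can be reproduced with a short Python sketch. This is an illustration under assumptions, not SPSS's own implementation: the function name `roc_coordinates` and the six toy patients are hypothetical.

```python
def roc_coordinates(disease, score):
    """Return (cutoff, sensitivity, 1-specificity) rows, positive if score >= cutoff."""
    values = sorted(set(score))
    # Smallest cutoff: minimum observed value minus 1.
    cutoffs = [values[0] - 1]
    # Middle cutoffs: averages of two consecutive ordered observed values.
    cutoffs += [(a + b) / 2 for a, b in zip(values, values[1:])]
    # Largest cutoff: maximum observed value plus 1.
    cutoffs.append(values[-1] + 1)

    n_pos = sum(1 for d in disease if d == 1)
    n_neg = len(disease) - n_pos
    rows = []
    for c in cutoffs:
        tp = sum(1 for d, s in zip(disease, score) if d == 1 and s >= c)
        fp = sum(1 for d, s in zip(disease, score) if d != 1 and s >= c)
        rows.append((c, tp / n_pos, fp / n_neg))
    return rows


# Hypothetical toy data (0 = no disease, 1 = disease), not the chapter's file.
disease = [0, 0, 1, 0, 1, 1]
score = [1, 2, 3, 3, 4, 5]

for cutoff, sens, one_minus_spec in roc_coordinates(disease, score):
    print(f"{cutoff:6.1f}  {sens:.3f}  {one_minus_spec:.3f}")
```

As in the SPSS table, the first row always has sensitivity and 1-specificity equal to 1, and the last row has both equal to 0.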

Adding up sensitivity and specificity for each cutoff in the coordinates table gives the largest sums around a cutoff of 18.5, as summarized in the table below.

Sensitivity	1-specificity	sensitivity − (1-specificity) (= sensitivity + specificity-1)
0.971	0.116	0.855
0.951	0.071	0.880
0.925	0.049	0.876

At a sensitivity of 0.951 and a “1-specificity” (= proportion of false positives) of 0.071, the best add-up sum is found (sensitivity + specificity = 1.880). Looking back at the first column of the coordinates table above, a score ≥ 18.5 is the best cutoff, which means that calling a score of 19 or more positive produces the smallest sum of the false positive and false negative proportions.
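The search for the best add-up sum can be checked with a minimal Python sketch. The rows are copied from the coordinates table above (cutoffs 15.5 to 22.5, with decimal points instead of commas); the variable and function names are assumptions made for illustration. The sketch also checks the distance to the top of the y-axis, which is a slightly different criterion from the add-up sum but, for these rows, points to the same cutoff.

```python
import math

# (cutoff, sensitivity, 1-specificity), copied from the coordinates table.
coords = [
    (15.5, 0.994, 0.240),
    (16.5, 0.984, 0.172),
    (17.5, 0.971, 0.116),
    (18.5, 0.951, 0.071),
    (19.5, 0.925, 0.049),
    (20.5, 0.893, 0.030),
    (21.5, 0.847, 0.019),
    (22.5, 0.789, 0.007),
]


def add_up_minus_one(row):
    """sensitivity - (1-specificity) = sensitivity + specificity - 1."""
    _, sens, one_minus_spec = row
    return sens - one_minus_spec


def distance_to_corner(row):
    """Distance from the point to sensitivity = 1, 1-specificity = 0."""
    _, sens, one_minus_spec = row
    return math.hypot(1 - sens, one_minus_spec)


best_by_sum = max(coords, key=add_up_minus_one)
best_by_distance = min(coords, key=distance_to_corner)

print("best cutoff by add-up sum:", best_by_sum[0])
print("best cutoff by distance:  ", best_by_distance[0])
```

Both criteria select the cutoff of 18.5 here; in other data sets they can disagree, so it is worth stating which one was used.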

7 Conclusion

Clinical trials of disease management require accurate tests for making a diagnosis and for patient follow-up. Accuracy of qualitative diagnostic tests is assessed with two estimators, sensitivity and specificity. ROC curves are convenient for summarizing the data and for finding the best cutoff value for your data. A problem is that sensitivity and specificity are strongly dependent on one another. If one is high, the other is, as a rule, low.

8 Note

More background, theoretical, and mathematical information on the validation of qualitative diagnostic tests is given in Statistics Applied to Clinical Studies, 5th edition, Chaps. 50 and 51, Springer Heidelberg Germany, 2012, from the same authors.
