Ton J. Cleophas and Aeilko H. Zwinderman
Statistical Analysis of Clinical Data on a Pocket Calculator
Statistics on a Pocket Calculator
DOI 10.1007/978-94-007-1211-9_20
© Springer Science+Business Media B.V. 2011

20. Kappas for Reliability Assessment of Binary Data

Ton J. Cleophas (1, 2) and Aeilko H. Zwinderman (2, 3)
(1) Department of Medicine, Albert Schweitzer Hospital, Dordrecht, The Netherlands
(2) European College of Pharmaceutical Medicine, Lyon, France
(3) Department of Epidemiology and Biostatistics, Academic Medical Center, Amsterdam, The Netherlands
Corresponding author: Ton J. Cleophas
Abstract
The reproducibility of continuous data can be estimated with duplicate standard deviations (Chap. 19). With binary data Cohen’s kappas are used for the purpose. Reliability assessment of diagnostic procedures is an important part of the validity assessment of scientific research.
Example
Positive (pos) or negative (neg) laboratory test results of 30 patients are assessed. All patients are tested a second time in order to estimate the level of reproducibility of the test.
                      1st time
                      pos     neg     Total
2nd time    pos        10       5       15
            neg         4      11       15
Total                  14      16       30
If the test is not reproducible at all, we expect to find the same result twice in 50% of the patients, and a different result the second time in the other 50% of the patients.
Overall, 30 tests have been carried out twice. We observe 10 times twice positive and 11 times twice negative. Thus, the same result is found twice in 21 patients, which is considerably more than in half of the cases, which would have been 15 times.
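The counting above can be reproduced with a short Python sketch. The variable names are illustrative assumptions, not part of the original text, and the 50% baseline is simply the one stated in this chapter for a test that is not reproducible at all.

```python
# 2x2 table from the example: keys are (2nd test result, 1st test result).
# All names here are illustrative, not taken from the chapter.
table = {
    ("pos", "pos"): 10,
    ("pos", "neg"): 5,
    ("neg", "pos"): 4,
    ("neg", "neg"): 11,
}

# Total number of patients tested twice.
total = sum(table.values())

# Patients with the same result both times (twice positive or twice negative).
observed_same = table[("pos", "pos")] + table[("neg", "neg")]

# Number of identical pairs expected if the test were not reproducible at all,
# using the chapter's 50% baseline.
expected_same_if_random = total / 2

print(total, observed_same, expected_same_if_random)
```

The diagonal cells of the table carry the identical pairs, and the off-diagonal cells (5 and 4) are the patients whose result changed on retesting.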
Minimal indicates the number of identical duplicate observations if reproducibility were zero (here 15); maximal indicates the number of identical duplicate observations if reproducibility were 100% (here 30).
 $$\begin{array}{rl}{\rm Kappa} &= \frac{\hbox{observed} - \hbox{minimal}}{\hbox{maximal} - \hbox{minimal}}\\ &= \frac{21 - 15}{30 - 15}\\ &= 0.4\end{array}$$
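As a minimal sketch, the kappa calculation can be written in Python as follows. The function names `pocket_kappa` and `chance_agreement` are hypothetical and chosen for illustration only. The second function is an addition not described in this chapter: it computes the chance-expected number of identical pairs from the row and column totals, as is usual for Cohen's kappa. For the table in this example it gives 15, the same value as the 50% baseline used above.

```python
def pocket_kappa(observed, minimal, maximal):
    """Kappa as defined in this chapter: (observed - minimal) / (maximal - minimal)."""
    return (observed - minimal) / (maximal - minimal)


def chance_agreement(a, b, c, d):
    """Chance-expected number of identical pairs for the 2x2 table [[a, b], [c, d]].

    Assumption: computed from the marginal totals, which is not spelled out
    in the chapter itself.
    """
    n = a + b + c + d
    return ((a + b) * (a + c) + (c + d) * (b + d)) / n


# Cell counts from the example (rows: 2nd test pos/neg, columns: 1st test pos/neg).
a, b, c, d = 10, 5, 4, 11

observed = a + d                         # identical results: 21
maximal = a + b + c + d                  # all patients: 30
minimal = chance_agreement(a, b, c, d)   # 15.0 for this table

kappa = pocket_kappa(observed, minimal, maximal)
print(round(kappa, 2))
```

With tables whose row totals are not split evenly, the marginal-based minimal value would generally differ from half the number of patients, so the two approaches agree here only because of the 15/15 split.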
A kappa value of 0.0 means that reproducibility is very poor.
A kappa value of 1.0 would have meant excellent reproducibility.
In our example we observed a kappa value of 0.4, which means that reproducibility is moderate.