1 General Purpose
A proper assessment of seasonality
requires information from a second year of observation, and
information not only from, e.g., the months of January and July, but
also from adjacent months. Autocorrelation includes all of this
information in a single test, and can, thus, demonstrate seasonality
unequivocally.
The above graph gives a simulated
seasonal pattern of C-reactive protein levels in a healthy subject.
Lagcurves (dotted) are partial copies of the datacurve moved to the
left as indicated by the arrows.
- First-row graphs: the datacurve and the lagcurve have largely simultaneous positive and negative departures from the mean, and, thus, have a strong positive correlation with one another (correlation coefficient ≈ +0.6).
- Second-row graphs: this lagcurve no longer has much correlation with the datacurve (correlation coefficient ≈ 0.0).
- Third-row graphs: this lagcurve has a strong negative correlation with the datacurve (correlation coefficient ≈ −1.0).
- Fourth-row graphs: this lagcurve has a strong positive correlation with the datacurve (correlation coefficient ≈ +1.0).
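The relation between a datacurve and its lagcurves can be illustrated with a minimal sketch. It assumes a hypothetical noise-free curve with a 12-month period (the values 1.7 and 0.25 are illustrative only, not the simulated data of the graph), and the helper name lag_correlation is likewise an assumption.

```python
import numpy as np

def lag_correlation(series, lag):
    """Pearson correlation between the datacurve and a copy shifted by `lag` points."""
    series = np.asarray(series, dtype=float)
    if lag == 0:
        return 1.0
    # Pair each value x_i with the value `lag` positions later, x_(i+lag).
    return float(np.corrcoef(series[:-lag], series[lag:])[0, 1])

# Hypothetical seasonal curve: 24 monthly values, one full cycle per 12 months.
months = np.arange(24)
datacurve = 1.7 + 0.25 * np.cos(2 * np.pi * months / 12)

for lag in (3, 6, 12):
    print(f"lag {lag:2d} months: r = {lag_correlation(datacurve, lag):+.2f}")
```

For such an idealized curve a quarter-period lag gives a coefficient near zero, a half-period lag a coefficient close to −1, and a full-period lag a coefficient close to +1. Real data contain noise, and their coefficients will be smaller in magnitude.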
2 Schematic Overview of Type of Data File
3 Primary Scientific Question
Do repeatedly measured outcome values
follow a seasonal pattern?
4 Data Example
Primary question: do repeatedly
measured CRP values in a healthy subject follow a seasonal pattern?
If the datacurve values are averages with their se (standard
error), then $x_i$ will change into ($x_i$ + se),
and $x_{i+1}$ into ($x_{i+1}$ + se). This is no
problem, since the se-values even out in the regression
equation, and the magnitude of the autocorrelation
coefficient remains unchanged, irrespective of the magnitude of
the se. The se-values, therefore, need not be taken into account
in the autocorrelation of time series with means, unless they are
very large. A data file is given below.
| Average C-reactive protein in group of healthy subjects (mg/l) | Month |
|---|---|
| 1,98 | 1 |
| 1,97 | 2 |
| 1,83 | 3 |
| 1,75 | 4 |
| 1,59 | 5 |
| 1,54 | 6 |
| 1,48 | 7 |
| 1,54 | 8 |
| 1,59 | 9 |
| 1,87 | 10 |
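The remark that the se-values even out can be checked with a small sketch on the ten tabulated values. It assumes that the same se is added to every monthly value, which is how the expressions ($x_i$ + se) and ($x_{i+1}$ + se) read. The se of 0.05 is hypothetical, and the helper name lag1_correlation is an assumption.

```python
import numpy as np

# First ten monthly CRP averages (mg/l) from the table above.
crp = np.array([1.98, 1.97, 1.83, 1.75, 1.59, 1.54, 1.48, 1.54, 1.59, 1.87])

def lag1_correlation(values):
    """Pearson correlation between x_i and x_(i+1)."""
    return float(np.corrcoef(values[:-1], values[1:])[0, 1])

se = 0.05  # hypothetical standard error, chosen only for illustration

r_plain = lag1_correlation(crp)          # autocorrelation of the raw averages
r_shifted = lag1_correlation(crp + se)   # every value replaced by (x_i + se)
print(f"without se: {r_plain:.4f}, with se added: {r_shifted:.4f}")
```

A correlation coefficient is computed from departures from the mean, so a constant added to all values cancels out and the two coefficients coincide.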
The entire data file is available at
extras.springer.com, and is entitled “chapter58seasonality”. Start
by opening the data file in SPSS. We will first make a
graph of the data.
5 Graphs of Data
Command:
- Graphs….Chart Builder….click Scatter/Dot….click mean C-reactive protein level and drag to the Y-Axis….click time and drag to the X-Axis….click OK….double-click in Chart Editor….click Interpolation Line….Properties: click Straight Line.
The above graph shows that the average
monthly C-reactive protein levels look inconsistent. Therefore, a graph of
bi-monthly averages is drawn. These data are already in the above
data file.
| Average C-reactive protein in group of healthy subjects (mg/l) | Month |
|---|---|
| 1,90 | 2,00 |
| 1,87 | 4,00 |
| 1,56 | 6,00 |
| 1,67 | 8,00 |
| 1,73 | 10,00 |
| 1,84 | 12,00 |
| 1,89 | 14,00 |
| 1,84 | 16,00 |
| 1,61 | 18,00 |
| 1,67 | 20,00 |
| 1,67 | 22,00 |
| 1,90 | 24,00 |
Command:
- Graphs….Chart Builder….click Scatter/Dot….click mean C-reactive protein level and drag to the Y-Axis….click time and drag to the X-Axis….click OK….double-click in Chart Editor….click Interpolation Line….Properties: click Straight Line.
The above bi-monthly graph shows a
fairly seasonal pattern. Autocorrelation is, subsequently, used to
test whether the seasonality in these data is statistically
significant. SPSS Statistical Software is used.
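For readers without SPSS, the idea can be sketched on the twelve bi-monthly averages tabulated above. The sketch uses the standard sample autocorrelation estimator; it is not claimed to reproduce the SPSS output, and the function name autocorrelations is an assumption.

```python
import numpy as np

# Bi-monthly CRP averages (mg/l) from the table above, months 2 to 24.
crp_bimonthly = np.array([1.90, 1.87, 1.56, 1.67, 1.73, 1.84,
                          1.89, 1.84, 1.61, 1.67, 1.67, 1.90])

def autocorrelations(values, max_lag):
    """Sample autocorrelations r_k = sum(d_t * d_(t+k)) / sum(d_t ** 2),
    where d_t are the departures from the overall mean."""
    d = np.asarray(values, dtype=float)
    d = d - d.mean()
    denom = np.sum(d ** 2)
    n = len(d)
    return np.array([np.sum(d[:n - k] * d[k:]) / denom
                     for k in range(max_lag + 1)])

acf = autocorrelations(crp_bimonthly, max_lag=6)
for k, r in enumerate(acf):
    # One lag step corresponds to two months in this bi-monthly series.
    print(f"lag {2 * k:2d} months: r = {r:+.2f}")
```

With these twelve values the coefficient at a lag of six months is negative and the coefficient at a lag of twelve months is positive, as expected for a yearly cycle. With so few observations the estimates are imprecise.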
6 Assessing Seasonality with Autocorrelations
For the analysis, the statistical model
Autocorrelations in the module Forecasting is required.
Command:
- Analyze….Forecasting….Autocorrelations….move monthly percentages into Variable Box….mark Autocorrelations….mark Partial Autocorrelations….OK.
The above graph of monthly
autocorrelation coefficients with their 95% confidence
intervals is given by SPSS, and it shows that the magnitude of the
monthly autocorrelations changes sinusoidally. The significant
positive autocorrelation at month no. 13 (correlation
coefficient of 0,42 (SE 0,14, t-value 3,0, p < 0,01)) further
supports seasonality, and so does the pattern of partial
autocorrelation coefficients (not shown): it gradually falls, and a
partial autocorrelation coefficient of zero is observed one month
after month 13. The strength of the seasonality is assessed using
the magnitude of $r^2 = 0{,}42^2 = 0{,}18$. This
means that the lagcurve predicts the datacurve by only
18%, and, thus, that 82% is unexplained. The
seasonality may, thus, be statistically significant, but it is rather
weak, and a lot of unexplained variability, otherwise called noise,
is in these data.
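The arithmetic behind these figures can be retraced in a few lines. The sketch assumes that the t-value is the coefficient divided by its standard error, which agrees with the reported values; the variable names are illustrative.

```python
r = 0.42   # autocorrelation coefficient at month 13, as reported above
se = 0.14  # its standard error, as reported above

t_value = r / se             # coefficient divided by its standard error
r_squared = r ** 2           # share of variability predicted by the lagcurve
unexplained = 1 - r_squared  # share left unexplained (noise)

print(f"t = {t_value:.1f}")
print(f"explained = {r_squared:.0%}, unexplained = {unexplained:.0%}")
```

The result is a t-value of 3,0 with 18% explained and 82% unexplained variability, in agreement with the text.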
7 Conclusion
Autocorrelation is able to demonstrate
statistically significant seasonality of disease, and it does so
even with imperfect data.
8 Note
More background, theoretical, and
mathematical information about seasonality assessments is given in
Statistics Applied to Clinical Studies, 5th edition, Chap. 64,
Springer, Heidelberg, Germany, 2012, by the same authors.