Sunday, January 20, 2013

Lake Mendota Freeze Durations

Q. Is the freeze duration of Lake Mendota decreasing over the years?

Freeze duration data for Lake Mendota and other Madison lakes is available online. $N=156$ data points for Lake Mendota are available starting in the year 1855 [1]. Let us run a simple statistical test to check for a negative slope trend.

Consider a linear model, i.e.,
$$ y_n = \beta_1 n + \beta_0 + w_n, $$
where $y_n$ denotes the freeze duration at any data point $0 \leq n \leq N-1$ and $w_n$ is i.i.d. Gaussian noise with zero mean and unknown variance. The regression coefficients $\hat{\beta_0}$ and $\hat{\beta_1}$ can be calculated from data using linear regression (see Figure below).

Now, setting up the hypotheses:
H_0: \beta_1 = 0
H_1: \beta_1 < 0.

Figure: Linear fit to Lake Mendota freeze duration data

It is known that $$t = \frac{\hat{\beta_1}}{SE}$$ follows a t-distribution with $N-2$ degrees of freedom. Here $$SE = \sqrt{\frac{\frac{1}{N-2}\sum_{n=0}^{N-1} (y_n - \hat{\beta_1} n- \hat{\beta_0} )^2}{\;\;\;\;\;\;\;\sum_{n=0}^{N-1} \left(n-\frac{(N-1)(N-2)}{\;\;\;\;\;2N}\right)^2}}$$ is called the standard error [2].

A. Unless there is an algebra error in my calculations, the t-test shows that there is statistically significant negative slope trend in this data (at both 1% and 5% significance levels).

[1] UW-Madison Department of Atmospheric and Oceanic Sciences. URL: (Accessed: Jan 20, 2012)
[2] T. Tarpey, Course Notes for Stat-630 at Wright State University. URL: (Accessed: Jan 20, 2012).

I have copied data from [1] in a .csv file for easy access. You can download it here. And here's some Octave code. Make sure the .csv file is in the same directory.

data = csvread('MendotaFreezeDurations.csv');
N = numel(data(:,1));
ply = polyfit(data(:,1), data(:,2), 1);
% ply = [-0.18829   117.82053]
SSE = sum((data(:,2)-polyval(ply, data(:,1))).^2);
% SSE =  4.6469e+04
MSE = SSE/(N-2);
SE = sqrt(MSE/sum( (data(:,1)-mean(data(:,1))).^2 ));
% SE =  0.030491
tstatistic = ply(1)/SE;
% tstatistic = -6.1754
tthresh1 = -tinv(0.99,N-2)
tthresh5 = -tinv(0.95,N-2)
if tstatistic < tthresh1, 

  disp('H0 rejected at 1%'); 
  disp('H0 accepted at 1%'); 
if tstatistic < tthresh5, 

  disp('H0 rejected at 5%'); 
  disp('H0 accepted at 5%'); 
% H0 is rejected in both cases. 

% There is a statistically significant 
% downward sloping trend.

plot(data(:,1), data(:,2),... 
  data(:,1), polyval(ply, data(:,1)));
axis tight
ylabel('freeze duration (days)')
xlabel('data point')
title('lake Mendota freeze duration linear regression')
print -depsc freezedurplot.eps

No comments:

Post a Comment