**Q. Is the freeze duration of Lake Mendota decreasing over the years?**

Freeze duration data for Lake Mendota and other Madison lakes is available online. $N=156$ data points for Lake Mendota are available starting in the year 1855 [1]. Let us run a simple statistical test to check for a negative slope trend.

Consider a linear model, i.e.,

$$ y_n = \beta_1 n + \beta_0 + w_n, $$

where $y_n$ denotes the freeze duration at any data point $0 \leq n \leq N-1$ and $w_n$ is i.i.d. Gaussian noise with zero mean and unknown variance. The regression coefficients $\hat{\beta_0}$ and $\hat{\beta_1}$ can be calculated from data using linear regression (see Figure below).

Now, setting up the hypotheses:

$$

H_0: \beta_1 = 0

$$

and

$$

H_1: \beta_1 < 0.

$$

Figure: Linear fit to Lake Mendota freeze duration data |

It is known that $$t = \frac{\hat{\beta_1}}{SE}$$ follows a t-distribution with $N-2$ degrees of freedom. Here $$SE = \sqrt{\frac{\frac{1}{N-2}\sum_{n=0}^{N-1} (y_n - \hat{\beta_1} n- \hat{\beta_0} )^2}{\;\;\;\;\;\;\;\sum_{n=0}^{N-1} \left(n-\frac{(N-1)(N-2)}{\;\;\;\;\;2N}\right)^2}}$$ is called the standard error [2].

**A. Unless there is an algebra error in my calculations, the t-test shows that there is statistically significant negative slope trend in this data (at both 1% and 5% significance levels).**

**References:**

[1] UW-Madison Department of Atmospheric and Oceanic Sciences. URL: http://www.aos.wisc.edu/~sco/lakes/Mendota-ice.html (Accessed: Jan 20, 2012)

[2] T. Tarpey, Course Notes for Stat-630 at Wright State University. URL: http://www.wright.edu/~thaddeus.tarpey/stt630chap8.pdf (Accessed: Jan 20, 2012).

**Appendix:**

I have copied data from [1] in a .csv file for easy access. You can download it here. And here's some Octave code. Make sure the .csv file is in the same directory.

data = csvread('MendotaFreezeDurations.csv');

N = numel(data(:,1));

ply = polyfit(data(:,1), data(:,2), 1);

% ply = [-0.18829 117.82053]

SSE = sum((data(:,2)-polyval(ply, data(:,1))).^2);

% SSE = 4.6469e+04

MSE = SSE/(N-2);

SE = sqrt(MSE/sum( (data(:,1)-mean(data(:,1))).^2 ));

% SE = 0.030491

tstatistic = ply(1)/SE;

% tstatistic = -6.1754

tthresh1 = -tinv(0.99,N-2)

tthresh5 = -tinv(0.95,N-2)

if tstatistic < tthresh1,

disp('H0 rejected at 1%');

else,

disp('H0 accepted at 1%');

end;

if tstatistic < tthresh5,

disp('H0 rejected at 5%');

else,

disp('H0 accepted at 5%');

end;

% H0 is rejected in both cases.

% There is a statistically significant

% downward sloping trend.

plot(data(:,1), data(:,2),...

data(:,1), polyval(ply, data(:,1)));

axis tight

ylabel('freeze duration (days)')

xlabel('data point')

title('lake Mendota freeze duration linear regression')

print -depsc freezedurplot.eps

## No comments:

## Post a Comment