Programming the statistical procedures from SAS

proc glmselect for time series data

Frequent Contributor
Posts: 110

proc glmselect for time series data


I just came across this:

 

Associations between Mycobacterium avium subsp. paratuberculosis antibodies in bulk tank milk, seaso...

 

Here the author transforms the time (in days) using sin(2*pi*(day/365)) and cos(2*pi*(day/365)). Is this an acceptable way to model seasonality when using proc glmselect to model a dependent variable that depends on time and other factors (i.e. time series prediction)?

Super User
Posts: 19,877

Re: proc glmselect for time series data

Posted in reply to csetzkorn

This may be better posted on stats.stackexchange.com.

The question is more about statistics than about how to program something in SAS. There are statisticians on here and you may get an answer, but you are more likely to get a response on the forum mentioned above.

PROC Star
Posts: 231

Re: proc glmselect for time series data

In your example, DAY is measured on a circular scale: DAY = 1 and DAY = 366 occupy the same position in an annual cycle. 

 

DAY is converted into radian units by 2*pi*(DAY/365). If we define the angle theta as 2*pi*(DAY/365), then we convert from polar coordinates (assuming that radius = 1) to rectangular coordinates (x, y) as x = cos(theta) and y = sin(theta).
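
As a minimal sketch of that conversion (written in Python only for illustration; the function name `seasonal_terms` and the fixed 365-day period are assumptions, and the same two columns could equally be computed in a DATA step before calling proc glmselect):

```python
import math

PERIOD_DAYS = 365  # period used in this thread; ignores leap years (assumption)


def seasonal_terms(day, period=PERIOD_DAYS):
    """Return (cos(theta), sin(theta)) for theta = 2*pi*(day/period)."""
    theta = 2 * math.pi * (day / period)
    return math.cos(theta), math.sin(theta)


# Day 1 and day 366 are one full cycle apart, so they map to the same point
# on the unit circle (up to floating-point rounding).
for day in (1, 92, 183, 366):
    x, y = seasonal_terms(day)
    print(f"day={day:3d}  cos={x:+.4f}  sin={y:+.4f}")
```

Both terms are needed together: a sine or cosine term alone fixes the timing of the seasonal peak in advance, whereas the pair lets the fitted coefficients determine it.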

 

The prediction of a linear variable X from a circular variable THETA is known in the circular statistics area as "linear-circular association". From the text by N.I. Fisher https://www.amazon.com/Statistical-Analysis-Circular-Data-Fisher/dp/0521568900 (p 139):

"In the linear-circular case, we focus on measuring association between X and THETA with a (possible) view to predicting the mean value of X for a given value theta of THETA. A simple regression model for this type of association has the form ...

E(X | THETA = theta) = a_0 + a cos(theta) + b sin(theta) .... it is a simple linear regression model (linear in the regression variables cos(theta) and sin(theta), that is) and can be fitted routinely by methods in any general statistical package, so we shall not discuss the regression aspect further."
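
To illustrate what "fitted routinely" can look like, here is a rough sketch of an ordinary least-squares fit of a mean plus one cosine and one sine term. It is in Python rather than SAS, it is not taken from Fisher's text, and the coefficient names `a0`, `a`, `b`, the helper functions, and the synthetic data are all assumptions made for the example. The fitted `a` and `b` can be turned into an amplitude and a peak day using the identity a*cos(theta) + b*sin(theta) = R*cos(theta - theta_0), with R = sqrt(a^2 + b^2) and theta_0 = atan2(b, a).

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_harmonic(days, values, period=365):
    """Least-squares fit of values ~ a0 + a*cos(theta) + b*sin(theta)."""
    rows = []
    for d in days:
        theta = 2 * math.pi * d / period
        rows.append((1.0, math.cos(theta), math.sin(theta)))
    # Normal equations (X'X) beta = X'y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    return solve3(xtx, xty)


# Synthetic, noise-free data made up for illustration:
# mean 10, seasonal amplitude 3, peak at day 60, two years of daily values.
days = list(range(1, 731))
peak = 2 * math.pi * 60 / 365
values = [10 + 3 * math.cos(2 * math.pi * d / 365 - peak) for d in days]

a0, a, b = fit_harmonic(days, values)
amplitude = math.hypot(a, b)  # R = sqrt(a^2 + b^2)
peak_day = (math.atan2(b, a) % (2 * math.pi)) * 365 / (2 * math.pi)
print(a0, amplitude, peak_day)
```

Because the synthetic data contain no noise, the fit should recover the mean, amplitude, and peak day used to generate them, up to floating-point error. With real data, other covariates would simply be added as further regression columns alongside the cosine and sine terms.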

 

I see the use of sin/cos variables applied in ecology to variables like aspect (the continuous version of north-east-south-west). Notably aspect is different than time, and what works for aspect may not be appropriate for time.

 

So the sin/cos transformation could be an acceptable way to model seasonality. If you have many years of data, then time series analysis might be an alternative, perhaps better, approach. The paper you reference has observations over only two years.

 

 

 

SAS Super FREQ
Posts: 3,756

Re: proc glmselect for time series data

Posted in reply to csetzkorn

I've seen Dave Dickey (an expert in time series analysis) take the sine transform of time in his papers/talks. See the example in http://support.sas.com/resources/papers/proceedings14/1275-2014.pdf

He has a description similar to the one that @sld gave in 

http://www2.sas.com/proceedings/sugi29/201-29.pdf

 

 
