BookmarkSubscribeRSS Feed
GPatel
Pyrite | Level 9

I have test data as shown below. How can I compute trend factor and create a macro variable for the same?

Data Test;

infile cards missover;

Input YEAR   ACTUAL best8.;

cards;

1995 4356.26

1996 8520.93

1997 8161.92

1998 8715.83

1999 4060.38

2000 5446.24

2001 2212.86

2002 4348.00

2003 3803.34

2004 3095.96

2005 8455.64

2006 3553.20

;

run;

7 REPLIES 7
JacobSimonsen
Barite | Level 11

Hi,

There are several procedures which can estimate a linear trend. One of them are GLM

proc glm data=test;

  model actual=year;

run;

in macro form:

%macro trend(data,dependent, independent);

  ods output parameterestimates=parameterestimates;

  proc glm data=&data.;

  model &dependent=&independent;

quit;

%mend;

%trend(test,actual,year);

Then you can find the trend estimate in the output window and in the data "parameterestimates".

You also calculate the number manually, which of course is more cumbersome, but also more fun:-). Then you also will need a variable which contain the product of the two variables:

Data Test;

infile cards missover;

Input YEAR   ACTUAL best8.;

product=actual*year;

cards;

.

.

.

run;

proc means data=test uss sum n;

  var year product actual;

  output out=out sum=yearsum productsum actualsum css=yearcss productcss actualcss n=yearn procuctn actualn;

run;

*write the number in the log window:;

data _NULL_;

  set out;

  beta=(productsum-yearsum*actualsum/yearn)/yearcss;

  put beta;

run;

PGStats
Opal | Level 21

If you want to create a macro variable with the value of the annual trend, you need to transfer the value from ParameterEstimates dataset to a macro variable with a datastep (or with proc sql), like this, for example:

proc reg data=test;

model actual = year / dwprob;

ods output ParameterEstimates=ParameterEstimates;

run;

data _null_;

set ParameterEstimates;

where upcase(Variable) = "YEAR";

call symputx("Trend", Estimate);

run;

%put Trend is : &Trend;

PG

PG
JacobSimonsen
Barite | Level 11

The question is already solved above. But I also like this solution because it runs within one single datastep:

Data Test;

  infile cards missover ; 

  Input YEAR   ACTUAL best8.;

  retain productsum yearsum actualsum yearuss;

if _N_=1 then do;

  yearsum=0;

  productsum=0;

  actualsum=0;

  yearuss=0;

end;

  productsum=productsum+actual*year;

  yearsum=yearsum+year;

  actualsum=actualsum+actual;

  yearuss=yearuss+year**2;

if _N_=12 then do;

  yearcss=yearuss-(yearsum)**2/_N_;

  put productsum= yearsum= actualsum= yearuss= yearcss=;

  beta=(productsum-yearsum*actualsum/_N_)/yearcss;

  call symputx("Trend", Estimate);

end;

cards;

1995 4356.26

1996 8520.93

1997 8161.92

1998 8715.83

1999 4060.38

2000 5446.24

2001 2212.86

2002 4348.00

2003 3803.34

2004 3095.96

2005 8455.64

2006 3553.20

;

run;

PGStats
Opal | Level 21

Or, with the help of sum statements and the eof= option :

Data _null_;

  infile cards missover eof=done;

  Input YEAR ACTUAL;

  productsum+actual*year;

  yearsum+year;

  actualsum+actual;

  yearuss+year**2;

  N+1;

  return;

done:

  trend = (productsum -yearsum*actualsum/N) / (yearuss-yearsum**2/N);

  call symputx("Trend", trend);

cards;

1995 4356.26

1996 8520.93

1997 8161.92

1998 8715.83

1999 4060.38

2000 5446.24

2001 2212.86

2002 4348.00

2003 3803.34

2004 3095.96

2005 8455.64

2006 3553.20

;

%put Trend = &Trend;

But then you will miss all the useful diagnostics of regression procedures, including the statistical significance of the trend.

PG

PG
JacobSimonsen
Barite | Level 11

Thanks Pierre, your solution is very elegant. The very good thing is that with this solution is that the data is never written on disk, which give good performance (for large data of course). Your program can be extended even further if one will have the statistical test, but then it starts to get more difficult.

/*access to a function library with matrixalgebra function is required, see attached file. Here I just have the matrixalgebra functions in the work library*/

options cmplib=work.func;

Data _null_;

  infile cards missover eof=done;

  array x{1,2} _temporary_;

  array xt{2,1} _temporary_;

  array xx_temp{2,2} _temporary_;

  array xx{2,2} _temporary_;

  array varians{2,2} _temporary_;

  array xy{2,1} _temporary_;

  array beta{2,1} _temporary_;

  Input YEAR ACTUAL;

  x[1,1]=1;

  x[1,2]=year;

  N+1;

  yearsum+year;

  yearuss+year**2;

  actualsum+actual;

  actualuss+actual**2;

  productsum+actual*year;

  call trans(x,xt);

  call multiplicer(xt,x,xx_temp);

  if _N_=1 then do;

    call zero(xx);

    call zero(xy);

    error=0;

  end;

  call add(xx,xx_temp,xx);

  pred=0;

  do i=1 to dim(x,2);

    xy[i,1]+x[1,i]*actual;

    pred+x[1,i]*actual;

  end;

  return;

  done:

  error=(actualuss-(actualsum**2)/N -( (productsum -yearsum*actualsum/N)**2)/(yearuss-yearsum**2/N))/(N-dim(x,2));

  call invers(xx,varians);

  call multiplicer(varians,xy,beta);

  call show(beta);

  put @1 'Estimate' @20 'Std Err' @ 40 'P-value';

  do i=1 to dim(beta,1);

    stderr=sqrt(varians[i,i]*error);

    *pvalue=sdf('chisq',(beta[i,1]/stderr)**2,1);

    pvalue=2*sdf('t',abs(beta[i,1]/stderr),N-dim(beta,1));

    put @1 beta[i,1] @20 stderr @40 pvalue pvalue6.4;

  end;

  call symputx("Trend", beta[2,1]);

cards;

1995 4356.26

1996 8520.93

1997 8161.92

1998 8715.83

1999 4060.38

2000 5446.24

2001 2212.86

2002 4348.00

2003 3803.34

2004 3095.96

2005 8455.64

2006 3553.20

;

run;

PGStats
Opal | Level 21

More difficult indeed! Nicely done! - PG

PG
GPatel
Pyrite | Level 9

To all:

Thanks to all for providing elegant solution/s to my post.

Girish

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 3970 views
  • 3 likes
  • 3 in conversation