Thanks Pierre, your solution is very elegant. The very good thing about this solution is that the data is never written to disk, which gives good performance (for large data, of course). Your program can be extended even further if one wants the statistical tests as well, but then it starts to get more difficult.

/* Access to a function library with matrix algebra functions is required, see attached file.
   Here I just have the matrix algebra functions in the work library. */
options cmplib=work.func;

data _null_;
  infile cards missover eof=done;
  array x{1,2}       _temporary_;  /* design row: intercept, year */
  array xt{2,1}      _temporary_;  /* transpose of x */
  array xx_temp{2,2} _temporary_;  /* x`x for the current row */
  array xx{2,2}      _temporary_;  /* accumulated X`X */
  array varians{2,2} _temporary_;  /* inverse of X`X */
  array xy{2,1}      _temporary_;  /* accumulated X`y */
  array beta{2,1}    _temporary_;  /* parameter estimates */

  input year actual;
  x[1,1]=1;
  x[1,2]=year;

  /* running sums needed for the residual variance */
  N+1;
  yearsum+year;
  yearuss+year**2;
  actualsum+actual;
  actualuss+actual**2;
  productsum+actual*year;

  call trans(x,xt);
  call multiplicer(xt,x,xx_temp);
  if _N_=1 then do;
    call zero(xx);
    call zero(xy);
    error=0;
  end;
  call add(xx,xx_temp,xx);
  pred=0;
  do i=1 to dim(x,2);
    xy[i,1]+x[1,i]*actual;
    pred+x[1,i]*actual;
  end;
  return;

done:
  /* residual mean square */
  error=(actualuss-(actualsum**2)/N
         -((productsum-yearsum*actualsum/N)**2)/(yearuss-yearsum**2/N))
        /(N-dim(x,2));
  call invers(xx,varians);
  call multiplicer(varians,xy,beta);
  call show(beta);
  put @1 'Estimate' @20 'Std Err' @40 'P-value';
  do i=1 to dim(beta,1);
    stderr=sqrt(varians[i,i]*error);
    *pvalue=sdf('chisq',(beta[i,1]/stderr)**2,1);
    pvalue=2*sdf('t',abs(beta[i,1]/stderr),N-dim(beta,1));
    put @1 beta[i,1] @20 stderr @40 pvalue pvalue6.4;
  end;
  /* store the slope in the macro variable Trend */
  call symputx("Trend", beta[2,1]);
cards;
1995 4356.26
1996 8520.93
1997 8161.92
1998 8715.83
1999 4060.38
2000 5446.24
2001 2212.86
2002 4348.00
2003 3803.34
2004 3095.96
2005 8455.64
2006 3553.20
;
run;
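For anyone reading the done: block, here is a sketch of the formulas it appears to implement, as far as I can tell from the code. These are the usual least-squares expressions; the symbols (X, y, p, S_xx, S_xy, S_yy, s^2) are my own notation and not names from the program. Here x is year, y is actual, p = 2 is the number of parameters, and (X'X)^{-1} corresponds to the varians array.

```latex
\begin{align*}
\hat\beta &= (X^{\top}X)^{-1}X^{\top}y \\
S_{xx} &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}, \quad
S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{N}, \quad
S_{xy} = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N} \\
s^2 &= \frac{S_{yy} - S_{xy}^2 / S_{xx}}{N - p} \\
\operatorname{se}(\hat\beta_i) &= \sqrt{s^2 \left[(X^{\top}X)^{-1}\right]_{ii}} \\
p\text{-value}_i &= 2\,P\!\left(T_{N-p} > \left|\frac{\hat\beta_i}{\operatorname{se}(\hat\beta_i)}\right|\right)
\end{align*}
```

The running sums (yearsum, yearuss, actualsum, actualuss, productsum) give S_xx, S_yy and S_xy in a single pass, which is why no second read of the data is needed. Note that this closed form for s^2 only holds for one regressor plus an intercept.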