DATA Step, Macro, Functions and more

Creating lags of multiple variables using an array.

Reply
Occasional Contributor
Posts: 19

Creating lags of multiple variables using an array.

I'm trying to find  a way to efficiently loop through a large number of variables (~50), and lags 1-4 of each variable.  I have tried to do this using an array:

 

data test;
array lags(*) x1 x2 x3;
set full;
do i=1 to dim(lags);
lags(i)=lag1(lags(i));
end;
run;

This will work, but it replaces the orginial variables with a lag, but the they all have the original names. What I'm hoping to achieve:

 

1). All data in one work file.

2). Lags 1-4 of all variables

3). Variables renamed to represent their respective lag order ex: x1L1, x1L2, x1L3, x1L4...

Super User
Super User
Posts: 7,946

Re: Creating lags of multiple variables using an array.

[ Edited ]

Sorry, I don't see that being a good structure to work with.  Why do you need such a structure as your just going to make a lot of work for yourself going forward, if you already have 50 variables (which is an awful lot) then add another 200 variables etc.  How are you going to program with this?  What I would suggest is to normalise you existing data, so those 50 variables go down the table in the form of paramter value response - the lag is then implicit in the data list:

From:

var1 var2 var3 

a      b       c

 

To:

var  result

1     a

2     b

3     c

 

 Let me give another example.  In your new structure, for each record you will duplicate this 4 times onto subsequent rows, hence your dataset size is going to grow quite a bit - and it will not contain any new information as its just copies.  If you normalise your data, add a sequence variable, you can very easily merge data,  so if you want previous row it is a.var=b.var and a.seq=b.seq-1.

So the data remains minmal, but has the option of merging to get lags.

Occasional Contributor
Posts: 19

Re: Creating lags of multiple variables using an array.

What I am trying to program with this is see if there's a good combination of various lags that have predictive power in a regression modeling framework.  I was hoping to run the variables through a stepwise algorithim.  I have done this already with the variables with no lags, but I was going to try again to see if a combination of various lagged variables provides any benefit.  I wouldn't think it would be too difficult to work within the structure I started if there was just  a way I could rename all variables in the array to have the lag specific suffix.

Super User
Posts: 5,503

Re: Creating lags of multiple variables using an array.

The easiest way would be to use generic variable names.  After the analysis, take whatever names pop out as significant and find out what they mean by comparing them to the original program.  For example:

 

data want;

set have;

array original {50} list all 50 names here for the existing variables;

array L1_ {50};

array L2_ {50};

array L3_ {50};

array L4_ {50};

do _n_=1 to 50;

   L1_{_n_} = lag1(original{_n_});

   L2_{_n_} = lag2(original{_n_});

   L3_{_n_} = lag3(original{_n_});

   L4_{_n_} = lag4(original{_n_});

end;

run;

 

So if L3_47 pops out in the regression, you will need to go back to the original program and find the name of the 47th original variable.  There are ways to automate this, but they have pitfalls and wouldn't be appropriate for a one-time analysis.

Super User
Posts: 10,023

Re: Creating lags of multiple variables using an array.

Actually it is IML thing . 

 

data lag1;
output;
set sashelp.class;
keep name age;
rename name=name_lag1 age=age_lag1;
run;
data lag2;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag2 age=age_lag2;
run;
data lag3;
output;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag3 age=age_lag3;
run;
data lag4;
output;
output;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag4 age=age_lag4;
run;


data want;
 merge sashelp.class(keep=name age in=ina) lag:;
 if ina;
run;
Super User
Posts: 10,023

Re: Creating lags of multiple variables using an array.

Actually It is IML thing .

 

 

proc iml;
use sashelp.class;
read all var _num_ into x[c=vnames];
close;

do i=1 to ncol(x);
 temp=vnames[i];
 names=temp||(strip(temp)+'_lag1':strip(temp)+'_lag4');
 want=lag(x[,i],0:4);

 create ('want_'+temp) from want[c=names];
 append from want;
 close;
end;

quit;

data final_want;
 merge want_:;
run;
Ask a Question
Discussion stats
  • 5 replies
  • 743 views
  • 1 like
  • 4 in conversation