BookmarkSubscribeRSS Feed
dsriggs
Fluorite | Level 6

I'm trying to find  a way to efficiently loop through a large number of variables (~50), and lags 1-4 of each variable.  I have tried to do this using an array:

 

data test;
array lags(*) x1 x2 x3;
set full;
do i=1 to dim(lags);
lags(i)=lag1(lags(i));
end;
run;

This will work, but it replaces the orginial variables with a lag, but the they all have the original names. What I'm hoping to achieve:

 

1). All data in one work file.

2). Lags 1-4 of all variables

3). Variables renamed to represent their respective lag order ex: x1L1, x1L2, x1L3, x1L4...

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, I don't see that being a good structure to work with.  Why do you need such a structure as your just going to make a lot of work for yourself going forward, if you already have 50 variables (which is an awful lot) then add another 200 variables etc.  How are you going to program with this?  What I would suggest is to normalise you existing data, so those 50 variables go down the table in the form of paramter value response - the lag is then implicit in the data list:

From:

var1 var2 var3 

a      b       c

 

To:

var  result

1     a

2     b

3     c

 

 Let me give another example.  In your new structure, for each record you will duplicate this 4 times onto subsequent rows, hence your dataset size is going to grow quite a bit - and it will not contain any new information as its just copies.  If you normalise your data, add a sequence variable, you can very easily merge data,  so if you want previous row it is a.var=b.var and a.seq=b.seq-1.

So the data remains minmal, but has the option of merging to get lags.

dsriggs
Fluorite | Level 6

What I am trying to program with this is see if there's a good combination of various lags that have predictive power in a regression modeling framework.  I was hoping to run the variables through a stepwise algorithim.  I have done this already with the variables with no lags, but I was going to try again to see if a combination of various lagged variables provides any benefit.  I wouldn't think it would be too difficult to work within the structure I started if there was just  a way I could rename all variables in the array to have the lag specific suffix.

Astounding
PROC Star

The easiest way would be to use generic variable names.  After the analysis, take whatever names pop out as significant and find out what they mean by comparing them to the original program.  For example:

 

data want;

set have;

array original {50} list all 50 names here for the existing variables;

array L1_ {50};

array L2_ {50};

array L3_ {50};

array L4_ {50};

do _n_=1 to 50;

   L1_{_n_} = lag1(original{_n_});

   L2_{_n_} = lag2(original{_n_});

   L3_{_n_} = lag3(original{_n_});

   L4_{_n_} = lag4(original{_n_});

end;

run;

 

So if L3_47 pops out in the regression, you will need to go back to the original program and find the name of the 47th original variable.  There are ways to automate this, but they have pitfalls and wouldn't be appropriate for a one-time analysis.

Ksharp
Super User

Actually it is IML thing . 

 

data lag1;
output;
set sashelp.class;
keep name age;
rename name=name_lag1 age=age_lag1;
run;
data lag2;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag2 age=age_lag2;
run;
data lag3;
output;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag3 age=age_lag3;
run;
data lag4;
output;
output;
output;
output;
set sashelp.class;
keep name age;
rename name=name_lag4 age=age_lag4;
run;


data want;
 merge sashelp.class(keep=name age in=ina) lag:;
 if ina;
run;
Ksharp
Super User

Actually It is IML thing .

 

 

proc iml;
use sashelp.class;
read all var _num_ into x[c=vnames];
close;

do i=1 to ncol(x);
 temp=vnames[i];
 names=temp||(strip(temp)+'_lag1':strip(temp)+'_lag4');
 want=lag(x[,i],0:4);

 create ('want_'+temp) from want[c=names];
 append from want;
 close;
end;

quit;

data final_want;
 merge want_:;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 5097 views
  • 2 likes
  • 4 in conversation