DATA Step, Macro, Functions and more

Looping Multiple Variables Question

Reply
Occasional Contributor
Posts: 14

Looping Multiple Variables Question

I have a dataset with multiple variables for a numeric data-element titled Nr_days ((Nr_days1, Nr_days2, etc.) I use PROC TRANSPOSE to create the dataset and don't know the exact number of variables each time the dataset gets recreated. How can I write a macro that loops through these variables and adds up the value of each into a separate variable? I think I can do the adding portion, I just don't know how to tell SAS to loop until the last variable. Any help would be greatly appreciated.
Super Contributor
Posts: 359

Re: Looping Multiple Variables Question

If I am reading you right you have a variable that is being added to your prefix of NR_DAYS to form your new variable name.

proc sql;
select max(varname) into :lim from xxx; * your source data;
quit;

in your data step use x=sum(of nr_days1 - nr_days&lim ); I left out the OF


Message was edited by: Flip
Occasional Contributor
Posts: 14

Re: Looping Multiple Variables Question

Flip, Thank you for your input. It's great info but not exactly what I wanted. Actually, the PROC TRANSPOSE creates the variables based on my Prefix Nr_Days and in this case 377 of them (NR_DAYS1-NR_DAYS377). Last week, the same TRANSPOSE created 352. What I really want to do is use a macro like %do %Until the end of the variable counter. Then sum the results of each variable into a new variable. How can I tell SAS what the last variable is without putting a specific number there? I don't want to have to say NR-DAYS377.
Super Contributor
Posts: 359

Re: Looping Multiple Variables Question

In your transpose you have a PREFIX= NR_DAYS option.
You should also have an ID statement. Use that ID variable in the SQL I gave you then it should work.


&lim is then the maximum value. If all you want is the sum, no looping is required. Show me your transpose code and I will have a better idea what you are doing.


Message was edited by: Flip
Occasional Contributor
Posts: 14

Re: Looping Multiple Variables Question

My Transpose code is quite simple. Basically, the TEMP file is a text file containing multiple records with NAME and DAYS_AWAY as variables (this data is extracted from a relational database and will have multiple lines of the same person). The transpose groups all the DAYS_AWAY variables and labels them DAYS_AWAY1 through DAYS_AWAY377 (in this example). The out= in the example below equals SIX because I have five other transposes...I merge them all into one dataset. Once I get all the variables I just want to know in a macro statement (not sql) how to determine what that last DAYS_AWAYx variable is. Today I know it's 377 but tomorrow the last variable might be higher or lower. How can I do a macro using %do %until whatever DAYS_AWAYx is?

PROC TRANSPOSE DATA=TEMP OUT=SIX (DROP=_NAME_) PREFIX=DAYS_AWAY;
BY NAME;
VAR DAYS_AWAY;
RUN;
Super Contributor
Posts: 359

Re: Looping Multiple Variables Question

I can tell you half a dozen ways to do this, but don't know a macro function to just give you what you want. The simplest one I know of would be:

data temp;
input name $ days_away;
cards;
fred 12
fred 15
fred 17
fred 23
joe 5
joe 7
joe 12
run;
proc sql;
create table seven as select name, sum(days_away) as sumday from temp group by name;
quit;

PROC TRANSPOSE DATA=TEMP OUT=SIX (DROP=_NAME_) PREFIX=DAYS_AWAY;
BY NAME;
VAR DAYS_AWAY;
RUN;

data six;
merge six seven;
by name;
run;

Otherwise you are going to have to run a datastep or sql to find the maximum number of values by name or find the maximum suffix for the variable name from the sashelp views or dictionary tables. You can parse the variable name from them to get that value.
Super Contributor
Posts: 359

Re: Looping Multiple Variables Question

Try this

proc sql;
select max(a.cnt) into :lim from (select count(days_away) as cnt, name from temp group by name) a ;
quit;

%let lim = &lim;
data eight;
set six;
total = sum(of days_away1 - days_away&lim);
run;
Occasional Contributor
Posts: 14

Re: Looping Multiple Variables Question

That didn't seem to work. It doesn't like the (days_away) as cnt piece because there actually isn't a variable named days_away...the variables are labeled days_away1, days_away2, days_away3, etc. Here's what I ended up doing:

DATA TEMP (KEEP=NAME DAYS_AWAY_SUM DAYS_AWAY1-DAYS_AWAY377);
SET SIX;

DAYS_AWAY_SUM = 0;

%MACRO SUMDAYS;
%DO I = 1 %TO 377;
IF DAYS_AWAY&I NE . THEN DAYS_AWAY_SUM = DAYS_AWAY_SUM + DAYS_AWAY&I;
%END;
%MEND SUMDAYS;
%SUMDAYS;
RUN;

You'll notice in the %DO I = 1 %to 377, I really want to avoid hard-typing the 377 because the transpose will change this number from week-to-week. Last week it was 352.

The answer might be in using PROC SQL but unfortunately, I do not know enough about it to speak intelligently with it.

Thanks for your help, though Smiley Happy

If you ever stumble on a macro way to automate that TO number please let me know.
Super Contributor
Posts: 359

Re: Looping Multiple Variables Question

That sql is run on the data BEFORE the transpose. So there has to be a days_away since it is what you are transposing.

Also you do not need the loop at all since you are just suming. SUM will sum up the values without worry about missings.

Just use
DAYS_AWAY_SUM = sum( of DAYS_WAY1 - DAYS_AWAY&lim);
SAS Super FREQ
Posts: 8,743

Re: Looping Multiple Variables Question

Hi:
Just a finicky teacher here....in order to keep things straight, where macro definition happens, where macro compilation happens, when data step compilations happens, when data step execution happens, my recommendation would be to NOT write your program like this:
[pre]
DATA TEMP (KEEP=NAME DAYS_AWAY_SUM DAYS_AWAY1-DAYS_AWAY377);
SET SIX;

DAYS_AWAY_SUM = 0;

%MACRO SUMDAYS;
%DO I = 1 %TO 377;
IF DAYS_AWAY&I NE . THEN DAYS_AWAY_SUM = DAYS_AWAY_SUM + DAYS_AWAY&I;
%END;
%MEND SUMDAYS;
%SUMDAYS;
RUN;
[/pre]

A more accurate way to create and then invoke the macro program is like this:
[pre]
** macro program is compiled and stored in WORK.SASMACR catalog;
%MACRO SUMDAYS;
%DO I = 1 %TO 377;
IF DAYS_AWAY&I NE . THEN DAYS_AWAY_SUM = DAYS_AWAY_SUM + DAYS_AWAY&I;
%END;
%MEND SUMDAYS;

** now use the compiled macro program to generate 377 IF statements;
DATA TEMP (KEEP=NAME DAYS_AWAY_SUM DAYS_AWAY1-DAYS_AWAY377);
SET SIX;
DAYS_AWAY_SUM = 0;
%SUMDAYS;
RUN;
[/pre]

But, apart from that, if you used a SAS ARRAY, you would not need to generate 377 IF statements using a MACRO %DO loop. Instead, your DATA step DO loop would be:
[pre]
array da days_away:;
...
do i = 1 to dim(da);
if da(i) ne . then d_a_sum = d_a_sum + da(i);
end;
[/pre]

The colon modifier in the variable list makes ALL variables that start "days_away" as array members. This means that you can now have a regular data step DO loop that goes from 1 to the DIM(array-name) -- for my second DATA step program, I did -NOT- have to know that the first program created 19 numbered variables.

Is there a reason you were using a MACRO loop instead of an ARRAY and a DATA step loop????

cynthia
** Here is a program that illustrates the use of ARRAYs;
[pre]
** make data with an array;
** this would be your proc transpose step;
data classarr;
set sashelp.class;
array da days_away1-days_away19;
do i = 1 to 19;
da(i) = age * i + height;
end;
output;
run;

** note that I do not need to know size of array for this step to work;
** but ALL the variables that start with "days_away" will be array members;
** so I named the SUM variable something different.;
data testarr;
set classarr;
array da days_away:;
d_a_sum = 0;
do i = 1 to dim(da);
if da(i) ne . then d_a_sum = d_a_sum + da(i);
end;
run;

ods listing;
proc print data=testarr;
run;
[/pre]
Occasional Contributor
Posts: 14

Re: Looping Multiple Variables Question

Flip, thank you very much. When I ran it in the right order it worked perfectly.
Occasional Contributor
Posts: 14

Re: Looping Multiple Variables Question

Cynthia,

Thank you so much. This is really the way I started this project but could not figure out how to represent that last number correctly. I didn't want to have to run code to determine how many variables were made each time I did the transpose.

Thank you so much.

I have learned quite a bit from both you and Flip in just the last 2 hours...thank you again Smiley Happy
Contributor
Posts: 29

Re: Looping Multiple Variables Question

Whenever you have a data set with columns like COL1, COL2, COL3 etc and you know they are numeric and you want to add them up in a data step (by creating a new column) you can do the following:
[pre]
data exp;
set exp;

sumofAllCol=sum(of COLSmiley Happy;

run;
[/pre]

Note:
You don't have to know the how many COLn you have!

COL: will add up any column with the prefix COL.
Ask a Question
Discussion stats
  • 12 replies
  • 223 views
  • 0 likes
  • 4 in conversation