BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Eileen1496
Obsidian | Level 7

Hi everyone,

 

I'm trying to create a 2 layer loop to do the following task, but I cannot figure how to write it. My goal is:

I have sas files position0-position15, in each position file I have user_id start with 0 to user_id start with 9 (user_id is character variable). I want to loop through all position&j files and put user_id start with the same number i (i = 0-9) across position1-position15 in the same file called user_id&i, how can I do this in SAS loop?

 

I'm able to write one layer of loop (loop from 0 to 9 within position0 file:

%macro uid_st_pos0;
%do i=0 %to 9;
  data temp.position0_select&i;
    set raw.individual_position_0 (keep=user_id ULTIMATE_PARENT_RCID STATE STARTDATE SENIORITY SALARY
rn region rcid position_id naics msa job_category enddate country);
    where substr(user_id, 1, 1) = "&i";
  run;
%end;

%mend uid_st_pos0;
%uid_st_pos0;

My attempt for 2 layers is as follow, I'm not sure how to add macro here: 

* do the same for position file 0-15;
%do j = 0 to 15; 
	%do i=0 %to 9;
	  data temp.position&j_select&i;
	    set raw.individual_position_&j (keep=user_id ULTIMATE_PARENT_RCID STATE STARTDATE SENIORITY SALARY
	rn region rcid position_id naics msa job_category enddate country);
	    where substr(user_id, 1, 1) = "&i";
	  run;
	%end;
%end;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

@Eileen1496 wrote:

Hi Quentin,

 

Thanks for your reply. I do miss % there as you pointed out when I'm editing the post. When I'm running my original code, the error message is ERROR: The %DO statement is not valid in open code


Ahh, that error message is clear.  I always forget that %DO loops are now allowed outside of a macro definition.

 

If you want to use nested %DO loops, you cannot use them in "open code".  You have to define a macro, as shown in my little example, and then you can use nested %DO loops inside of the macro definition.

BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Allow me to provide a "thinking out of the box" solution.

 

My solution is don't use macros. Simply combine the data sets together with the SET command, into one great big data set, then use DATA step commands (like ARRAY and data step DO loops) to extract and manipulate the data you want.

 

Beyond that, working with one big data set is also easier than breaking the data up into many many many data sets named something like this:

 

temp.position0_select&i;

 

 

--
Paige Miller
Eileen1496
Obsidian | Level 7

Hi PaigeMiller,

 

Thank you! I also think about this, but I'm worried in this way: 1. the dataset after combining them all together is too large 2. thus if i process any code on the large data set, it would take a long time, if something went wrong, i have to wait for a long time to realize that and redo everything. Hence, I want to try splitting small dataset. But yes in this case, combing and apply my correct code works. 

PaigeMiller
Diamond | Level 26

@Eileen1496 wrote:

Hi PaigeMiller,

 

Thank you! I also think about this, but I'm worried in this way: 1. the dataset after combining them all together is too large 2. thus if i process any code on the large data set, it would take a long time, if something went wrong, i have to wait for a long time to realize that and redo everything. Hence, I want to try splitting small dataset. But yes in this case, combing and apply my correct code works. 


@Eileen1496 

 

Please explain the background and context of what you are doing, please explain the background of the actual large problem you are trying to solve. Do not limit the discussion to simply programming issues. I can't help further if I don't know what you are doing and why you are doing it.

 

If you want to test code without waiting a long time, you can use the large data set with data set options such as OBS=1000, which will run your code on the first 1000 data points to see if it works. Splitting data up in the long run will take a lot longer (for the computer to run, and a lot longer to program it properly) than working with just one data set.

--
Paige Miller
Astounding
PROC Star

This statement is sure to cause a problem:

data temp.position&j_select&i;

Here, SAS doesn't know whether the name of the macro variable should be &j or &j_ or &j_s (etc.).  To let SAS know, add a dot when referring to it:

data temp.position&j._select&i;
Eileen1496
Obsidian | Level 7

Hi Astounding,

 

Thanks for pointing this out, it is my first time writing a loop in SAS. I thought &j would tell the software to put the value of j (e.g. j now equals to 3) at the place of "&j". Just to make sure I understand, are you saying even I just define a macro variable called j, it is still possible for SAS to think j_s is the macro variable, and thus SAS could not find my definition of this j_s macro (because I only make macro j, not j_s), thus reporting an error?

Quentin
Super User

@Eileen1496 wrote:

Hi Astounding,

 

Thanks for pointing this out, it is my first time writing a loop in SAS. I thought &j would tell the software to put the value of j (e.g. j now equals to 3) at the place of "&j". Just to make sure I understand, are you saying even I just define a macro variable called j, it is still possible for SAS to think j_s is the macro variable, and thus SAS could not find my definition of this j_s macro (because I only make macro j, not j_s), thus reporting an error?


Yes, you can test with code like:

%let i=1 ;
%let j=2 ;
%put position&j_select&i  ;

The log shows that SAS looked for a macro variable named j_select:

1    %let i=1 ;
2    %let j=2 ;
3    %put position&j_select&i  ;
WARNING: Apparent symbolic reference J_SELECT not resolved.
position&j_select1

You need a dot after &j to tell SAS that the name of the macro variable has ended:

%let i=1 ;
%let j=2 ;
%put position&j._select&i  ;
BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

You appear to be going from 16 to 160 datasets. WHY?

Why do you have 16 input datasets with the same type of information?

Why do you want to generate 160 more datasets?

 

I am not sure I understand what you are doing but I doubt that re-reading the same datasets 10 times is needed.

 

To read in 16 datasets with a numeric suffix you just need a single SET statement.

set raw.individual_position_0 - raw.individual_position_15;

To write out multiple datasets just list them all on the DATA statement.  You cannot use a range of daasets there so you will need to type them all out.  You perhaps use macro code to generate the list, but you could also just use a data step to write the code using CALL EXECUTE() or just writing to a text file and include it.

 

Then in the body of the data step just write to different output datasets based on your criteria.

Sounds like perhaps the criteria about which file to write is based on the first character of SUBJID and perhaps the value POSITION_ID?

data 
  temp.position0_select0
  temp.position0_select1
...
  temp.position1_select0
...
  temp.position15_select9
;
  set raw.individual_position_0 - raw.individual_position_9;
  keep user_id ULTIMATE_PARENT_RCID STATE STARTDATE SENIORITY SALARY
	rn region rcid position_id naics msa job_category enddate country
   ;
   select (10*POSITION_ID+input(user_id,1.));
     when (0) output temp.position_0_select0;
     when (1) output temp.position_0_select1;
...
     when (159) output temp_position_15_select9;
     otherwise;
   end;
run;

 

Eileen1496
Obsidian | Level 7

So there are 2 hierachies, one is position files, I have 16 files in total, they are called position_0 to position_15. Each of them contain the same sets of variables, one of which is user_id. User_id start with 0-9. So for position_j, I would split it into 10 files, each file contains user_id start with i (i is one of the 0-9 value), and name this position&j._userid&i. Since I have 16 such position files in total, there should be 16*10 position&j._userid&i files.

 

Quentin
Super User

Agree with the advice from @PaigeMiller . 

 

The good news is, it looks like you have the same list of variables in all your input datasets.  So you can combine them together with  a step like:

 

data temp.positions;
  set raw.individual_position_: (keep=user_id ULTIMATE_PARENT_RCID STATE STARTDATE SENIORITY SALARY rn region rcid position_id naics msa job_category enddate country);
run;

Then, after you have all the data in one dataset, you can use a WHERE statement to filter by userid, or use by group processing, etc.

 

You could even split it up into multiple datasets.  But it's rare that having data split into multiple datasets is helpful.

 

That said, the macro language does support having nested loops, e.g.:

 

%macro try() ;
  %local i j ;

  %do i = 0 %to 15; 
    %do j=0 %to 9;
      %put &=i &=j ;
    %end;
  %end;
%mend try ;

%try()

You don't say what is going wrong in your original code.  Are you getting errors in the log ?  Or just the wrong result.  I did notice in the code you posted there is a % sign missing on the TO for the first %DO loop:

%do j = 0 to 15; 

That would cause a macro compilation error, which would be written to the log.

BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Eileen1496
Obsidian | Level 7

Hi Quentin,

 

Thanks for your reply. I do miss % there as you pointed out when I'm editing the post. When I'm running my original code, the error message is ERROR: The %DO statement is not valid in open code

Quentin
Super User

@Eileen1496 wrote:

Hi Quentin,

 

Thanks for your reply. I do miss % there as you pointed out when I'm editing the post. When I'm running my original code, the error message is ERROR: The %DO statement is not valid in open code


Ahh, that error message is clear.  I always forget that %DO loops are now allowed outside of a macro definition.

 

If you want to use nested %DO loops, you cannot use them in "open code".  You have to define a macro, as shown in my little example, and then you can use nested %DO loops inside of the macro definition.

BASUG is hosting free webinars Next up: Mike Raithel presenting on validating data files on Wednesday July 17. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1754 views
  • 0 likes
  • 5 in conversation