DATA Step, Macro, Functions and more

Resolving loop counter inside a DATA step without using MACRO.

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Resolving loop counter inside a DATA step without using MACRO.

[ Edited ]

Hello everyone,

 

I am currently using hash objects to join tables and I would like to loop through 5 tables (e.g. Table_2010, ..., Table_2015) inside my DATA step. To do that, I would like to concatenante the loop counter with a string, let's say "Table_".

 

My progress so far is the following :

 

DATA TABLE_2010;
   INPUT ID VAL;
   DATALINES;
1  63
2  22
3  71
;
RUN;
DATA TABLE_2011; INPUT ID VAL; DATALINES; 4 68 5 55 6 54 ; RUN; DATA RES; DO i = 2010 TO 2011; DO UNTIL (EOF) ; SET 'TABLE_'&i END = EOF; OTHER_VAL = CATS('TEST',i); OUTPUT; END; END; STOP; RUN;

This code doesn't currently work (otherwise I wouldn't be posting here) and I can see at least 3 issues :

 

  1. If I run this, the first error in the log indicates "Apparent symbolic reference i not resolved."
  2. The loop looks like it stops after 2010, I suspect EOF is the cause of this and should be replaced by EOF_2010 and EOF_2011.
  3. I'm not sure wheter STOP; instruction should be put before or after the END; of the outside loop.

Replacing "TABLE_&i" with CATS("TABLE_",i) doesn't work either ("Invalid option name "TABLE_" error).

 

I am kind of lost here and any help would be appreciated Smiley Happy.


Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 6,946

Re: Resolving loop counter inside a DATA step without using MACRO.

Write this macro:

%MACRO one_table(year);
DO UNTIL (EOF_&year) ;
  SET TABLE_&year END = EOF_&year;
  CALL MISSING(NAME);
  rc = h.find ();
  OUTPUT;
END;
%MEND;

and call it for every year:

%one_table(2010)
%one_table(2011)

or use a macro %do loop within the macro to repeat the code.

This is what the macro engine is for.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Regular Contributor
Posts: 213

Re: Resolving loop counter inside a DATA step without using MACRO.

Hi,

 

To mu knowledge, the SET statement is executed at compilation time, therefore you can not just change the table name on the fly!

 

Here is an alternative approach that could allow you to do that.

 

data _null_;
    do table='table1','table2','table3' ; /* Specify your list of tables */
        rc = open(table);
        if (not rc) then
        do;
            m=sysmsg();
              put m;
        end;
        else

        do;

            /* .... Some Other SAS Statements ....*/
            rc=close(rc);

       end;
    end;
run;

 

Hope this helps,

Ahmed

Super User
Posts: 6,946

Re: Resolving loop counter inside a DATA step without using MACRO.

Let´s dig a little into how the SAS interpreter works:

- program text is gathered

- if a macro reference (¨&¨or ¨%¨)is encountered, text is handed over to the macro processor

- if text for a data step is encountered, the data step is compiled and

- executed

 

So your &i is handed over to the macro processor LONG before the data step even starts to run, and would need for the macro variable i to already be defined.

 

Also, as alreay mentioned, theSET statement within a data step can not be done dynamically. The reason is that the table named in the SET has a major influence on the data structures present in the data step, which are determined at compile time.

 

To make your step dynamic, I suggest the following:

%macro dynamic_sets;

data res;

set

%do i = 2010 %to 2011;

  table_&i (in=in&i)

%end;

;

%do i = 2010 %to 2011;

if in&i then other_val = ¨test&i¨;

%end;

run;

%mend;

%dynamic_sets;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,407

Re: Resolving loop counter inside a DATA step without using MACRO.

Could you post test data and required output.  Its not entirely clear to me what your attempting to do here, why can you not use indsname for this?

data want;
  length store $20;
  set table_: indsname=dsname;
  store=scan(dsname,2,".");
run;

Valued Guide
Posts: 858

Re: Resolving loop counter inside a DATA step without using MACRO.

The way you have it set up isn't logically correct.  Here it is broken down into two steps.  Hope this helps:

 

DATA TABLE_2010;
   INPUT ID VAL;
   DATALINES;
1  63
2  22
3  71
;
 
DATA TABLE_2011;
   INPUT ID VAL;
   DATALINES;
4  68
5  55
6  54
;
 
%macro first;
%do i = 2010 %to 2011;
data table_&i;
set table_&i;
other_val = "test_&i";
run;
%end;
%mend;
%first;

data final;
%macro second;
set

%do i = 2010 %to 2011;
 table_&i
%end;
%mend;
%second;
;
run;

 

I would recommend writting this out long hand and seeing what you can do without macros first. 

 

Mark

New Contributor
Posts: 4

Re: Resolving loop counter inside a DATA step without using MACRO.

[ Edited ]

Thank you everyone for your answers, especially Kurt for the explanation on how things work.

 

I feel like I should provide a more complete code and here it is :

 

DATA SMALL;
   INPUT ID NAME $;
   DATALINES;
1 Red
2 Green
3 Blue
;
RUN;
DATA TABLE_2010; INPUT ID YEAR VAL; DATALINES; 1 2010 63 2 2010 22 3 2010 71 ; RUN;
DATA TABLE_2011; INPUT ID YEAR VAL; DATALINES; 1 2011 68 2 2011 55 3 2011 54 ; RUN; DATA WANT (drop = rc); declare hash h (hashexp:16); rc = h.DefineKey ('ID'); rc = h.DefineData ('NAME'); rc = h.DefineDone (); do until (eof1); set SMALL end = eof1; rc = h.add (); end; DO UNTIL (EOF_2010) ; SET TABLE_2010 END = EOF_2010; CALL MISSING(NAME); rc = h.find (); OUTPUT; END; DO UNTIL (EOF_2011) ; SET TABLE_2011 END = EOF_2011; CALL MISSING(NAME); rc = h.find (); OUTPUT; END; STOP; RUN;

This part works as intended but my goal is to go through TABLE_2010 and TABLE_2011 without having to write the same code (DO UNTIL... ENDSmiley Wink twice in this case. That's why some of the suggestions won't work, because their alter my code too much for what I want to do. I think I can't wrap everything in a macro because it will load the SMALL table in memory twice, which I am trying to avoid (there are actually more like 20 tables loaded and some of them are over 100 000 rows). The yearly tables are about 1M rows each.

 

I hope this makes things clearer for you.

Solution
‎09-25-2015 06:23 AM
Super User
Posts: 6,946

Re: Resolving loop counter inside a DATA step without using MACRO.

Write this macro:

%MACRO one_table(year);
DO UNTIL (EOF_&year) ;
  SET TABLE_&year END = EOF_&year;
  CALL MISSING(NAME);
  rc = h.find ();
  OUTPUT;
END;
%MEND;

and call it for every year:

%one_table(2010)
%one_table(2011)

or use a macro %do loop within the macro to repeat the code.

This is what the macro engine is for.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,407

Re: Resolving loop counter inside a DATA step without using MACRO.

Nope, I am sorry, I still do not see the point of all that code.  What you are outputting is a union of the second and third dataset merged with the first.  This can be simply written as:

proc sql;
  create table WANT2 as
  select A.*,
            B.NAME
  from (select * from TABLE_2010 union all select * from TABLE_2011) A
  left join SMALL B
  on A.ID=B.ID
  order by YEAR,ID;
quit;

 

Please also note that writing everything in uppercase and not finishing datasteps makes your code harder to read.

New Contributor
Posts: 4

Re: Resolving loop counter inside a DATA step without using MACRO.

[ Edited ]

Thank you Kurt, this solution seems to be working on my example, I will try to make it work in my code Smiley Happy. One question though : is it wrong to write the entire macro inside the data step or should I only call it?

 

As for RW9's suggestion, my point is to avoid using SQL joins because they take way too much of time compared to joining using hash objects. Why should I do something in one hour when it can be done in two minutes (those are real numbers)? Moreover, thank you for pointing out that I forgot to finish my datasteps, I will correct them. As for writing in uppercase, I would gladly take any directions on what to write in upper/lowercase because I have no knowledge on that matter.

Super User
Super User
Posts: 7,407

Re: Resolving loop counter inside a DATA step without using MACRO.

Just use datastep version then:

data inter;

   set table:;

run;

proc sort data=inter;

  by id;

run;

data want;

  merge inter small;

  by id;

run;

 

As for GPP, its something you will see on a lot of code here.  Avoid using upper case (except SQL) as its harder to read.  Finishing datasteps, consistent indents using spaces rather than tabs (although the forum here is bad for code formatting).  Just helps reading code Smiley Surprised)

New Contributor
Posts: 4

Re: Resolving loop counter inside a DATA step without using MACRO.

Thank you for the tips RW9, they are duly noted.

 

I also advise you to read the followings articles which compare hash tables vs SQL joins vs merge : http://support.sas.com/resources/papers/proceedings09/071-2009.pdf and http://www2.sas.com/proceedings/sugi31/244-31.pdf

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 425 views
  • 1 like
  • 5 in conversation