BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasanon1664
Calcite | Level 5

Hello everyone,

 

I am currently using hash objects to join tables and I would like to loop through 5 tables (e.g. Table_2010, ..., Table_2015) inside my DATA step. To do that, I would like to concatenante the loop counter with a string, let's say "Table_".

 

My progress so far is the following :

 

DATA TABLE_2010;
   INPUT ID VAL;
   DATALINES;
1  63
2  22
3  71
;
RUN;
DATA TABLE_2011; INPUT ID VAL; DATALINES; 4 68 5 55 6 54 ; RUN; DATA RES; DO i = 2010 TO 2011; DO UNTIL (EOF) ; SET 'TABLE_'&i END = EOF; OTHER_VAL = CATS('TEST',i); OUTPUT; END; END; STOP; RUN;

This code doesn't currently work (otherwise I wouldn't be posting here) and I can see at least 3 issues :

 

  1. If I run this, the first error in the log indicates "Apparent symbolic reference i not resolved."
  2. The loop looks like it stops after 2010, I suspect EOF is the cause of this and should be replaced by EOF_2010 and EOF_2011.
  3. I'm not sure wheter STOP; instruction should be put before or after the END; of the outside loop.

Replacing "TABLE_&i" with CATS("TABLE_",i) doesn't work either ("Invalid option name "TABLE_" error).

 

I am kind of lost here and any help would be appreciated :).

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Write this macro:

%MACRO one_table(year);
DO UNTIL (EOF_&year) ;
  SET TABLE_&year END = EOF_&year;
  CALL MISSING(NAME);
  rc = h.find ();
  OUTPUT;
END;
%MEND;

and call it for every year:

%one_table(2010)
%one_table(2011)

or use a macro %do loop within the macro to repeat the code.

This is what the macro engine is for.

View solution in original post

10 REPLIES 10
AhmedAl_Attar
Ammonite | Level 13

Hi,

 

To mu knowledge, the SET statement is executed at compilation time, therefore you can not just change the table name on the fly!

 

Here is an alternative approach that could allow you to do that.

 

data _null_;
    do table='table1','table2','table3' ; /* Specify your list of tables */
        rc = open(table);
        if (not rc) then
        do;
            m=sysmsg();
              put m;
        end;
        else

        do;

            /* .... Some Other SAS Statements ....*/
            rc=close(rc);

       end;
    end;
run;

 

Hope this helps,

Ahmed

Kurt_Bremser
Super User

Let´s dig a little into how the SAS interpreter works:

- program text is gathered

- if a macro reference (¨&¨or ¨%¨)is encountered, text is handed over to the macro processor

- if text for a data step is encountered, the data step is compiled and

- executed

 

So your &i is handed over to the macro processor LONG before the data step even starts to run, and would need for the macro variable i to already be defined.

 

Also, as alreay mentioned, theSET statement within a data step can not be done dynamically. The reason is that the table named in the SET has a major influence on the data structures present in the data step, which are determined at compile time.

 

To make your step dynamic, I suggest the following:

%macro dynamic_sets;

data res;

set

%do i = 2010 %to 2011;

  table_&i (in=in&i)

%end;

;

%do i = 2010 %to 2011;

if in&i then other_val = ¨test&i¨;

%end;

run;

%mend;

%dynamic_sets;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Could you post test data and required output.  Its not entirely clear to me what your attempting to do here, why can you not use indsname for this?

data want;
  length store $20;
  set table_: indsname=dsname;
  store=scan(dsname,2,".");
run;

Steelers_In_DC
Barite | Level 11

The way you have it set up isn't logically correct.  Here it is broken down into two steps.  Hope this helps:

 

DATA TABLE_2010;
   INPUT ID VAL;
   DATALINES;
1  63
2  22
3  71
;
 
DATA TABLE_2011;
   INPUT ID VAL;
   DATALINES;
4  68
5  55
6  54
;
 
%macro first;
%do i = 2010 %to 2011;
data table_&i;
set table_&i;
other_val = "test_&i";
run;
%end;
%mend;
%first;

data final;
%macro second;
set

%do i = 2010 %to 2011;
 table_&i
%end;
%mend;
%second;
;
run;

 

I would recommend writting this out long hand and seeing what you can do without macros first. 

 

Mark

sasanon1664
Calcite | Level 5

Thank you everyone for your answers, especially Kurt for the explanation on how things work.

 

I feel like I should provide a more complete code and here it is :

 

DATA SMALL;
   INPUT ID NAME $;
   DATALINES;
1 Red
2 Green
3 Blue
;
RUN;
DATA TABLE_2010; INPUT ID YEAR VAL; DATALINES; 1 2010 63 2 2010 22 3 2010 71 ; RUN;
DATA TABLE_2011; INPUT ID YEAR VAL; DATALINES; 1 2011 68 2 2011 55 3 2011 54 ; RUN; DATA WANT (drop = rc); declare hash h (hashexp:16); rc = h.DefineKey ('ID'); rc = h.DefineData ('NAME'); rc = h.DefineDone (); do until (eof1); set SMALL end = eof1; rc = h.add (); end; DO UNTIL (EOF_2010) ; SET TABLE_2010 END = EOF_2010; CALL MISSING(NAME); rc = h.find (); OUTPUT; END; DO UNTIL (EOF_2011) ; SET TABLE_2011 END = EOF_2011; CALL MISSING(NAME); rc = h.find (); OUTPUT; END; STOP; RUN;

This part works as intended but my goal is to go through TABLE_2010 and TABLE_2011 without having to write the same code (DO UNTIL... END;) twice in this case. That's why some of the suggestions won't work, because their alter my code too much for what I want to do. I think I can't wrap everything in a macro because it will load the SMALL table in memory twice, which I am trying to avoid (there are actually more like 20 tables loaded and some of them are over 100 000 rows). The yearly tables are about 1M rows each.

 

I hope this makes things clearer for you.

Kurt_Bremser
Super User

Write this macro:

%MACRO one_table(year);
DO UNTIL (EOF_&year) ;
  SET TABLE_&year END = EOF_&year;
  CALL MISSING(NAME);
  rc = h.find ();
  OUTPUT;
END;
%MEND;

and call it for every year:

%one_table(2010)
%one_table(2011)

or use a macro %do loop within the macro to repeat the code.

This is what the macro engine is for.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Nope, I am sorry, I still do not see the point of all that code.  What you are outputting is a union of the second and third dataset merged with the first.  This can be simply written as:

proc sql;
  create table WANT2 as
  select A.*,
            B.NAME
  from (select * from TABLE_2010 union all select * from TABLE_2011) A
  left join SMALL B
  on A.ID=B.ID
  order by YEAR,ID;
quit;

 

Please also note that writing everything in uppercase and not finishing datasteps makes your code harder to read.

sasanon1664
Calcite | Level 5

Thank you Kurt, this solution seems to be working on my example, I will try to make it work in my code :). One question though : is it wrong to write the entire macro inside the data step or should I only call it?

 

As for RW9's suggestion, my point is to avoid using SQL joins because they take way too much of time compared to joining using hash objects. Why should I do something in one hour when it can be done in two minutes (those are real numbers)? Moreover, thank you for pointing out that I forgot to finish my datasteps, I will correct them. As for writing in uppercase, I would gladly take any directions on what to write in upper/lowercase because I have no knowledge on that matter.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Just use datastep version then:

data inter;

   set table:;

run;

proc sort data=inter;

  by id;

run;

data want;

  merge inter small;

  by id;

run;

 

As for GPP, its something you will see on a lot of code here.  Avoid using upper case (except SQL) as its harder to read.  Finishing datasteps, consistent indents using spaces rather than tabs (although the forum here is bad for code formatting).  Just helps reading code 🐵

sasanon1664
Calcite | Level 5

Thank you for the tips RW9, they are duly noted.

 

I also advise you to read the followings articles which compare hash tables vs SQL joins vs merge : http://support.sas.com/resources/papers/proceedings09/071-2009.pdf and http://www2.sas.com/proceedings/sugi31/244-31.pdf

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2411 views
  • 1 like
  • 5 in conversation