BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil_NZ
Barite | Level 11

Hi SAS User, I face a problem about DO LOOP inside macro.

 

I generate sheet1, sheet2_outx, sheet3_outx, sheet4_outx as the code below

%macro ImportAndTranspose(
      File=
      , StartSheet=
      , EndSheet=
   );
   
   %local i;
   
   %do i = &StartSheet. %to &EndSheet.;

    %if &i=1 %then %do;
	
    proc import datafile= "&File." 
                  out= sheet&i.
                  dbms= xlsx 
                  replace;
          range= "Sheet1$A:X";
          getnames= yes;
      run;
      
      proc sort data= sheet&i.;
      by Type;
      run;
	%end;
    %else %if &i ne 1  %then %do;
      proc import datafile= "&File." 
                  out= sheet&i.
                  dbms= xlsx 
                  replace;
          range= "Sheet&i.$A:AG";
          getnames= yes;
      run;
      
      proc sort data= sheet&i.;
      by Type;
      run;

      proc transpose data= sheet&i. 
            out= sheet&i._out(rename=(COL1=s&i. _NAME_=Year) drop=_label_);
         by Type;
         VAR '1988'N - '2019'N;
      run;
	  proc print data=sheet&i._out;
	  run;

      data sheet&i._outx;
      set sheet&i._out;

      if s&i. not in: ('NA', '$$') then s&i.2=input(s&i., 32.);
	  drop s&i.;
      run;
	  %end;


   %end;
   
%mend;

%ImportAndTranspose(
      File= C:\Users\pnguyen\Desktop\New folder\Argentina__.xlsx
      , StartSheet= 1
      , EndSheet= 4);

So, Sheet1 will have 25 columns including: Type, ename, name, mnem, dscd, bdate. like the picture below

1.PNG

And sheet2_outx, sheet3_outx, sheet4_outx are similar as below (just difference in s22, s32, s42)

My97_0-1609753762905.png

My aim is to write a SQL to merge sheet1 and sheet2_outx, sheet3_outx, sheet4_outx, to get the result like that:

Type    ename                                   mnem    dscd    Year    s22   s32 s42
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1988	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1989	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1990	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1991	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1992	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1993	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1994	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1995	.     .   .
131566	CENTRAL PUERTO B DEAD - DUPL SEE 134645	NA	131566	1996	.     .   .

I tried to wrote a macro DO LOOP as below

%macro sqlloop (start, end);
	 proc sql;
	 create Table Argentina as
	 %do i = &start.+1 %to &end.;
	 select a.Type, a.ename,a.mnem,a.dscd, b.Year ,b.s&i.2
     from sheet1 as a, sheet&i._outx as b 
	 where a.Type=b.Type;
	 %end;
	 quit;
	 %mend;

	 %sqlloop(start=1,end=4);

However, the results is just like that 

My97_1-1609754459633.png

 

There is no error so far but it is not the result I want

I have a couple of questions about the situation:

1> I am not sure if there is anything wrong with my code?

2> In my macro for DO LOOP, it means that the program needs to run through sheet2_outx, sheet3_outx, sheet4_outx. Is there any chance that we can make it run only one time?

3> Is there any chance that I can put the macro sqlloop inside macro ImportandTranspose to get the result I want?

 

 

Many thanks in advance.

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

To see if something is wrong with the generated code, you can

  1. Set OPTION MPRINT and see in the log what is generated
  2. Look through the log to see if there are any error messages

What your code seems to do is this:

  • It creates a table named Argentina, which contains the data from SHEET1 and SHEET2_OUTX
  • It prints the data from merging the data from SHEET1 with the data from the other sheets one by one (you may not see the print if you have ODS ALL OFF).

It is probably a lot easier to do this as a couple of data steps. I think what you want comes down to something like this:

data merge1;
  merge 
    sheet1(keep=Type ename mnem dscd) 
    sheet2_out(keep=type year s22)
    ;
  by type;
run;

data want;
  merge
    merge1
    sheet3_out(keep=type year s32)
    sheet4_out(keep=type year s42)
    ;
  by type year;
run;

If that works, you can then turn it into a macro, so that you can easily increase the number of sheets.

 

That is the general secret to writing macros that work: First, write code that works. Then, generalize it with macro statements.

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

As a first step, write the code without(!) any macro coding. Hint: you will have to join the "secondary" tables first on type and year, then join the result to the "primary" table on type alone.

Only when you have that working code before your eyes, you will be able to see how to automate it.

Phil_NZ
Barite | Level 11

Thank you very much for your insightful idea @Kurt_Bremser  @ballardw  @s_lassen .

 

I also chopped the code down to normalize code rather than being stuck in macro as suggested.

Hi @s_lassen , your code worked perfectly:

 

1.PNG

 

 

However, I want to challenge myself quite a bit by using proc SQL in this case, but the new code written took SAS a lot of time and performed wrongly as well as costing a lot of operating memory (more than 110 GB). However, I decided to face the problem.

 

Here is my code for merging (following @Kurt_Bremser 's suggestion, I merge the "secondary" first then merge with the "first"), as below

proc sql;
create table merge1 as select *
from
  sheet2_outx
  full join
  sheet3_outx
  on sheet2_outx.Type=sheet3_outx.Type
  full join
  sheet4_out
  on sheet3_outx.Type=sheet4_out.Type

;
quit;

proc sql;
create table merge2 as 

select a.*, b.*

from sheet1 as a, merge1 as b;
quit;

The log so far

 

WARNING: Variable Type already exists on file WORK.MERGE1.
WARNING: Variable Year already exists on file WORK.MERGE1.
WARNING: Variable Type already exists on file WORK.MERGE1.
WARNING: Variable Year already exists on file WORK.MERGE1.
NOTE: Table WORK.MERGE1 created, with 14843904 rows and 5 columns.

40         quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           2.88 seconds
      cpu time            1.76 seconds
      

41         
42         proc sql;
43         create table merge2 as
44         
45         select a.*, b.*
46         
2                                                          The SAS System                             07:57 Tuesday, January 5, 2021

47         from sheet1 as a, merge1 as b;
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
WARNING: Variable Type already exists on file WORK.MERGE2.

And it took SAS a long period of time to execute this code, similar to the comment that "SQL processes the entire table in memory. As the size of your tables increase you may experience performance degradation.  MERGE processes a row at a time so rarely has memory limitations." from ArtC via the link https://communities.sas.com/t5/SAS-Programming/difference-between-merg-and-proc-sql-join/td-p/33399#....

In my case, I have 34 sheets in a file and merge them together, so do you think I should stick to MERGE or PROC SQL in case? 

 

Many thanks in advance and cheers.

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
s_lassen
Meteorite | Level 14

To see if something is wrong with the generated code, you can

  1. Set OPTION MPRINT and see in the log what is generated
  2. Look through the log to see if there are any error messages

What your code seems to do is this:

  • It creates a table named Argentina, which contains the data from SHEET1 and SHEET2_OUTX
  • It prints the data from merging the data from SHEET1 with the data from the other sheets one by one (you may not see the print if you have ODS ALL OFF).

It is probably a lot easier to do this as a couple of data steps. I think what you want comes down to something like this:

data merge1;
  merge 
    sheet1(keep=Type ename mnem dscd) 
    sheet2_out(keep=type year s22)
    ;
  by type;
run;

data want;
  merge
    merge1
    sheet3_out(keep=type year s32)
    sheet4_out(keep=type year s42)
    ;
  by type year;
run;

If that works, you can then turn it into a macro, so that you can easily increase the number of sheets.

 

That is the general secret to writing macros that work: First, write code that works. Then, generalize it with macro statements.

ballardw
Super User

Your do loop in the Proc SQL is equivalent to

Proc sql;
   create table Argentina as
   select a.typy, a.ename, a.dscd, b,year, b.<somejunk>
   from sheet1 as a,sheet1.outx as b
   where a.type=b.type;
   select a.typy, a.ename, a.dscd, b,year, b.<somejunk2>
   from sheet1 as a,sheet2.outx as b
   where a.type=b.type;
   select a.typy, a.ename, a.dscd, b,year, b.<somejunk3>
   from sheet1 as a,sheet3.outx as b
   where a.type=b.type;
<repeat until &end>

If none of this generated actual syntax errors your results windows should have filled up with the select results of i=2,3,4.

You  only have one Create table select operation. The remaining selects do not have a table to go to.

And if you try to send them each to the same table then only the last would be the result as you are not appending results, if that is the intent.

 

 

Phil_NZ
Barite | Level 11

My code using MERGE now is:

data merge1;
merge 
sheet1
sheet2_outx
;
by type;
run;

data want;
merge 
merge1
sheet3_outx 
sheet4_outx 
;
by type year;
run;

And SAS run quickly there and then!

 
 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 928 views
  • 5 likes
  • 4 in conversation