DATA Step, Macro, Functions and more

Dynamic SQL Joins and Unions in a Macro DO Loop?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

Dynamic SQL Joins and Unions in a Macro DO Loop?

I'm working to merge around 25 individual survey data files, in waves of around (but not exactly) 5 at a time, and I'm finding all sorts of issues with user-defined formats, informats, and the Case IDs being used in each file. While some of these issues must be worked out by hand, I'd like to build a list of all the distinct CASE_ID values between all 25 datasets, and create comparison datasets with the formats, values and labels across each of the ~5ish dataset groups. The easiest way to do this, especially the formatting stuff, seems to be with Proc SQL. But I was hoping that I could build a macro that would generate my join statements (for the formats) and union statements (for the case ids) on the fly to save some time. Is it possible to have a macro generate syntax within a proc sql statement that goes beyond listing column/variable names and numbers?

 

I attempted to build something like this to give me a full list of the case IDs from my 25 files (which all begin with the prefix DATA), but it doesn't run the way I expected.  I get an error about expecting a "select" statement after my union statement:

%macro combineids;
proc sql;
select memname into :datasets separated by ' '
from dictionary.tables
where libname = "WORK" and memname like "DATA%";
%let i=1; %let dataset=%scan(&datasets,&i);
create table all_id as
%do i=1 %to %sysfunc(countw(&datasets.));
%if i=1 %then;
select CASE_ID from &dataset.
%else
UNION ALL select CASE_ID from &dataset.
%let i=%eval(&i+1);
%let dataset=%scan(&datasets,&I);
%end;
;
quit;
%mend;

If this is something that's even possible to do, I suspect I may need to wrap the code in a way so that SAS knows the output is executable? 

 

 

 

I realize there may be an easier way to do this with a datastep, but if it's possible, I'd prefer to create SQL syntax since I'd like to use a similar method to generate field selections and join statements to put my grouped formats files together. I haven't fully worked the logic out for that yet, but presume I'm working from a table called fmtcomp that contains all the distinct variable/format/value combinations in my datafiles, and &datasets is a macro variable containing all my proc formats cntlout format datasets. It'd look vaguely like this:

%macro compfmt;
proc sql;
select memname, format, 
%let i=1, %let dataset=%scan(&datasets,&i)
%do i=1 to %sysfunc(countw(%datasets,%str( )));
&dataset.start as start&i., &dataset.label as label&i.,
%end
from fmtcomp
%let i=1, %let dataset=%scan(&datasets,&i)
%do i=1 to %sysfunc(countw(%datasets,%str( )));
join &dataset. on fmtcomp.format = &dataset..format and fmtcomp.start = &dataset..start
%end

Is it possible to generate union and join statements within proc sql syntax like this on the fly, or am I stuck figuring out how to do both of these operations through datasteps instead (and I should note, there are type issues between the datasets that prevent a simple merge)? 


Accepted Solutions
Solution
‎07-27-2017 11:46 AM
Super User
Posts: 7,866

Re: Dynamic SQL Joins and Unions in a Macro DO Loop?

This:

%if i=1 %then;

is the culprit. You end the %then branch without doing anything.

These two lines should be put into 1 line of macro code:

%if i=1 %then select CASE_ID from &dataset.;

Note that the semicolon ends the macro statement and not the select statement that is handed off to the proc SQL.

 

I would greatly simplify the macro:

%macro combineids;
proc sql;
select memname into :datasets separated by ' '
from dictionary.tables
where libname = "WORK" and memname like "DATA%";
create table all_id as
select CASE_ID from %scan(&datasets,1)
%do i=2 %to %sysfunc(countw(&datasets.));
UNION ALL select CASE_ID from %scan(&datasets,&i)
%end;
;
quit;
%mend;

But the data step solution for a "union" is so much easier that it beats the sql by lightyears:

data all_id;
set work.data: (keep=case_id);
run;

Note that you can use indsname= in the set statement so that you can put statements into the data step that are dependent on which dataset is being read.

For the joins, you're probably better off by using my macro suggestion as a basis for your code.

And do not manipulate the index variable of a %do (or do) loop inside the lop, as that skewers the incrementation, and you'll skip (or repeat) iterations.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Solution
‎07-27-2017 11:46 AM
Super User
Posts: 7,866

Re: Dynamic SQL Joins and Unions in a Macro DO Loop?

This:

%if i=1 %then;

is the culprit. You end the %then branch without doing anything.

These two lines should be put into 1 line of macro code:

%if i=1 %then select CASE_ID from &dataset.;

Note that the semicolon ends the macro statement and not the select statement that is handed off to the proc SQL.

 

I would greatly simplify the macro:

%macro combineids;
proc sql;
select memname into :datasets separated by ' '
from dictionary.tables
where libname = "WORK" and memname like "DATA%";
create table all_id as
select CASE_ID from %scan(&datasets,1)
%do i=2 %to %sysfunc(countw(&datasets.));
UNION ALL select CASE_ID from %scan(&datasets,&i)
%end;
;
quit;
%mend;

But the data step solution for a "union" is so much easier that it beats the sql by lightyears:

data all_id;
set work.data: (keep=case_id);
run;

Note that you can use indsname= in the set statement so that you can put statements into the data step that are dependent on which dataset is being read.

For the joins, you're probably better off by using my macro suggestion as a basis for your code.

And do not manipulate the index variable of a %do (or do) loop inside the lop, as that skewers the incrementation, and you'll skip (or repeat) iterations.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 12

Re: Dynamic SQL Joins and Unions in a Macro DO Loop?

Posted in reply to KurtBremser

Perfect! This works fantastically for my ID union situation. I'll try this for my joins when I get to them, and report back.

PROC Star
Posts: 1,760

Re: Dynamic SQL Joins and Unions in a Macro DO Loop?

[ Edited ]

For gathering all the CASE_IDs, I would just use:


data CASE_IDS;
  length CASE_ID $80;
  set %do i=1 %to %sysfunc(countw(&datasets.));
         %scan(&datasets,&i)(keep=CASE_ID)
       %end;
  ;  
  if CASE_ID ne lag(CASE_ID);
run;

 

This will keep all the values regardless of their original type, and you can then summarise.

You can probably do something similar for the formats using the metadata output tables, I am unsure I followed your train of thought.

 

 

Occasional Contributor
Posts: 12

Re: Dynamic SQL Joins and Unions in a Macro DO Loop?

This is great! I was wondering how to navigate my type issue, and it seems the answer is that the loop tells it just to keep that single variable from each dataset in the merge. Thanks!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 213 views
  • 2 likes
  • 3 in conversation