Help using Base SAS procedures

How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

Accepted Solution Solved
Reply
Contributor
Posts: 41
Accepted Solution

How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

I have two questions...

 

1. Is there a way to simplify the repetitiveness of the PROC SQLs below?

 

Specifically, I used repeated proc sqls to

 

STEP A) Take out any inpatient claims with the DX codes 123, 12345 and compile them into "compiled" dataset, AND

STEP B) Take out all eligibility claims from eligibility files if their enrolid shows up in "compiled"

 

2. I think if a patient has enrolID 12345 in year 2011, and then loses eligibility, another patient may appear as enrolID 12345. How can I change the code below to make sure  STEP B takes out eligibility claims that match on enrolID and a corresponding "year" variable (assuming both eligibility claims and inpatient claims has a common 'year' variable?)

 

Step A and step B in its current state below.

 

/*0.1.0 SETUP LIBS */

libname raw "N:\" ACCESS = READONLY;
libname temp "M:\";

/*0.1.1 GENERATE LIST OF PTS WITH CORRECT DX CODE*/

PROC SQL NOPRINT;
	CREATE TABLE gg_def10 AS
	SELECT *
	FROM 'N:\ccaes103.sas7bdat'
	WHERE (DX1 IN ('123', '12345')) 
	AND (AGE >= (18));
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE gg_def11 AS
	SELECT *
	FROM 'N:\ccaes113.sas7bdat'
	WHERE (DX1 IN ('123', '12345')) 
	AND (AGE >= (18));
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE gg_def12 AS
	SELECT *
	FROM 'N:\ccaes122.sas7bdat'
	WHERE (DX1 IN ('123', '12345')) 
	AND (AGE >= (18));
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE gg_def13 AS
	SELECT *
	FROM 'N:\ccaes130.sas7bdat'
	WHERE (DX1 IN ('123', '12345')) 
	AND (AGE >= (18));
QUIT;

/*Stack the inpatient claims from 2010 - 2013Q3*/

data temp.compiled;
	set gg_def10 gg_def11 gg_def12 gg_def13;

/*0.1.1 CLEAN ELIGIBILITY */

	/*0.1.1.1 COLLAPSE ELIGIBILITY CLAIMS */ 
	
PROC SQL NOPRINT;
	CREATE TABLE elig_10 AS
	SELECT *
	FROM 'N:\ccaet103.sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE elig_11 AS
	SELECT *
	FROM 'N:\ccaet113.sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE elig_12 AS
	SELECT *
	FROM 'N:\ccaet122.sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;

PROC SQL NOPRINT;
	CREATE TABLE elig_13 AS
	SELECT *
	FROM 'N:\ccaet132.sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;

data temp.compiled_elig;
	set elig_10 elig_11 elig_12 elig_13;

Accepted Solutions
Highlighted
Solution
a month ago
Super User
Posts: 3,365

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

You don't need macro. Just read all of your datasets in a DATA step:

 

data temp.compiled;
  set raw.ccaes103
        raw.ccaes113
        raw.ccaes122
        raw.ccaes130;
  where dx1 in ('123', '12345') and age >= 18;
run;

View solution in original post


All Replies
Highlighted
Solution
a month ago
Super User
Posts: 3,365

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

You don't need macro. Just read all of your datasets in a DATA step:

 

data temp.compiled;
  set raw.ccaes103
        raw.ccaes113
        raw.ccaes122
        raw.ccaes130;
  where dx1 in ('123', '12345') and age >= 18;
run;
Contributor
Posts: 41

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

Thanks! And would that also work for the second half of the program above?

 

So instead of repeating below x4  

 

PROC SQL NOPRINT;
	CREATE TABLE elig_10 AS
	SELECT *
	FROM 'N:\ccaet103.sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;

.....

 

 we would have 

 

data temp.elig;
  set raw.ccaet103
        raw.ccaet113
        raw.ccaet122
        raw.ccaet130;
  where enrolid in (select distinct enrolid from temp.compiled);
run;

 

Contributor RM6
Contributor
Posts: 24

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

SELECT DISTINCT ENROLID from temp.compiled is  sql statements which does not run in data step

Contributor
Posts: 41

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

I see ... in that case is there a way to make the bottom series of proc SQLs less repetitive? :0
Contributor RM6
Contributor
Posts: 24

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

try this,

	proc sql;
	create table temp.compiled_elig as
	select * from (

		select * from 'N:\ccaet103.sas7bdat'
		union all
		select * from 'N:\ccaet113.sas7bdat'
		union all
		select * from 'N:\ccaet122.sas7bdat'
		union all
		select * from 'N:\ccaet132.sas7bdat'
	)
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled)
	;
	quit;
Contributor RM6
Contributor
Posts: 24

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

an other alternative to the whole code.


	proc sql;
	create table temp.compiled as
	select * from 'N:\ccaet103.sas7bdat'
	union all
	select * from 'N:\ccaet113.sas7bdat'
	union all
	select * from 'N:\ccaet122.sas7bdat'
	union all
	select * from 'N:\ccaet132.sas7bdat'
	;


	create table temp.compiled_elig as
	select * from temp.compiled
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled
					 where (DX1 IN ('123', '12345')) AND (AGE >= (18))
					  )
	;
	quit;
Contributor
Posts: 41

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

When I use the first part of your code, it says

 

ERROR: Column 12 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 13 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 17 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 26 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 13 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 26 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 13 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
ERROR: Column 26 from the first contributor of UNION ALL is not the same type as its
       counterpart from the second.
294          quit;

Any idea what this means? :/

 

 

Specifically, this is what I ran:

 

		proc sql;
			create table temp.compiled as
			select * from (
				select * from raw.ccaes103 union all
				select * from raw.ccaes113 union all
				select * from raw.ccaes122 union all
				select * from raw.ccaes132)
			where dx1 in ('123', '12345') and age >= 18;
		quit;
Contributor RM6
Contributor
Posts: 24

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

one way to simplify your code


%macro data_processing(dsn=,file=);
PROC SQL NOPRINT;
	CREATE TABLE &dsn. AS
	SELECT *
	FROM 'N:\&file..sas7bdat'
	WHERE (DX1 IN ('123', '12345')) 
	AND (AGE >= (18));
QUIT;
%mend;


%macro data_processing1(dsn=,file=);
PROC SQL NOPRINT;
	CREATE TABLE &dsn. AS
	SELECT *
	FROM 'N:\&file..sas7bdat'
	WHERE ENROLID IN (SELECT DISTINCT ENROLID from temp.compiled);
QUIT;
%mend;

%data processing(dsn=gg_def10,file=ccaes103);
%data processing(dsn=gg_def11,file=ccaes113);
%data processing(dsn=gg_def12,file=ccaes122);
%data processing(dsn=gg_def13,file=ccaes130);


data temp.compiled;
	set gg_def10 gg_def11 gg_def12 gg_def13;


%data processing1(dsn=elig_10,file=ccaes103);
%data processing1(dsn=elig_11,file=ccaes113);
%data processing1(dsn=elig_12,file=ccaes122);
%data processing1(dsn=elig_13,file=ccaes130);


data temp.compiled_elig;
	set elig_10 elig_11 elig_12 elig_13;
Trusted Advisor
Posts: 1,633

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

[ Edited ]

I hope that next code will be the most efficient:

 

Step 1 is identical to @SASKiwi's code:

data temp.compiled  enrollids(keep=enrollid);
  set raw.ccaes103
        raw.ccaes113
        raw.ccaes122
        raw.ccaes130;
  where dx1 in ('123', '12345') and age >= 18;
run;

Step 2 creates distinct values of ENROLLID:

proc sort data=enrollids out=enrollids NODUPKEY;
by enrollid; run;

Last step:

data temp.compiled_elig;
  merge raw.ccaet103 (in=in103)
        raw.ccaet113 (in=in113)
        raw.ccaet122 (in=in122)
        raw.ccaet132 (in=in132)
        enrollids    (in=inids);
  by enrollid;   /* assumes all inputs are sorted by EnrollID */
       if inids and (in103 or in113 or in122 or in132);
run;
Contributor
Posts: 41

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

[ Edited ]

Question 1: Could I make the code more "user friendly" by allowing a modifiable inputs at the top? e.g. 

 

%let inputs = ccaes103 ccaes 113 ccaes122 ccaes130

and then somehow incorporate that to the code later? 

 

I can see how we may incorporate this on the dx and age side, e.g. instead of 

 

data temp.compiled  enrollids(keep=enrollid);
  set raw.ccaes103
        raw.ccaes113
        raw.ccaes122
        raw.ccaes130;
  where dx1 in ('123', '12345') and age >= 18;
run;

we have something like 

 

%let dx1 = 123, 12345;
%let age_cutoff = 18;

data temp.compiled  enrollids(keep=enrollid);
  set raw.ccaes103
        raw.ccaes113
        raw.ccaes122
        raw.ccaes130;
  where dx1 in &dx1. and age >= &age_cutoff.;
run;

 

Question 2: isn't data merge adding all the datasets horizontally? Since these are datasets from different years, should we instead be stacking vertically?

Super User
Super User
Posts: 8,349

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

Question 1) To be honest, I have seen many implementations of macro lists and such like in over 15years, and there is one consistent thing I have found, they never make code more user friendly or modifiable.  First there is rarely any documentation to adequately explain the code or parameters, and secondly due to that, those paramter lists can cause fails and bugs of all kinds in the code.  For instance a simple step where the name in the list doesn't exist, will you check for this before your code, what about not containing the expected variables, or the right type, correct values etc.  The list goes on.

 

Question 2) Data merging (joining) is the means of putting data across the page, based on joining variables.  Setting data is the method of setting data one under the other.

 

Finally if you want to make your code more robust, put your data into a set library, then you can apply a list to get all of them without knowing up front:

data temp.compiled  enrollids(keep=enrollid);
  set raw.ccaes:;
  where dx1 in ('123', '12345') and age >= 18;
run;

This will take all datasets in raw with ccaes prefix. 

Super User
Super User
Posts: 8,349

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

[ Edited ]

i do have to ask, where did you learn to code like this.  I have never in 15 plus years seen anyone use:

	FROM 'N:\ccaes103.sas7bdat'

You create a libname called raw which points to the datasets and never use them?

  from RAW.CCAES103
Is how it should look.  Anyways @Shmuel has the correct answer here, concatenate your data before processing removes the repetitiveness in this example and most macro coding examples I have seen.  1 dataset is almost always preferable to multiple ones.

 

Contributor
Posts: 41

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

Thank you! Yea, my coding is horrible Smiley Sad, but thank you for helping me learn a bit more! 

Super User
Posts: 3,365

Re: How to wrap repetitive PROC SQLs into a macro? Or otherwise reduce repetitiveness of proc sqls?

@RW9 - that style of referencing SAS datasets means you avoid assigning a LIBNAME entirely. I've used it occasionally if I'm just wanting to read a single dataset from a folder that isn't used for anything else. Funny I didn't learn about it until fairly recently myself. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 516 views
  • 12 likes
  • 5 in conversation