DATA Step, Macro, Functions and more

PROC SQL vs Merge tricky naming

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 123
Accepted Solution

PROC SQL vs Merge tricky naming

[ Edited ]

Here is some test data:

data have;
  infile datalines;
  input RX_Chain $ Exclusion $ CLIENT $ ZIP;
  datalines;
ALD854 N IL 22309
JX139D Y NE 21234
IL603X N MT 99087
AB2009 Y NM 20208
;
run;

data have2;
  infile datalines;
  input Network $ Chain $;
  datalines;
ALD854 00455
JX139D 00139
JX307E 00540
IL603X 00651
;
run;

And here is some PROC SQL that will accomplish what I want:

proc sql;
  create table want1 as
  select t1.RX_Chain,
         t1.CLIENT,
		 t1.ZIP,
		 (CASE 
		    WHEN t1.RX_Chain = t2.NETWORK THEN t2.CHAIN
			ELSE t1.RX_Chain
		  END) format=$6. AS RX_Chain
    from have t1 left join have2 t2 on(t1.RX_Chain=t2.Network)
    where t1.Exclusion = 'N';
quit;

But, I want to accomplish this with a merge. The trouble is with the naming: RX_Chain.

 

I attempted a start at the merge and it looks like this (after sorting initial datasets):

data want;
  merge have(in=h) have2(in=h2 rename=(Network=RX_Chain));
  by RX_Chain;
  if h and h2 then RX_Chain=Chain;
  else RX_Chain=RX_Chain;
  if h and Exclusion='N';
keep RX_Claim Client ZIP; run;

However, I am not getting the same results as the PROC SQL. I know it has to do with the naming of RX_Chain and trying to assign a value to it but basically I'm trying to change the value of RX_Chain in have dataset to Chain in have2 dataset when RX_Chain = Network, or leave it as same value it was if not.

 


Accepted Solutions
Solution
‎11-07-2016 04:45 PM
Respected Advisor
Posts: 4,919

Re: PROC SQL vs Merge tricky naming

Posted in reply to JediApprentice

Your SQL query contains a duplicate column RX_chain. Beside that, you should get what you want with:

 

data want;
merge 
	have(in=h1) 
	have2(in=h2 rename=Network=RX_Chain);
by RX_chain; 
if h1 and Exclusion='N';
if h2 then RX_Chain=Chain;
keep RX_Chain Client ZIP;
run;
PG

View solution in original post


All Replies
Super User
Posts: 5,424

Re: PROC SQL vs Merge tricky naming

Posted in reply to JediApprentice

First, why do a merge when you got a working SQL?

Second, data step handles variable differently than SQL. By renaming Network its values will replace the values provided by Have.

DOn't rename, use Network in your if logic and assignments.

Data never sleeps
Frequent Contributor
Posts: 123

Re: PROC SQL vs Merge tricky naming

@LinusH But when I don't rename it, on the by statement, it throws an error "Variable RX_Chain is not on dataset have2". So it seems like I have to rename it..

Super User
Posts: 5,424

Re: PROC SQL vs Merge tricky naming

Posted in reply to JediApprentice

Apparently I didn't thought it through.Man Embarassed

So we are lucky that @PGStats came to the rescue Man Very Happy

Data never sleeps
Solution
‎11-07-2016 04:45 PM
Respected Advisor
Posts: 4,919

Re: PROC SQL vs Merge tricky naming

Posted in reply to JediApprentice

Your SQL query contains a duplicate column RX_chain. Beside that, you should get what you want with:

 

data want;
merge 
	have(in=h1) 
	have2(in=h2 rename=Network=RX_Chain);
by RX_chain; 
if h1 and Exclusion='N';
if h2 then RX_Chain=Chain;
keep RX_Chain Client ZIP;
run;
PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 270 views
  • 1 like
  • 3 in conversation