BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Here is some test data:

data have;
  infile datalines;
  input RX_Chain $ Exclusion $ CLIENT $ ZIP;
  datalines;
ALD854 N IL 22309
JX139D Y NE 21234
IL603X N MT 99087
AB2009 Y NM 20208
;
run;

data have2;
  infile datalines;
  input Network $ Chain $;
  datalines;
ALD854 00455
JX139D 00139
JX307E 00540
IL603X 00651
;
run;

And here is some PROC SQL that will accomplish what I want:

proc sql;
  create table want1 as
  select t1.RX_Chain,
         t1.CLIENT,
		 t1.ZIP,
		 (CASE 
		    WHEN t1.RX_Chain = t2.NETWORK THEN t2.CHAIN
			ELSE t1.RX_Chain
		  END) format=$6. AS RX_Chain
    from have t1 left join have2 t2 on(t1.RX_Chain=t2.Network)
    where t1.Exclusion = 'N';
quit;

But, I want to accomplish this with a merge. The trouble is with the naming: RX_Chain.

 

I attempted a start at the merge and it looks like this (after sorting initial datasets):

data want;
  merge have(in=h) have2(in=h2 rename=(Network=RX_Chain));
  by RX_Chain;
  if h and h2 then RX_Chain=Chain;
  else RX_Chain=RX_Chain;
  if h and Exclusion='N';
keep RX_Claim Client ZIP; run;

However, I am not getting the same results as the PROC SQL. I know it has to do with the naming of RX_Chain and trying to assign a value to it but basically I'm trying to change the value of RX_Chain in have dataset to Chain in have2 dataset when RX_Chain = Network, or leave it as same value it was if not.

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Your SQL query contains a duplicate column RX_chain. Beside that, you should get what you want with:

 

data want;
merge 
	have(in=h1) 
	have2(in=h2 rename=Network=RX_Chain);
by RX_chain; 
if h1 and Exclusion='N';
if h2 then RX_Chain=Chain;
keep RX_Chain Client ZIP;
run;
PG

View solution in original post

4 REPLIES 4
LinusH
Tourmaline | Level 20

First, why do a merge when you got a working SQL?

Second, data step handles variable differently than SQL. By renaming Network its values will replace the values provided by Have.

DOn't rename, use Network in your if logic and assignments.

Data never sleeps
JediApprentice
Pyrite | Level 9

@LinusH But when I don't rename it, on the by statement, it throws an error "Variable RX_Chain is not on dataset have2". So it seems like I have to rename it..

LinusH
Tourmaline | Level 20

Apparently I didn't thought it through.Man Embarassed

So we are lucky that @PGStats came to the rescue Man Very Happy

Data never sleeps
PGStats
Opal | Level 21

Your SQL query contains a duplicate column RX_chain. Beside that, you should get what you want with:

 

data want;
merge 
	have(in=h1) 
	have2(in=h2 rename=Network=RX_Chain);
by RX_chain; 
if h1 and Exclusion='N';
if h2 then RX_Chain=Chain;
keep RX_Chain Client ZIP;
run;
PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 993 views
  • 1 like
  • 3 in conversation