DATA Step, Macro, Functions and more

Conditional Merge of 2 Datasets

Accepted Solution Solved
Reply
Contributor
Posts: 34
Accepted Solution

Conditional Merge of 2 Datasets

Hello,

 

I have two datasets that I am trying to merge based on a condition. Esstenially trying to combine Merge and If-Then-Else statements

 

Data 1 = Identifier      Segment                Data 2   = Identifier      CC     CR               Need Data = Identifier        Segment      New_Field

                 01                  CC                                         01              1         1                                         01                   CC                1

                 02                  CC                                         02              2         3                                         02                   CC                2

                 03                  CC                                         03              3         4                                         03                   CC                3

                 01                  CR                                                                                                                 01                   CR                1

                 02                  CR                                                                                                                 02                   CR                3

 

I could merge the two dataset first and then use IF-THEN to create the new field. Is there a more efficient way of doing this by combining these two steps.

 

Thanks for the suggestions.

 


Accepted Solutions
Solution
‎03-21-2016 10:56 AM
Trusted Advisor
Posts: 1,117

Re: Conditional Merge of 2 Datasets

[ Edited ]

Hello @sai_ch,

 

So, essentially, you have these two datasets:

data have1;
input id $ segment $;
cards;
01 CC
02 CC
03 CC
01 CR
02 CR
;

data have2;
input id $ CC CR;
cards;
01 1 1
02 2 3
03 3 4
;

And you're thinking about doing something like this:

proc sort data=have1 out=s1;
by id;
run;

data want;
merge s1 have2;
by id;
     if segment='CC' then new_field=cc;
else if segment='CR' then new_field=cr;
drop cc cr;
run;

proc sort data=want;
by segment id;
run;

This is fine, but it takes three steps.

 

With PROC SQL you could do it in one step:

proc sql;
create table want as
select a.*, choosen(whichc(segment, 'CC', 'CR'), CC, CR) as new_field
from have1 a, have2 b
where a.id=b.id
order by segment, a.id;
quit;

The functions CHOOSEN and WHICHC could have been used in the data step solution as well.

Edit: But in the data step they could have been replaced by VVALUEX in conjunction with INPUT:

 new_field=input(vvaluex(segment), 12.)

Edit 2: A shorter expression for the definition of NEW_FIELD, applicable to both solutions, is:

ifn(segment='CC', CC, CR)

View solution in original post


All Replies
Contributor
Posts: 34

Re: Conditional Merge of 2 Datasets

I apologize for the layout of the datasets. Let me try again

 

Data 1 = Identifier   Segment      Data 2 = Identifier  CC    CR 
                  01           CC                                 01         1        1 

                  02           CC                                 02         2        3 
                  03           CC                                 03         3        4 
                  01           CR                                 
                  02           CR                                 

 

Need Data = Identifier  Segment   New_Field

                         01            CC            1

                         02            CC            2

                         03            CC            3

                         01            CR            1

                         02            CR            3

 

Thanks again

Solution
‎03-21-2016 10:56 AM
Trusted Advisor
Posts: 1,117

Re: Conditional Merge of 2 Datasets

[ Edited ]

Hello @sai_ch,

 

So, essentially, you have these two datasets:

data have1;
input id $ segment $;
cards;
01 CC
02 CC
03 CC
01 CR
02 CR
;

data have2;
input id $ CC CR;
cards;
01 1 1
02 2 3
03 3 4
;

And you're thinking about doing something like this:

proc sort data=have1 out=s1;
by id;
run;

data want;
merge s1 have2;
by id;
     if segment='CC' then new_field=cc;
else if segment='CR' then new_field=cr;
drop cc cr;
run;

proc sort data=want;
by segment id;
run;

This is fine, but it takes three steps.

 

With PROC SQL you could do it in one step:

proc sql;
create table want as
select a.*, choosen(whichc(segment, 'CC', 'CR'), CC, CR) as new_field
from have1 a, have2 b
where a.id=b.id
order by segment, a.id;
quit;

The functions CHOOSEN and WHICHC could have been used in the data step solution as well.

Edit: But in the data step they could have been replaced by VVALUEX in conjunction with INPUT:

 new_field=input(vvaluex(segment), 12.)

Edit 2: A shorter expression for the definition of NEW_FIELD, applicable to both solutions, is:

ifn(segment='CC', CC, CR)
Respected Advisor
Posts: 4,920

Re: Conditional Merge of 2 Datasets

Posted in reply to FreelanceReinhard

Another SQL solution would be:

 

proc sql;
create table want as
select have1.*, CC as new_field from have1 natural join have2 where segment="CC"
union all 
select have1.*, CR as new_field from have1 natural join have2 where segment="CR"
order by segment, id;
quit;
PG
Super User
Posts: 10,023

Re: Conditional Merge of 2 Datasets

data have1;
input id $ segment $;
cards;
01 CC
02 CC
03 CC
01 CR
02 CR
;

data have2;
input id $ CC CR;
cards;
01 1 1
02 2 3
03 3 4
;
proc transpose data=have2 out=x(index=(segment id) rename=(col1=New_Field _name_=segment));
 by id;
 var CC CR;
run;

data want;
 ina=0;inb=0;
 merge have1(in=ina )  x(in=inb) ;
 by  segment id;
 if ina and inb;
run;
Contributor
Posts: 34

Re: Conditional Merge of 2 Datasets

Thank you all for improving my SAS skills.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 1387 views
  • 4 likes
  • 4 in conversation