SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Merge two data sets by id when the two variables have equal values in different rows

Reply
Occasional Contributor
Posts: 15

Merge two data sets by id when the two variables have equal values in different rows

[ Edited ]

Hi there,

 

could you suggest for me how I could merge the two data sets by id, if I have two variables have equal values but in different rows.

 

DATA ONE;

INPUT ID VAR1 $;

CARDS;

 

1 C

1 A

1 D

2 A

2 B

3 C

;

RUN;

 

 

DATA TWO;

INPUT ID VAR2 $;

CARDS;

 

1 E

1 C

1 D

1 A

2 A

2 B

2 D

3 A

3 C

4 F

;

RUN;

 

 

 

 

/*The correct results should be as belwo

ID VAR1 VAR2

1    C        C

1    A        A

1    D        D

2    A        A

2    B       B

3    C       C

 

/*I tried the folloiwng code, but it didn't gave me the correct answer*/

data want;

merge one(in=a) two(in=b);

by ID;

if a;

if var1=var2;

run;

 

PROC Star
Posts: 847

Re: Merge two data sets by id when the two variables have equal values in different rows

Posted in reply to elias_2020
proc sql;
	create table want as
	select ONE.ID, VAR1, VAR2
	from ONE left join TWO
	on ONE.ID=TWO.ID and ONE.VAR1=TWO.VAR2;
quit;
Occasional Contributor
Posts: 15

Re: Merge two data sets by id when the two variables have equal values in different rows

[ Edited ]

Thanks draycut,

how I could please keep the other variables in both data sets when I'm using proc sql

for example if I have age in the first data set and weight in the second data set.

 

 

PROC Star
Posts: 847

Re: Merge two data sets by id when the two variables have equal values in different rows

[ Edited ]
Posted in reply to elias_2020

What other variables? Please be more specific.

Occasional Contributor
Posts: 15

Re: Merge two data sets by id when the two variables have equal values in different rows

[ Edited ]
DATA ONE; 
  INPUT ID VAR1 $  height; 
CARDS; 
1 C  167 
1 A  168
1 D  168
2 A  172
2 B  174
3 C  171
; 
RUN; 


DATA TWO; 
  INPUT ID VAR2 $  age; 
CARDS; 
1 E  21
1 C  22  
1 D  24
1 A  25
2 A  20
2 B  22
2 D  28
3 A  24
3 C  26
4 F  27
; 
RUN; 

 

 

/*The correct results should be as belwo

ID VAR1 VAR2 height   age

1     C        C     167     22

1     A        A     168     24

1     D        D     168     24

2     A        A      172    20

2     B        B      174     22

3     C        C      171     26

 

Super User
Posts: 8,037

Re: Merge two data sets by id when the two variables have equal values in different rows

Posted in reply to elias_2020

oalbalawi wrote:
DATA ONE; 
  INPUT ID VAR1 $  height; 
CARDS; 
1 C  167 
1 A  165
1 D  168
2 A  172
2 B  174
3 C  171
; 
RUN; 


DATA TWO; 
  INPUT ID VAR2 $  age; 
CARDS; 
1 E  21
1 C  22  
1 D  24
1 A  25
2 A  20
2 B  22
2 D  28
3 A  24
3 C  26
4 F  27
; 
RUN; 

 

 

/*The correct results should be as belwo

ID VAR1 VAR2 hight   age

1     C        C     167     22

1     A        A     168     24

1     D        D     169     24

2     A        A      172    20

2     B        B      174     22

3     C        C      171     26

 


Your result does not match your data. There's no height of 169 anywhere, and id = 1 var1 = A has an age of 25.

Apart from that, my code works:

data one;
input id var1 $ age;
cards;
1 C 167
1 A 165
1 D 168
2 A 172
2 B 174
3 C 171
;
run;

data two;
input id var2 $ height;
cards;
1 E 21
1 C 22
1 D 24
1 A 25
2 A 20
2 B 22
2 D 28
3 A 24
3 C 26
4 F 27
;
run;


proc sort data=one;
by id var1;
run;

proc sort data=two;
by id var2;
run;

data want;
merge
  one (in=a)
  two (in=b rename=(var2=var1))
;
by id var1;
if a and b;
run;

proc print data=want noobs;
run;

Result:

id    var1    age    height

 1     A      165      25  
 1     C      167      22  
 1     D      168      24  
 2     A      172      20  
 2     B      174      22  
 3     C      171      26  

Only the order changes, because of the sort. If you want to preserve the orignal order, you need to create a helper variable on which to re-sort after the merge.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 15

Re: Merge two data sets by id when the two variables have equal values in different rows

[ Edited ]
Posted in reply to KurtBremser

Thanks KurtBremser.

 

Sorry, it was error typing in record the height of id=1 and the age id=1 & var1=A

 

Could you please show me any example of how I can keep the order of height and age  because  when I sort the data the age of each id will not be in the correct order.

Super User
Posts: 8,037

Re: Merge two data sets by id when the two variables have equal values in different rows

Posted in reply to elias_2020

Add a variable in table one:

data one;
input id var1 $ age;
keepsort = _n_;
cards;
1 C 167
1 A 165
1 D 168
2 A 172
2 B 174
3 C 171
;
run;

At the end, sort by that, and drop if you don't need it any longer:

proc sort data=want;
by keepsort;
run;

proc print data=want (drop=keepsort) noobs;
run;

Adapted result:

id    var1    age    height

 1     C      167      22  
 1     A      165      25  
 1     D      168      24  
 2     A      172      20  
 2     B      174      22  
 3     C      171      26  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 15

Re: Merge two data sets by id when the two variables have equal values in different rows

Posted in reply to KurtBremser

Thanks KurtBremser.

Super User
Posts: 8,037

Re: Merge two data sets by id when the two variables have equal values in different rows

Posted in reply to elias_2020

Merge on both variables:

data one;
input id var1 $;
cards;
1 C
1 A
1 D
2 A
2 B
3 C
;
run;

data two;
input id var2 $;
cards;
1 E
1 C
1 D
1 A
2 A
2 B
2 D
3 A
3 C
4 F
;
run;


proc sort data=one;
by id var1;
run;

proc sort data=two;
by id var2;
run;

data want;
merge
  one (in=a)
  two (in=b rename=(var2=var1))
;
by id var1;
if a and b;
run;

proc print data=want noobs;
run;

Result:

id    var1

 1     A  
 1     C  
 1     D  
 2     A  
 2     B  
 3     C  

Since var2 would be equal to var1 anyways, there's no need to keep it.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 9 replies
  • 151 views
  • 0 likes
  • 3 in conversation