Hi all,
I have several data sets i am trying to merge using a standard dataset that contains IDs i need for the final merge. Here is an example: dataset A and dataset B need to be merged. Dataset C is the 'standard' dataset that has just ID #s that i need in the final merge. Also, is it better to use SQL or just data step? Thanks
Dataset A
ID | var1 | var2 |
2 | Cat1 | 2 |
3 | cat2 | 4 |
4 | Cat1 | 6 |
6 | cat2 | 8 |
7 | Cat1 | 0 |
9 | cat2 | 10 |
10 | cat1 | 12 |
Dataset B
ID | Var3 |
1 | 1.5 |
2 | 3.6 |
3 | 5.3 |
5 | 7.9 |
7 | 9.2 |
9 | 2.5 |
10 | 2.9 |
Dataset C
ID |
2 |
3 |
4 |
7 |
9 |
10 |
Data want
ID | var1 | var2 | Var3 |
2 | Cat1 | 2 | 3.6 |
3 | cat2 | 4 | 5.3 |
4 | Cat1 | 6 | |
7 | Cat1 | 0 | 9.2 |
9 | cat2 | 10 | 2.5 |
10 | cat1 | 12 | 2.9 |
Since the datasets are already sorted, and have a 1:1 relationship throughout, the data step is the method of choice:
data want;
merge
a
b
c (in=in_c)
;
by id;
if in_c;
run;
data a;
input ID var1 $ var2;
cards;
2 Cat1 2
3 cat2 4
4 Cat1 6
6 cat2 8
7 Cat1 0
9 cat2 10
10 cat1 12
;
data B;
input ID Var3;
cards;
1 1.5
2 3.6
3 5.3
5 7.9
7 9.2
9 2.5
10 2.9
;
data C;
input ID;
cards;
2
3
4
7
9
10
;
data want;
merge c(in=c) a b;
by id;
if c;
run;
Yes, you could use SQL Select or datastep keep/drop dataset options. The choice is yours.
Sir @Kurt_Bremser / @Tom or @Reeza can offer better slick advice if your question is related to "performance".
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.