Hi all,
I have several data sets i am trying to merge using a standard dataset that contains IDs i need for the final merge. Here is an example: dataset A and dataset B need to be merged. Dataset C is the 'standard' dataset that has just ID #s that i need in the final merge. Also, is it better to use SQL or just data step? Thanks
Dataset A
| ID | var1 | var2 |
| 2 | Cat1 | 2 |
| 3 | cat2 | 4 |
| 4 | Cat1 | 6 |
| 6 | cat2 | 8 |
| 7 | Cat1 | 0 |
| 9 | cat2 | 10 |
| 10 | cat1 | 12 |
Dataset B
| ID | Var3 |
| 1 | 1.5 |
| 2 | 3.6 |
| 3 | 5.3 |
| 5 | 7.9 |
| 7 | 9.2 |
| 9 | 2.5 |
| 10 | 2.9 |
Dataset C
| ID |
| 2 |
| 3 |
| 4 |
| 7 |
| 9 |
| 10 |
Data want
| ID | var1 | var2 | Var3 |
| 2 | Cat1 | 2 | 3.6 |
| 3 | cat2 | 4 | 5.3 |
| 4 | Cat1 | 6 | |
| 7 | Cat1 | 0 | 9.2 |
| 9 | cat2 | 10 | 2.5 |
| 10 | cat1 | 12 | 2.9 |
Since the datasets are already sorted, and have a 1:1 relationship throughout, the data step is the method of choice:
data want;
merge
a
b
c (in=in_c)
;
by id;
if in_c;
run;
data a;
input ID var1 $ var2;
cards;
2 Cat1 2
3 cat2 4
4 Cat1 6
6 cat2 8
7 Cat1 0
9 cat2 10
10 cat1 12
;
data B;
input ID Var3;
cards;
1 1.5
2 3.6
3 5.3
5 7.9
7 9.2
9 2.5
10 2.9
;
data C;
input ID;
cards;
2
3
4
7
9
10
;
data want;
merge c(in=c) a b;
by id;
if c;
run;
Yes, you could use SQL Select or datastep keep/drop dataset options. The choice is yours.
Sir @Kurt_Bremser / @Tom or @Reeza can offer better slick advice if your question is related to "performance".
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.