hi, I now have two datasets(A and B), and I want a new datasets that combine these two based on what A has.
C is the output dataset that I want. In C, I only want data(fam-id, pers-id, date) that have the same fam-id and pers-id in both A and B. A and B are in different size and for the same pair of fam-id and pers-id, A only has one row, but B may have multiple rows and I want multiple rows to show up in C. If a pair of fam-id and pers-id appears only in B but not in A, then it should not appear in C. I tried to use merge in the data step, but it kept giving me output that apart from what I want, it has rows that appear only in B(two rows with fam-id=115 in the following example).
Any suggestions would be appreciated! Thank you!
A : B: C:
fam-id pers-id fam-id pers-id date fam-id pers-id date
111 1112 111 1112 01/02/2020 111 1112 01/02/2020
112 1124 111 1112 03/25/2020 111 1112 03/25/2020
113 1132 111 1112 07/08/2020 111 1112 07/08/2020
114 1145 112 1124 05/30/2020 112 1124 05/30/2020
112 1124 03/26/2020 112 1124 03/26/2020
113 1132 02/24/2020 113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.
data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;
data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;
data C;
merge A(in=inA) B(in=inB);
by fam_id pers_id;
if inA and inB;
run;
can you post the merge you tried?
For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.
data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;
data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;
data C;
merge A(in=inA) B(in=inB);
by fam_id pers_id;
if inA and inB;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.