BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
LisaZ1
Obsidian | Level 7

hi, I now have two datasets(A and B), and I want a new datasets that combine these two based on what A has.

C is the output dataset that I want. In C, I only want data(fam-id, pers-id, date) that have the same fam-id and pers-id in both A and B. A and B are in different size and for the same pair of fam-id and pers-id, A only has one row, but B may have multiple rows and I want multiple rows to show up in C. If a pair of fam-id and pers-id appears only in B but not in A, then it should not appear in C. I tried to use merge in the data step, but it kept giving me output that apart from what I want, it has rows that appear only in B(two rows with fam-id=115 in the following example).

Any suggestions would be appreciated! Thank you! 

A :                                  B:                                                         C:

fam-id  pers-id                  fam-id   pers-id         date                     fam-id   pers-id   date

  111     1112                      111       1112       01/02/2020                111       1112       01/02/2020 

  112      1124                     111       1112       03/25/2020                111       1112       03/25/2020

  113      1132                     111       1112       07/08/2020                111       1112       07/08/2020

  114      1145                     112       1124       05/30/2020                112       1124       05/30/2020

                                           112       1124       03/26/2020                112       1124       03/26/2020

                                           113       1132       02/24/2020                113       1132       02/24/2020

                                           115       1156       04/07/2020

                                           115       1156       05/06/2020

1 ACCEPTED SOLUTION

Accepted Solutions
tarheel13
Rhodochrosite | Level 12

For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.

data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;

data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;

data C;
    merge A(in=inA) B(in=inB);
    by fam_id pers_id;
    if inA and inB;
run;

View solution in original post

3 REPLIES 3
tarheel13
Rhodochrosite | Level 12

can you post the merge you tried? 

tarheel13
Rhodochrosite | Level 12

For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.

data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;

data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;

data C;
    merge A(in=inA) B(in=inB);
    by fam_id pers_id;
    if inA and inB;
run;
LisaZ1
Obsidian | Level 7
Thank you so much!! I'm new to SAS and I worked for these for almost 2 days.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 567 views
  • 2 likes
  • 2 in conversation