BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
LisaZ1
Obsidian | Level 7

hi, I now have two datasets(A and B), and I want a new datasets that combine these two based on what A has.

C is the output dataset that I want. In C, I only want data(fam-id, pers-id, date) that have the same fam-id and pers-id in both A and B. A and B are in different size and for the same pair of fam-id and pers-id, A only has one row, but B may have multiple rows and I want multiple rows to show up in C. If a pair of fam-id and pers-id appears only in B but not in A, then it should not appear in C. I tried to use merge in the data step, but it kept giving me output that apart from what I want, it has rows that appear only in B(two rows with fam-id=115 in the following example).

Any suggestions would be appreciated! Thank you! 

A :                                  B:                                                         C:

fam-id  pers-id                  fam-id   pers-id         date                     fam-id   pers-id   date

  111     1112                      111       1112       01/02/2020                111       1112       01/02/2020 

  112      1124                     111       1112       03/25/2020                111       1112       03/25/2020

  113      1132                     111       1112       07/08/2020                111       1112       07/08/2020

  114      1145                     112       1124       05/30/2020                112       1124       05/30/2020

                                           112       1124       03/26/2020                112       1124       03/26/2020

                                           113       1132       02/24/2020                113       1132       02/24/2020

                                           115       1156       04/07/2020

                                           115       1156       05/06/2020

1 ACCEPTED SOLUTION

Accepted Solutions
tarheel13
Rhodochrosite | Level 12

For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.

data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;

data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;

data C;
    merge A(in=inA) B(in=inB);
    by fam_id pers_id;
    if inA and inB;
run;

View solution in original post

3 REPLIES 3
tarheel13
Rhodochrosite | Level 12

can you post the merge you tried? 

tarheel13
Rhodochrosite | Level 12

For future, please post the data as datalines like I have done. You need to use in flags to get the desired output dataset.

data A;
input fam_id $ pers_id $;
datalines;
111 1112
112 1124
113 1132
114 1145
;
run;

data B;
format date mmddyy10.;
input fam_id $ pers_id $ date :mmddyy10.;
datalines;
111 1112 01/02/2020
111 1112 03/25/2020
111 1112 07/08/2020
112 1124 05/30/2020
112 1124 03/26/2020
113 1132 02/24/2020
115 1156 04/07/2020
115 1156 05/06/2020
;
run;

data C;
    merge A(in=inA) B(in=inB);
    by fam_id pers_id;
    if inA and inB;
run;
LisaZ1
Obsidian | Level 7
Thank you so much!! I'm new to SAS and I worked for these for almost 2 days.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 965 views
  • 2 likes
  • 2 in conversation