Hi,
I have two datasets in long format that have variables “ID” and “incident_number” in common. Each person has a unique ID number but two or more people can have the same incident_number if they violate school policy together (for example, both of them engaged in underage drinking at the same party; so the underage drinking at the party would count as one incident).
If DatasetB doesn’t have a matching ID and incident number with DatasetA, I want to delete unmatched cases in DatasetB. So, I want to select cases based on a combination of ID and incident number.
Is there a way to modify DatasetB in this way?
Have DatasetA:
ID | Incident_number | VarA | VarB |
1 | 56789 | Abcd | 1232 |
1 | 56789 | Erfsdjdgr | 94847 |
2 | 64908 | Fakjsdf;asie | 5342 |
2 | 56789 | Sajfldkjsa;ljf | 54698734 |
3 | 78621 | Kjdafjdh | 0589576 |
4 | 78621 | afsdfasdkljfa | 1154 |
4 | 78721 | Fsafd | 987888 |
Have DatasetB:
ID | Incident_number | VarX | VarY |
1 | 56789 | 56789 | galdjfoia |
1 | 56789 | 56789 | sadjfakhd |
1 | 56818 | 56818 | weoiatlk |
2 | 56789 | 124363 | Abjdkafh |
2 | 64826 | 5463 | Giagaojdfl |
3 | 78629 | 1545744654 | Ajfu |
3 | 78799 | 1265841 | Fkajhdfajh |
4 | 78621 | 9378464 | Nadfaei |
4 | 79725 | 5486513541 | Lirueoi |
9 | 79102 | 23756 | Nann |
12 | 80215 | 987543 | Ueyrf |
Want new_DatasetB
ID | Incident | VarX | VarY |
1 | 56789 | 56789 | galdjfoia |
1 | 56789 | 56789 | sadjfakhd |
2 | 56789 | 124363 | Abjdkafh |
4 | 78621 | 9378464 | Nadfaei |
Thank you for your help.
Hash Table ?
data DatasetA; input ID Incident_number VarA $ VarB ; cards4; 1 56789 Abcd 1232 1 56789 Erfsdjdgr 94847 2 64908 Fakjsdf;asie 5342 2 56789 Sajfldkjsa;ljf 54698734 3 78621 Kjdafjdh 0589576 4 78621 afsdfasdkljfa 1154 4 78721 Fsafd 987888 ;;;; run; data DatasetB; input ID Incident_number VarX VarY $; cards4; 1 56789 56789 galdjfoia 1 56789 56789 sadjfakhd 1 56818 56818 weoiatlk 2 56789 124363 Abjdkafh 2 64826 5463 Giagaojdfl 3 78629 1545744654 Ajfu 3 78799 1265841 Fkajhdfajh 4 78621 9378464 Nadfaei 4 79725 5486513541 Lirueoi 9 79102 23756 Nann 12 80215 987543 Ueyrf ;;;; run; data want; if _n_ eq 1 then do; declare hash h(dataset:'DatasetA'); h.definekey('ID','Incident_number'); h.definedone(); end; set DatasetB; if h.check()=0; run;
Xia Keshan
Try this,
proc sort data=a; by ID Incident_number; run;
proc sort data=b; by ID Incident_number; run;
data want; merge a(in=a) b(in=b); by ID Incident_number; if a=b; drop VarA VarB; run;
proc print noobs; run;
Thank you K C Wong. Is there any way of achieving the same result without merging the datasets?
Hash Table ?
data DatasetA; input ID Incident_number VarA $ VarB ; cards4; 1 56789 Abcd 1232 1 56789 Erfsdjdgr 94847 2 64908 Fakjsdf;asie 5342 2 56789 Sajfldkjsa;ljf 54698734 3 78621 Kjdafjdh 0589576 4 78621 afsdfasdkljfa 1154 4 78721 Fsafd 987888 ;;;; run; data DatasetB; input ID Incident_number VarX VarY $; cards4; 1 56789 56789 galdjfoia 1 56789 56789 sadjfakhd 1 56818 56818 weoiatlk 2 56789 124363 Abjdkafh 2 64826 5463 Giagaojdfl 3 78629 1545744654 Ajfu 3 78799 1265841 Fkajhdfajh 4 78621 9378464 Nadfaei 4 79725 5486513541 Lirueoi 9 79102 23756 Nann 12 80215 987543 Ueyrf ;;;; run; data want; if _n_ eq 1 then do; declare hash h(dataset:'DatasetA'); h.definekey('ID','Incident_number'); h.definedone(); end; set DatasetB; if h.check()=0; run;
Xia Keshan
Thank you very much xia keshan!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.