I have a dataset with patient_type=living and patient_type=deceased, as well as a column for match_ID.
I want to keep only observations where patient_type=living and patient_type=deceased have the same match_ID, and exclude everything else.
For example,
Key Px_Type Match_ID
1115 L 0001a
1116 D 0001a
1117 D 0004b
1118 L 0003b
I want to keep observations 1115 (L) and 1116 (D) since they have the same match_ID. Is there a way to approach this in SAS?
data have;
input Key     Px_Type $  Match_ID $;
cards;
1115    L               0001a
1116    D              0001a
1117    D              0004b
1118    L               0003b
;
run;
proc sql;
select *
 from have
  group by  Match_ID
   having count(distinct  Px_Type) ne 1;
quit;If your data is sorted then you could try a statement like this
if compress(Match_ID) eq lag(compress(Match_ID)) and compress(Px_Type) in ( "D","L") then output ;
Is there other data that has to be preserved? If so, what exactly would the result look like?
data have;
input Key     Px_Type $  Match_ID $;
cards;
1115    L               0001a
1116    D              0001a
1117    D              0004b
1118    L               0003b
;
run;
proc sql;
select *
 from have
  group by  Match_ID
   having count(distinct  Px_Type) ne 1;
quit;It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
