I have a dataset with patient_type=living and patient_type=deceased, as well as a column for match_ID.
I want to keep only observations where patient_type=living and patient_type=deceased have the same match_ID, and exclude everything else.
For example,
Key Px_Type Match_ID
1115 L 0001a
1116 D 0001a
1117 D 0004b
1118 L 0003b
I want to keep observations 1115 (L) and 1116 (D) since they have the same match_ID. Is there a way to approach this in SAS?
data have;
input Key Px_Type $ Match_ID $;
cards;
1115 L 0001a
1116 D 0001a
1117 D 0004b
1118 L 0003b
;
run;
proc sql;
select *
from have
group by Match_ID
having count(distinct Px_Type) ne 1;
quit;
If your data is sorted then you could try a statement like this
if compress(Match_ID) eq lag(compress(Match_ID)) and compress(Px_Type) in ( "D","L") then output ;
Is there other data that has to be preserved? If so, what exactly would the result look like?
data have;
input Key Px_Type $ Match_ID $;
cards;
1115 L 0001a
1116 D 0001a
1117 D 0004b
1118 L 0003b
;
run;
proc sql;
select *
from have
group by Match_ID
having count(distinct Px_Type) ne 1;
quit;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.