BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Bounce
Fluorite | Level 6

Hello SAS community,


I have a dataset with 3 rows: ID, Date, Event. ID and Date are pretty self-explanatory but event contains different education phases for each ID: Kindergarten, Preschool, Primary, High School and University. Now obviously, not everyone will have completed every stage of this education (or I might be missing their data).

 

But how can I retain only the IDs who have data with every single Event (end-to-end)? For instance, in the below mock data, I want my final dataset to be in the exact same format but containing only ID 3 since it has *all* stages of the Events and ID 1 and 2 are missing certain Events.

 

Data I have:

data have;

input id date event;

1 01012000 Kindergarten

1 01012001 Preschool

1 01012003 Primary

1 01012010 High School

 

2 01012000 Primary

2 01012010 High School

 

3 01012009 Kindergarten

3 01012010 Preschool

3 01012012 Primary

3 01012019 High School

3 01012024 University

;

run;

 

Many thanks for your kind help SAS community.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @Bounce  Assuming you are having 5 distinct events, would it be easy to filter the count of distinct events=5 -


data have;
input (id date) ($) event & $20.;
cards;
1 01012000 Kindergarten
1 01012001 Preschool
1 01012003 Primary
1 01012010 High School
2 01012000 Primary
2 01012010 High School
3 01012009 Kindergarten
3 01012010 Preschool
3 01012012 Primary
3 01012019 High School
3 01012024 University
;

run;

proc sql;
create table want as
select *
from have
group by id
having count(distinct event)=5;
quit;

 

View solution in original post

1 REPLY 1
novinosrin
Tourmaline | Level 20

Hi @Bounce  Assuming you are having 5 distinct events, would it be easy to filter the count of distinct events=5 -


data have;
input (id date) ($) event & $20.;
cards;
1 01012000 Kindergarten
1 01012001 Preschool
1 01012003 Primary
1 01012010 High School
2 01012000 Primary
2 01012010 High School
3 01012009 Kindergarten
3 01012010 Preschool
3 01012012 Primary
3 01012019 High School
3 01012024 University
;

run;

proc sql;
create table want as
select *
from have
group by id
having count(distinct event)=5;
quit;