Hello,
I want to subset a dataset into two datasets where in one dataset I want to keep all rows for an ID if the value for another variable is a certain value.
For eg.
Dataset I have:
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Datasets I want are:
Dataset 1: status is all "Incomplete" for an ID.
ID | class | status |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
Dataset 2: status is all "Complete" or combination of "Complete/ Incomplete" for an ID.
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Thank you for your help.
If easy and lazy solutions are acceptable, here is one:
data have;
input ID class status $12.;
cards;
1 1 complete
1 2 Incomplete
1 3 complete
2 1 Incomplete
2 2 Incomplete
2 3 Incomplete
3 1 Incomplete
3 2 Incomplete
4 1 complete
4 2 complete
4 3 complete
4 4 complete
;
proc sql;
create table one as
select id, class, status
from have
group by id
having sum(status='Incomplete')=n(id)
order by id, class;
quit;
proc sql;
create table two as
select id, class, status
from have
group by id
having sum(status='complete')>0
order by id, class;
quit;
Simple and fast with a data step:
data ds1 ds2;
do until(last.ID);
set have; by id;
if status ne "Incomplete" then found = 1;
end;
do until(last.ID);
set have; by id;
if found then output ds2;
else output ds1;
end;
drop found;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.