Hello,
I want to subset a dataset into two datasets where in one dataset I want to keep all rows for an ID if the value for another variable is a certain value.
For eg.
Dataset I have:
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Datasets I want are:
Dataset 1: status is all "Incomplete" for an ID.
ID | class | status |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
Dataset 2: status is all "Complete" or combination of "Complete/ Incomplete" for an ID.
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Thank you for your help.
If easy and lazy solutions are acceptable, here is one:
data have;
input ID class status $12.;
cards;
1 1 complete
1 2 Incomplete
1 3 complete
2 1 Incomplete
2 2 Incomplete
2 3 Incomplete
3 1 Incomplete
3 2 Incomplete
4 1 complete
4 2 complete
4 3 complete
4 4 complete
;
proc sql;
create table one as
select id, class, status
from have
group by id
having sum(status='Incomplete')=n(id)
order by id, class;
quit;
proc sql;
create table two as
select id, class, status
from have
group by id
having sum(status='complete')>0
order by id, class;
quit;
Simple and fast with a data step:
data ds1 ds2;
do until(last.ID);
set have; by id;
if status ne "Incomplete" then found = 1;
end;
do until(last.ID);
set have; by id;
if found then output ds2;
else output ds1;
end;
drop found;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.