Hello,
I want to subset a dataset into two datasets where in one dataset I want to keep all rows for an ID if the value for another variable is a certain value.
For eg.
Dataset I have:
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Datasets I want are:
Dataset 1: status is all "Incomplete" for an ID.
ID | class | status |
2 | 1 | Incomplete |
2 | 2 | Incomplete |
2 | 3 | Incomplete |
3 | 1 | Incomplete |
3 | 2 | Incomplete |
Dataset 2: status is all "Complete" or combination of "Complete/ Incomplete" for an ID.
ID | class | status |
1 | 1 | complete |
1 | 2 | Incomplete |
1 | 3 | complete |
4 | 1 | complete |
4 | 2 | complete |
4 | 3 | complete |
4 | 4 | complete |
Thank you for your help.
If easy and lazy solutions are acceptable, here is one:
data have;
input ID class status $12.;
cards;
1 1 complete
1 2 Incomplete
1 3 complete
2 1 Incomplete
2 2 Incomplete
2 3 Incomplete
3 1 Incomplete
3 2 Incomplete
4 1 complete
4 2 complete
4 3 complete
4 4 complete
;
proc sql;
create table one as
select id, class, status
from have
group by id
having sum(status='Incomplete')=n(id)
order by id, class;
quit;
proc sql;
create table two as
select id, class, status
from have
group by id
having sum(status='complete')>0
order by id, class;
quit;
Simple and fast with a data step:
data ds1 ds2;
do until(last.ID);
set have; by id;
if status ne "Incomplete" then found = 1;
end;
do until(last.ID);
set have; by id;
if found then output ds2;
else output ds1;
end;
drop found;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.