DATA Step, Macro, Functions and more

processing multiple observation per ID

Reply
New Contributor
Posts: 2

processing multiple observation per ID

I have a datset that has multiple observation per subject ID.

Have:

ID   status
001  completed
001  failed
001  failed
002  completed
002  completed
002  partially completed
002  failed
003  failed
003  failed
003  failed

From the above data I want to create a dataset with one record per subject and create a new variable(allfail) which has a value of '1' if all the observation for that subject have a 'failed' status.

Want:

ID  allfail
001 .
002 .
003 1

 

Currently I am counting the total number of observation for each subject and then checking to see if is equal to count of distinct status for each subject. Wanted to see if there is a more efficient way to do this? Thank you.

 

 

Super User
Posts: 22,874

Re: processing multiple observation per ID

SQL is great for these type of problems.

The logic here is to check if the max = min of the column which means all values are the same, and then to check if it equals the failed value.  You could also do this in a data step using First/Last logic and RETAIN. 

 

proc sql;
create table want as
select ID, case when max(status)=min(status) and max(status) = 'failed' then 1
               else . end as all_failed_flag
from have
group by ID;
quit;

proc print data=want;
run;

@Div_sas wrote:

I have a datset that has multiple observation per subject ID.

Have:

ID   status
001  completed
001  failed
001  failed
002  completed
002  completed
002  partially completed
002  failed
003  failed
003  failed
003  failed

From the above data I want to create a dataset with one record per subject and create a new variable(allfail) which has a value of '1' if all the observation for that subject have a 'failed' status.

Want:

ID  allfail
001 .
002 .
003 1

 

Currently I am counting the total number of observation for each subject and then checking to see if is equal to count of distinct status for each subject. Wanted to see if there is a more efficient way to do this? Thank you.

 

 


 

New Contributor
Posts: 2

Re: processing multiple observation per ID

Thank you!
PROC Star
Posts: 1,351

Re: processing multiple observation per ID

data have;
input ID $   status & $20.;
datalines;
001  completed
001  failed
001  failed
002  completed
002  completed
002  partially completed
002  failed
003  failed
003  failed
003  failed
;

proc sql;
create table want as
select id,sum(status='failed')=count(status) as allfail
from have
group by id;
quit;
Ask a Question
Discussion stats
  • 3 replies
  • 90 views
  • 2 likes
  • 3 in conversation