Solved: In repeated measure how do i delete unwanted ids

Douce · Posted 04-20-2016 04:18 PM

here is the the structure of my data below:

i only need ID 2 since all patients are alive, therefore trying to delete id1:

Id sex status

1 2 A

1 2 D

2 1 A

Pamela_JSRCC · Posted 04-21-2016 10:18 AM

It will sum the values from the expression "status='A'" by ID and compare that to the number of rows, so it will keep the ones where all the rows with that ID have status='A'. This is nifty usage of the group by and having clause. To see what is going on, like when debugging, you could "name" the flag and view an "intermediate" table:

data tbl;

input id sex status $;

datalines;

1 2 A

1 2 R

1 2 A

1 2 D

2 1 A

;

/* Create intermediate table to see the flag */

proc sql;

create table have as

select *, status='A' as alive_flag 'alive flag'

from tbl

;

select * from have

;

create table want as

select * from have

group by id

having sum(alive_flag)=count(*)

;

select * from want;

/*

alive

id sex status flag

-------------------

1 2 A 1

1 2 R 0

1 2 A 1

1 2 D 0

2 1 A 1

alive

id sex status flag

-------------------

2 1 A 1

*/

View solution in original post

Astounding · Posted 04-20-2016 05:01 PM

The simplest program might be this. Assuming your data set is already sorted by ID:

data want;

merge have (where=(status='D') in=delete_me) have;

by ID;

if delete_me then delete;

run;

Note that this deletes IDs that contain a "D" only. If you want to delete anything but "A" you would replace the WHERE condition:

where=(status ne 'A')

While simplest, this approach will unfortunately give you some undesirable messages in the log. The messages won't hurt in this particular case, but could under other circumstances. So a more complex, cleaner method would be:

data want;

wanted='Y';;

do until (last.id);

set have;

by ID;

if status='D' then wanted='N';

end;

do until (last.id);

set have;

by ID;

if wanted='Y' then output;

end;

run;

As before, you could conceivably change the conditions:

if status ne 'A' then wanted='N';

Douce · Posted 04-20-2016 05:09 PM

I appreciate your prompt response.

I was using the first method and in fact if was only omiting obs where id='d'; when my goal is to remove all obs with that id; ie delete id 1 all together and ending with id 2 only.

Pamela_JSRCC · Posted 04-20-2016 05:09 PM

I usually create a "discard" table before deleting from the data. I'm assuming that 'A' is for Alive and 'D' is for Dead, but what is 'R'? Anyway, the discard table below is populated by any rows where the subject is not 'A', so we can delete any ID from the original table that appears in the discard table.

proc sql;
create table discard as
select *
from tbl
where status ^= 'A'
;
delete from tbl
where id in (select id from discard);

Douce · Posted 04-20-2016 05:15 PM

R is for retransplant 🙂 thank you for your suggestion.

Ksharp · Posted 04-20-2016 10:16 PM

data have;
input Id  sex  status $;
cards;
1    2     A
1    2     A
1    2     A
1    2     D
2    1     A
2    1     A
2    1     A
;
run;

proc sql;
 create table want as
  select * from have
   group by id
    having sum(status='A')=count(*);
quit;

Babloo · Posted 04-21-2016 07:37 AM

Could you please tell me how the following 'having' clause will work in below proc sql? what variable it will sum?

proc sql;
 create table want as
  select * from have
   group by id
    having sum(status='A')=count(*);
quit;

FreelanceReinh · Posted 04-21-2016 10:12 AM

Hi @Babloo,

This SUM function adds Boolean values, more precisely: their numeric equivalents 1 (for TRUE) and 0 (for FALSE). These arise from the equation status='A'. So, an observation with STATUS='A' contributes a 1 to the sum and other observations a 0. Thus, sum(status='A') is the number of observations with STATUS='A' in the BY group (due to the GROUP BY clause). If this number is equal to count(*), the number of all observations in the BY group, this means that all observations in the BY group must have STATUS='A'.

Pamela_JSRCC · Posted 04-21-2016 10:18 AM

It will sum the values from the expression "status='A'" by ID and compare that to the number of rows, so it will keep the ones where all the rows with that ID have status='A'. This is nifty usage of the group by and having clause. To see what is going on, like when debugging, you could "name" the flag and view an "intermediate" table:

data tbl;

input id sex status $;

datalines;

1 2 A

1 2 R

1 2 A

1 2 D

2 1 A

;

/* Create intermediate table to see the flag */

proc sql;

create table have as

select *, status='A' as alive_flag 'alive flag'

from tbl

;

select * from have

;

create table want as

select * from have

group by id

having sum(alive_flag)=count(*)

;

select * from want;

/*

alive

id sex status flag

-------------------

1 2 A 1

1 2 R 0

1 2 A 1

1 2 D 0

2 1 A 1

alive

id sex status flag

-------------------

2 1 A 1

*/

In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Re: In repeated measure how do i delete unwanted ids

Registration is open

SAS Training: Just a Click Away