New SAS User

asgee · Posted 02-24-2020 02:44 PM

Hi all,

I'm working on a dataset where some IDs have repeated measurements over time. A template example is shown below. The data is displayed in chronological order (i.e. time 1, time 2, time 3, etc...):

ID	Data
AAA	14.00
AAA	13.00
BB	18.00
BB	21.00
CC	19.00
CC	14.00
CC	10.00
DDD	9.00
DDD	12.00
EE	11.00
EE	17.00
EE	10.00

Each data obtained per ID was collected longitudinally, meaning that the data is in chronological order. This is important since I want to only obtain those ID's where the data was < 15.00 between two-points back-to-back.

As you can see, "AAA" would count since between time 1 and time 2, the data taken indicates that its < 15.00.

The ID "CC" on the other hand has 3 data points (19.00, 14.00, and 10.00). In this case, the cut-off (< 15.00) would apply for this subject since at Time 2 (data = 14.00) and Time 3 (10.00), were obtained right after the other.

On the other hand, the ID "EE" also has 3 data points, but the cut-off (<15.00) applies for Time 1 (data = 11.00) and Time 3 (data = 10.00), therefore, this ID would not meet the requirement of (< 15.00) AND that they have to be right after each other.

I would like to subset the subjects so that I end up with something like this:

ID	Data
AAA	14.00
AAA	13.00
CC	19.00
CC	14.00
CC	10.00
DDD	9.00
DDD	12.00

As you can see, only those ID's where the cut-off was <15.00 and where two observations were identified as <15.00 right after the other (in the case of "CC") were drawn. Important that in the case for "CC", the first time point where data = 19.00 was still drawn since I'm less concerned about the raw numbers per se, but more so that this ID at any point met the criteria of <15.00 in two subsequent time points.

I've attempted to run the code below, but it seems that this doesn't achieve the restriction that I wanted to do:

data test;
set master;
if ID ne (0 < data < 15.0) then delete;
run;

Any help would be very much appreciated!

Thanks,

AG

novinosrin · Posted 02-24-2020 02:55 PM

Hi @asgee Assuming I understand your requirement--


data have;
input ID $	Data;
cards;
AAA	14.00
AAA	13.00
BB	18.00
BB	21.00
CC	19.00
CC	14.00
CC	10.00
DDD	9.00
DDD	12.00
EE	11.00
EE	17.00
EE	10.00
;

data want;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  if 0 < data < 15.0 then _count=sum(_count,1);
  else _count=0;
  if _count>=2 then _flag=1;
 end;
 do _n_=1 to _n_;
  set have;
  if _flag then output;
 end;
 drop _:;
run;

View solution in original post

novinosrin · Posted 02-24-2020 02:55 PM

Hi @asgee Assuming I understand your requirement--


data have;
input ID $	Data;
cards;
AAA	14.00
AAA	13.00
BB	18.00
BB	21.00
CC	19.00
CC	14.00
CC	10.00
DDD	9.00
DDD	12.00
EE	11.00
EE	17.00
EE	10.00
;

data want;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  if 0 < data < 15.0 then _count=sum(_count,1);
  else _count=0;
  if _count>=2 then _flag=1;
 end;
 do _n_=1 to _n_;
  set have;
  if _flag then output;
 end;
 drop _:;
run;

asgee · Posted 02-24-2020 03:20 PM

Hi @novinosrin

Yes this is exactly what I'm looking for! Thanks for your help!

New SAS User

How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Follow Us

What is...

New SAS User

How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Re: How do I subset rows based on a range of values?

Our biggest data and AI event of the year.

Follow Us

What is...