BookmarkSubscribeRSS Feed
SarahW13
Obsidian | Level 7

Hello all!

 

I have a dataset for which I need to identify ids that had two consecutive values of ≥180.

 

Below is a sample dataset:

 

Id       date                 value

1        1/3/15              140

1        2/9/15              145

1        3/4/15              132

2        1/8/16              150

2        2/15/16            180

2        3/14/16            181

3        1/1/15              110

3        2/1/15              180

3        3/1/15              190

4        1/7/17              130

4        2/8/17              180

 

From this example, I would want to identify that id 2 and id 3 had two consecutive values of ≥180.

 

Any advice?

4 REPLIES 4
Kurt_Bremser
Super User

Retain a flag variable, use by id;

if first.id then flag = 0;
if not first.id and lag(value) >= 180 and value >= 180 then flag = 1;
if last.id and flag then output;

 

Edit: changed second comparison from "greater" (>) to "greater or equal" (>=).

Astounding
PROC Star

That looks like the right set of tools, but you might need to tweak the program.  I could be wrong (can't test it right now), but the program might erroneously flag this situation:

 

ID   value

1      100

1      200

2      100

2      200

 

If that situation turns out to be a problem, here's the fix:

 

data want;

set have;

by id;

prior_value = lag(value);

if first.id then flag=0;

else if value >= 180 and prior_value >= 180 then flag+1;

if last.id and flag;

run;

 

Kurt_Bremser
Super User

@Astounding wrote:

That looks like the right set of tools, but you might need to tweak the program.  I could be wrong (can't test it right now), but the program might erroneously flag this situation:

 

ID   value

1      100

1      200

2      100

2      200

 

If that situation turns out to be a problem, here's the fix:

 

data want;

set have;

by id;

prior_value = lag(value);

if first.id then flag=0;

else if value >= 180 and prior_value >= 180 then flag+1;

if last.id and flag;

run;

 


Just checked it:

data have;
input id value;
cards;
1      100
1      200
2      100
2      200
;
run;

data want (keep=id);
set have;
by id;
retain flag;
if first.id then flag = 0;
if not first.id and lag(value) >= 180 and value >= 180 then flag = 1;
if last.id and flag then output;
run;

Result is an empty dataset, as it should be. SAS does not optimize the if condition, so the lag() function is reliably executed in every iteration.

That's why I specifically omitted the "obvious" else for the check of "first.id", and used a separate check for "not first.id" instead.

PGStats
Opal | Level 21

Extract those IDs with:

 

data want;
do until(last.id);
    set have; by id;
    if met then if value >= 180 then consec = 1;
    met = value >= 180;
    end;
do until(last.id);
    set have; by id;
    if consec then output;
    end;
drop met consec;
run;
PG
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2364 views
  • 0 likes
  • 4 in conversation