Hi,
I am quite new to SAS programming and have not been able to figure out this problem. I am looking for a way to flag records once a certain criterion is met (in the example below, the criterion is value < 3) and stays met through all the succeeding records for that ID. If any later record for that ID fails the criterion, none of the earlier records should be flagged. One more condition: flagging should only start where two consecutive records have value < 3. The have and want datasets are below. The data will be sorted by ID and Day. I can handle the case without missing values, but the case with missing values is a little confusing for me. Here is the code that I have so far.
proc sort data=have;
by id descending day ;
run;
data want;
  set have;
  by id descending day;
  if first.id then flag='Y';
  if value > 2 then flag=' ';
  retain flag;
run;
This is the have dataset
ID | day | value |
1 | 1 | 3 |
1 | 2 | 2 |
1 | 3 | 3 |
1 | 4 | 3 |
1 | 5 | 2 |
1 | 6 | 2 |
1 | 7 | 1 |
2 | 1 | 2 |
2 | 2 | 1 |
2 | 3 | 3 |
2 | 4 | 2 |
2 | 5 | 5 |
2 | 6 | 3 |
2 | 7 | 1 |
3 | 1 | 5 |
3 | 2 | 2 |
3 | 3 | 4 |
3 | 4 | 1 |
3 | 5 | . |
3 | 6 | 1 |
3 | 7 | 2 |
3 | 8 | 2 |
3 | 9 | 1 |
4 | 1 | 3 |
4 | 2 | 3 |
4 | 3 | 3 |
4 | 4 | 4 |
4 | 5 | 2 |
4 | 6 | 5 |
4 | 7 | 1 |
4 | 8 | 4 |
5 | 1 | 2 |
5 | 2 | 3 |
5 | 3 | 5 |
5 | 4 | 1 |
5 | 5 | 8 |
5 | 6 | 1 |
5 | 7 | 2 |
5 | 8 | . |
5 | 9 | . |
5 | 10 | 2 |
5 | 11 | 2 |
5 | 12 | 1 |
5 | 13 | . |
6 | 1 | 5 |
6 | 2 | 1 |
6 | 3 | 6 |
6 | 4 | 7 |
6 | 5 | 8 |
6 | 6 | 1 |
6 | 7 | 7 |
6 | 8 | 3 |
6 | 9 | 2 |
6 | 10 | 2 |
This is what I am looking to get
ID | day | value | flag |
1 | 1 | 3 | |
1 | 2 | 2 | |
1 | 3 | 3 | |
1 | 4 | 3 | |
1 | 5 | 2 | Y |
1 | 6 | 2 | Y |
1 | 7 | 1 | Y |
2 | 1 | 2 | |
2 | 2 | 1 | |
2 | 3 | 3 | |
2 | 4 | 2 | |
2 | 5 | 5 | |
2 | 6 | 3 | |
2 | 7 | 1 | |
3 | 1 | 5 | |
3 | 2 | 2 | |
3 | 3 | 4 | |
3 | 4 | 1 | |
3 | 5 | . | |
3 | 6 | 1 | Y |
3 | 7 | 2 | Y |
3 | 8 | 2 | Y |
3 | 9 | 1 | Y |
4 | 1 | 3 | |
4 | 2 | 3 | |
4 | 3 | 3 | |
4 | 4 | 4 | |
4 | 5 | 2 | |
4 | 6 | 5 | |
4 | 7 | 1 | |
4 | 8 | 4 | |
5 | 1 | 2 | |
5 | 2 | 3 | |
5 | 3 | 5 | |
5 | 4 | 1 | |
5 | 5 | 8 | |
5 | 6 | 1 | Y |
5 | 7 | 2 | Y |
5 | 8 | . | Y |
5 | 9 | . | Y |
5 | 10 | 2 | Y |
5 | 11 | 2 | Y |
5 | 12 | 1 | Y |
5 | 13 | . | Y |
6 | 1 | 5 | |
6 | 2 | 1 | |
6 | 3 | 6 | |
6 | 4 | 7 | |
6 | 5 | 8 | |
6 | 6 | 1 | |
6 | 7 | 7 | |
6 | 8 | 3 | |
6 | 9 | 2 | Y |
6 | 10 | 2 | Y |
As you can see, for ID = 1, day 2 is not flagged because there is a value >= 3 after it. For ID = 3, days 4 and 5 are not flagged because they are not two consecutive values < 3, but for ID = 5 everything from day 6 to day 13 is flagged because the criterion was met at days 6 and 7, and from then on we keep flagging as long as no value is >= 3. In short, I am looking to flag records that meet the criterion of value < 3 for 2 consecutive days and maintain it through the last record.
Something like this should work
data WANT;
  set HAVE end=LASTOBS;
  by ID descending DAY;
  /* Look ahead one record: read the next observation's ID and VALUE */
  if ^LASTOBS then set HAVE(firstobs=2 keep=ID VALUE rename=(ID=NEXTID VALUE=NEXTVAL));
  if first.ID then FLAG=' ';
  if VALUE<3 and NEXTVAL<3 and ID=NEXTID then FLAG='Y';
  else if FLAG ne 'Y' then FLAG=' ';
  retain FLAG ;
run;
Untested as there's no usable data.
The WANT data that you show are not consistent. FLAG is set blank when encountering a missing value for ID=3, but not for ID=5.
Assuming that the WANT data for ID=5 is correct, this should do the trick:
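proc sort data=have;
  by id descending day;
run;
data want;
  set have;
  by id;
  if first.id then do;
    /* Last day of this ID: start unflagged */
    flag=' ';
    /* Look ahead only when another record exists for this ID */
    if value<3 and not last.id then do;
      _N_=_N_+1;
      set have(keep=value rename=(value=next)) point=_N_;
      if next<3 then flag='Y';
    end;
  end;
  /* Going back in time, the first value >= 3 ends the flagged stretch */
  else if value>=3 then
    flag=' ';
  drop next;
  retain flag;
run;
proc sort data=want;
  by id day;
run;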
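If instead the WANT data for ID=3 is what you intend (a missing value does not end the flagged stretch, but it cannot be one of the two consecutive values that start it), a double DOW loop is one way to sketch it. Untested, and it rests on assumptions: "consecutive" is taken to mean adjacent records within an ID rather than adjacent DAY numbers, and start_day, prev_ok and prev_day are helper names made up for this illustration.

```sas
proc sort data=have;
  by id day;
run;

data want;
  length flag $1;
  /* Pass 1 over one ID: find the day where the final qualifying stretch starts */
  start_day = .;
  prev_ok = 0;
  do until (last.id);
    set have;
    by id;
    if missing(value) then prev_ok = 0;   /* missing: cannot be part of the starting pair */
    else if value >= 3 then do;           /* criterion broken: discard any earlier start */
      start_day = .;
      prev_ok = 0;
    end;
    else do;                              /* non-missing value < 3 */
      if prev_ok and missing(start_day) then start_day = prev_day;
      prev_ok = 1;
      prev_day = day;
    end;
  end;
  /* Pass 2 over the same ID: flag every record from start_day onward */
  do until (last.id);
    set have;
    by id;
    if not missing(start_day) and day >= start_day then flag = 'Y';
    else flag = ' ';
    output;
  end;
  drop start_day prev_ok prev_day;
run;
```

The first loop reads the whole ID group before anything is written, so the second loop already knows where the flag should begin; that avoids sorting descending and sorting back again. The explicit missing(value) test matters because SAS treats a missing numeric as smaller than any number, so a bare value<3 would count missing values as meeting the criterion.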