Greetings,
I have a question about the following example dataset of 3 subjects and condition level responses (0, 1, 2) for each visit:
Subject | visit | condition_level |
A | 1 | 0 |
A | 2 | 0 |
A | 3 | 1 |
B | 1 | 0 |
B | 2 | 1 |
B | 3 | 0 |
B | 4 | 0 |
C | 1 | 1 |
C | 2 | 2 |
C | 3 | 1 |
What I want to do is specify for all the subjects that they cannot regress on the condition level: i.e. once a person reach 1 or 2, they cannot go back to 0 or 1, and must instead keep the previous level. For the dataset I would like to modify it to the following:
Subject | visit | condition_level |
A | 1 | 0 |
A | 2 | 0 |
A | 3 | 1 |
B | 1 | 0 |
B | 2 | 1 |
B | 3 | 1 |
B | 4 | 1 |
C | 1 | 1 |
C | 2 | 2 |
C | 3 | 2 |
I've tried using the lag() function and retain statements, but so far nothing has worked for me. Perhaps an iterative DO loop will be needed?
I'd appreciate any help/advice here.
Hi @MigMaster12 Assuming I understand your requirement -
data have;
input Subject $ visit condition_level;
cards;
A 1 0
A 2 0
A 3 1
B 1 0
B 2 1
B 3 0
B 4 0
C 1 1
C 2 2
C 3 1
;
data Want;
do until(last.subject);
set have;
by subject;
if _n_ then _condition_level=condition_level;
if not first.subject and condition_level in (1,2) then _n_=0;
output;
end;
drop condition_level;
rename _condition_level=condition_level;
run;
The only thing I'm not quite clear is C reached condition level in 1st visit. Therefore to meet your requirement, I have assumed to start the check from the 2nd record of each subject
I suggest you keep a _HISTORIC_MAX variable, and compare it to each incoming condition_level. Update each as appropriate:
data have;
input Subject $ visit condition_level;
cards;
A 1 0
A 2 0
A 3 1
B 1 0
B 2 1
B 3 0
B 4 0
C 1 1
C 2 2
C 3 1
;
data want (drop=_:);
set have;
by subject;
retain _historic_max;
if first.subject then _historic_max=condition_level; else
if condition_level>_historic_max then _historic_max=condition_level; else
condition_level=max(_historic_max,condition_level);
run;
data have;
input Subject $ visit condition_level;
cards;
A 1 0
A 2 0
A 3 1
B 1 0
B 2 1
B 3 0
B 4 0
C 1 1
C 2 2
C 3 1
;
data want;
set have;
by subject;
retain want;
if first.subject then call missing(want);
want=max(want,condition_level) ;
run;;
Another, slightly different take:
data want;
set have;
by subject;
retain _cl;
if first.subject
then _cl = condition_level;
else do;
condition_level = max(condition_level,_cl);
_cl = condition_level;
end;
drop _cl;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.