Hi--,
I need to flag the first record among two consequitve decreases in MMSE. I provide the dataset that I have and the dataset that I would like to achieve. Please help me if you can!
data Have;
input patnum mmse month;
datalines;
101 28 0
101 28 6
101 29 36
101 28 60
101 25.7 66
101 23 72
101 24 78
101 24.5 84
102 29 0
102 28 6
102 28 36
102 27 60
102 29 66
102 28 72
102 27 72
102 26 78
102 26 84
;
run;
data Want;
input patnum mmse month re_month;
datalines;
101 28 0 .
101 28 6 .
101 29 36 .
101 28 60 60
101 25.7 66 60
101 23 72 60
101 24 78 60
101 24.5 84 60
102 29 0 .
102 28 6 .
102 28 36 .
102 27 60 .
102 29 66 .
102 28 72 72
102 27 72 72
102 26 78 72
102 26 84 72
;
run;
Thanks,
MK
I've interpreted this as setting re_month as being persistant from where the two consecutive decreasing values started. Is this what you want?
data decrease;
set have;
by patnum;
save_month = lag1(month);
if patnum = lag2(patnum)
& mmse < lag1(mmse)
& lag1(mmse) < lag2(mmse) then
output;
keep patnum save_month;
rename save_month = month;
run;
data decrease;
set decrease;
by patnum;
if first.patnum;
run;
proc sql;
create table want as
select have.*,
decrease.month as re_month
from have
left join decrease
on have.patnum = decrease.patnum
and have.month >= decrease.month
order by have.patnum,
have.month;
quit;
Why are the following listed as decreases:
101 24 78 60
101 24.5 84 60
Art, CEO, AnalystFinder.com
Hi,
Well, I need to flag the first month, at which we see two consequtive decreases and retain it until the last record for each patient, regardless future increases. Does this make sense?
Thank you,
MK
I've interpreted this as setting re_month as being persistant from where the two consecutive decreasing values started. Is this what you want?
data decrease;
set have;
by patnum;
save_month = lag1(month);
if patnum = lag2(patnum)
& mmse < lag1(mmse)
& lag1(mmse) < lag2(mmse) then
output;
keep patnum save_month;
rename save_month = month;
run;
data decrease;
set decrease;
by patnum;
if first.patnum;
run;
proc sql;
create table want as
select have.*,
decrease.month as re_month
from have
left join decrease
on have.patnum = decrease.patnum
and have.month >= decrease.month
order by have.patnum,
have.month;
quit;
Hi LaurieF,
Yes, this seems to produce what I need. Thank you so much for the quick solution 🙂 I will apply it to the bigger dataset now and check if it works there too.
I think the following would run more quickly:
data want (drop=_:); set have; retain re_month; by patnum; if first.patnum then call missing(re_month); _mmse0=lag(mmse); if missing(re_month) and not last.patnum then do; _n_+1; set have (keep=mmse rename=(mmse=_mmse1)) point=_n_; if _mmse1<mmse<_mmse0 then re_month=month; end; run;
Hi art297,
Thank you so much! I will try this solution as well.
Hi art297,
I would like to ask you a small question regarding the code that you posted:
what is the _n_+1 ?
does it crete a lag, in the sense that if
missing(re_month) and not last.patnum
for a certain observation then you set the data set have of the next observation (i.e observation#+1) and then to this new observation you do:
if _mmse1<mmse<_mmse0 then re_month=month;
Also, what does "point=_n_" exactly do?
Thanks!
@ilikesas: I like to use _n_ as it is an automatic variable that doesn't appear in one's dataset and doesn't have to be dropped.
It is the same _n_ that informs you of which record you are currently working on, and using it doesn't have any adverse effect on its intended use. So, at the beginning of processing the 20th record in a file, _n_ will still be equal to 20.
I used it, in the above code, as a variable that would represent the extra record I want to process. Thus, by stating:
_n_+1; set have (keep=mmse rename=(mmse=_mmse1)) point=_n_;
it brings in the NEXT record so that we can see, in this case, if there was a decrease. When we're processing the 10th record, _n_ will be equal to 10, _n_+1 will be equal to 11.
Art, CEO, AnalystFinder.com
@art297, that code is so cunning, you could pin a tail on it and call it a weasel. I'm very impressed.
@LaurieF: I can't take credit for it .. I stole it from Howard Schreier years ago. He provides various look ahead/look back methods at:
http://www.sascommunity.org/wiki/Look-Ahead_and_Look-Back
In fact, my code was more cumbersome than it had to be and could have been shortened to:
data want (drop=_:); set have; retain re_month; by patnum; set have ( firstobs = 2 keep=mmse rename=(mmse=_mmse1)) have ( obs = 1 drop = _all_ ); if first.patnum then call missing(re_month); _mmse0=lag(mmse); if missing(re_month) and not last.patnum then if _mmse1<mmse<_mmse0 then re_month=month; run;
Art, CEO, AnalystFinder.com
I provide a slightly edited HAVE dataset, for which the SAS program by art297 doesn't seem to work well. For patient 103, the flag should be at 72 months. Instead, it flags month 0. I am not sure if there would be an easy fix?
data Have;
input patnum mmse month;
datalines;
101 28 0
101 28 6
101 29 36
101 28 60
101 25.7 66
101 23 72
101 24 78
101 24.5 84
102 29 0
102 28 6
102 28 36
102 27 60
102 29 66
102 28 72
102 27 78
102 26 84
102 27 90
103 25 0
103 24 6
103 28 36
103 27 60
103 29 66
103 28 72
103 26 78
103 25 84
103 28 90
;
run;
The code by LaurieF works fine.
Again, thank you both for your inputs!
Yes, easy fix to the code I suggested:
data want (drop=_:); set have; retain re_month; by patnum; set have ( firstobs = 2 keep=mmse rename=(mmse=_mmse1)) have ( obs = 1 drop = _all_ ); _mmse0=lag(mmse); if first.patnum then call missing(re_month); else if missing(re_month) and not last.patnum then if _mmse1<mmse<_mmse0 then re_month=month; run;
Art, CEO, AnalystFinder.com
Dear all,
Thank you for helping me the first time. If it is okay, I would like to ask for one more task, which I've been struggling for a couple of hours now. My client now wants to do another exploratry step, where we flag and retain the first month the outcome measure (mmse) drops below the threshold of 28 and never comes back up to that threshold. If a patient dropped below the threshold, went back up and then dropped again, I am supposed to flag the fist time of the last drop. Ultimately, I am trying to get to dataset "WANT_SHORT", but I suppose that I need to first get to "WANT_LONG" and then output the last patient observation to get to WANT_SHORT. I will really appreciate it if you can help me with this.
data Want_long;
input patnum mmse month re_month;
datalines;
101 28 0 .
101 28 6 .
101 29 36 .
101 28 60 .
101 25.7 66 66
101 23 72 66
101 24 78 66
101 24.5 84 66
102 29 0 .
102 28 6 .
102 28 36 .
102 27 60 .
102 29 66 .
102 28 72 .
102 27 78 78
102 26 84 78
102 27 90 78
103 25 0 .
103 24 6 .
103 28 36 .
103 27 60 .
103 29 66 .
103 28 72 .
103 26 78 .
103 25 84 .
103 28 90 .
;
run;
data Want_short;
input patnum mmse month re_month;
datalines;
101 24.5 84 66
102 27 90 78
103 28 90 .
;
run;
Thanks,
MK
If this is what you want, you owe me a beer:
proc sql;
create table below as
select h.patnum,
h.month as re_month
from have as h
left join have as above
on h.patnum = above.patnum
and above.mmse ge 28
and h.month < above.month
where h.mmse < 28
and above.month is null
order by h.patnum,
h.month;
quit;
data below;
set below;
by patnum;
if first.patnum;
run;
data want;
merge have
below;
by patnum;
if last.patnum;
run;
This generates your want_short.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.