Hello SAS Community,
I would like to calculate a number of months it takes a person to get back to the previous program from his/her first exit from the program.
In the attached sample data, I have listed two program types and individuals' status for 10 months (in the real data, I have many programs and almost 5 years of data). Two variables in the last columns (NumofMonthtoES from 1st ES exit and NumofMonthtoES from 1st TH exit) are my desired variables. I want to know the Number of months it takes a person to get to ES from first ES/TH exits.
I would truly appreciate the help!
Thank you.
Hi, @HarryB,
Sorry to take so long to reply. It's been a long day at work.
OK! That explanation makes sense. I think I get it now.
Here's what I've come up with in terms of SAS code, and the results are below.
LIBNAME Returns XLSX '.\SampleData\SampleData.xlsx';
DATA Want;
DROP _:;
SET Returns.ES_TH;
ARRAY ES_Array [*] ES1 - ES10;
ARRAY TH_Array [*] TH1 - TH10;
_ES_Cnt = 0;
_ES_True = 0;
_ES_Done = 0;
_TH_Cnt = 0;
_TH_True = 0;
_TH_Done = 0;
_TH_Zero = 0;
DO _i = 1 TO DIM(ES_Array);
IF TH_Array[_i] = 1 THEN
DO;
_TH_True = 1;
END;
ELSE
DO;
IF _TH_True THEN
_TH_Zero = 1;
END;
IF ES_Array[_i] = 0 THEN
DO;
IF _ES_True AND
NOT _ES_Done THEN
DO;
_ES_Cnt + 1;
END;
IF _TH_True AND
_TH_Zero AND
NOT _TH_Done THEN
DO;
_TH_Cnt + 1;
END;
END;
ELSE
IF ES_Array[_i] = 1 THEN
DO;
IF _ES_Cnt >= 1 AND
NOT _ES_Done THEN
DO;
Num_Months_ES = _ES_Cnt;
_ES_Done = 1;
END;
ELSE
DO;
_ES_True = 1;
END;
IF _TH_Cnt >= 1 AND
NOT _TH_Done THEN
DO;
Num_Months_TH = _TH_Cnt;
_TH_Done = 1;
END;
END;
IF _ES_Done AND
_TH_Done THEN
_i = DIM(ES_Array);
END;
RUN;
Results:
Jim
This looks like array processing would be a good approach. Something like this:
LIBNAME Returns XLSX '.\SampleData\SampleData.xlsx';
DATA Want;
DROP _:;
SET Returns.ES_TH;
ARRAY ES_Array [*] ES1 - ES10;
ARRAY TH_Array [*] TH1 - TH10;
_ES_Cnt = 0;
_ES_True = 0;
DO _i = 1 TO DIM(ES_Array);
IF ES_Array[_i] = 0 AND
_ES_True THEN
DO;
_ES_Cnt + 1;
END;
ELSE
IF ES_Array[_i] = 1 THEN
IF _ES_Cnt >= 1 THEN
DO;
Num_Months_ES = _ES_Cnt;
_i = DIM(ES_Array);
END;
ELSE
DO;
_ES_True = 1;
END;
END;
_TH_Cnt = 0;
_TH_True = 0;
DO _i = 1 TO DIM(TH_Array);
IF TH_Array[_i] = 0 AND
_TH_True THEN
DO;
_TH_Cnt + 1;
END;
ELSE
IF TH_Array[_i] = 1 THEN
IF _TH_Cnt >= 1 THEN
DO;
Num_Months_TH = _TH_Cnt;
_i = DIM(TH_Array);
END;
ELSE
DO;
_TH_True = 1;
END;
END;
RUN;
Which would give you the following results:
I believe you have a bit of an error in your sample results. I think the 4th TH should be "1" not "2", yes?
Is that the sort of thing that you were looking for?
Jim
Edit: Now that I understand that the second value (NTH here) is the number of months from quitting TH to restarting ES, I've corrected the code below. It's trivial: just change one reference (in the second
if (_firstone<_quit<10) then do ...
statement to refer to the XES array instead of the XTH array. It's marked with an "** edit here**;" comment.
I agree that arrays are the way to structure this problem. But I think you can take advantage of the DO ... UNTIL syntax to make the code simpler.
The conceptual logic is
data have;
input id es1-es10 th1-th10;
datalines;
1 1 1 0 0 0 0 1 1 0 0 1 1 1 0 0 0 1 1 0 0
2 0 0 1 1 0 0 0 0 1 1 0 0 1 1 1 0 0 0 1 1
3 0 0 1 1 1 0 0 0 1 0 0 0 1 1 1 1 1 0 1 0
4 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 1 0 0 1 1
5 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0
6 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 0 0 0
7 1 1 1 1 0 1 0 0 0 0 1 1 1 1 0 1 1 0 0 0
8 0 0 0 1 1 0 0 0 0 1 0 0 0 1 1 1 0 0 0 1
run;
data want (drop=_:);
set have;
array xes {*} es: ;
nes=0;
_firstone=whichn(1,of xes{*});
if (1<=_firstone<=8) then do _quit=_firstone+1 to 9 until (xes{_quit}=0);
end;
if (_firstone<_quit<10) then do _restart=_quit+1 to 10 until(xes{_restart}=1);
end;
if (_quit<_restart<=10) then nes=_restart-_quit;
call missing(of _:);
array xth {*} th: ;
nth=0;
_firstone=whichn(1,of xth{*});
if (1<=_firstone<=8) then do _quit=_firstone+1 to 9 until (xth{_quit}=0);
end;
if (_firstone<_quit<10) then do _restart=_quit+1 to 10 until(xes{_restart}=1); **Edit here **;
end;
if (_quit<_restart<=10) then nth=_restart-_quit;
run;
This works for array of exactly size 10. If you want to generalize, then you can use DIM(x) instead of 10, DIM(x)-1 instead of 9, DIM(x)-2 instead of 8, where x is the array name:
data want (drop=_:);
set have;
array xes {*} es: ;
nes=0;
_firstone=whichn(1,of xes{*});
if (1<=_firstone<=dim(xes)-2) then do _quit=_firstone+1 to dim(xes)-1 until (xes{_quit}=0);
end;
if (_firstone<_quit<dim(xes)) then do _restart=_quit+1 to dim(xes) until(xes{_restart}=1);
end;
if (_quit<_restart<=dim(xes)) then nes=_restart-_quit;
call missing(of _:);
array xth {*} th: ;
nth=0;
_firstone=whichn(1,of xth{*});
if (1<=_firstone<=dim(xth)-2) then do _quit=_firstone+1 to dim(xth)-1 until (xth{_quit}=0);
end;
if (_firstone<_quit<dim(xth)) then do _restart=_quit+1 to dim(xth) until(xes{_restart}=1); ** Edit here**;
end;
if (_quit<_restart<=dim(xth)) then nth=_restart-_quit;
run;
The "trick" here is the DO UNTIL construct. Once the until-condition is met, then the loop index is not incremented, and points directly to the array element of interest (_quit or _restart). If it is never met then the index is 1 greater than the loop upper limit.
Editted note: I guess the other point to make here is that the expression (x<y<z) is a logical expression equivalent to (x<y and y<z).
Hi @jimbarbour & @mkeintz,
Thank you for the prompt response. Both of your suggestions seem to work; however, there may be a confusion on my question.
I want to know how long does it take for an individual to get to ES from 1st ES exit, from 1st TH exit, and so on.
My ultimate goal is to find whether an individual returned to ES after he/she 1st exited from various types of programs and how long did it take to get to ES.
Both of your suggested solution calculate how long does it take for an individual to get to ES from 1st ES exit, how long does it take to get to TH from 1st TH exit.
I hope this is more clear than the previous post.
Thank you again for your help !!
Oh. dear. I am completely lost. I don't understand.
Can you give me a couple examples? Perhaps you could walk me through step by step?
Jim
Sorry, I should have explained it in detail.
Let's see a couple of IDs.
ID 4:
This individual first exited from the ES program at the month of 2 and re-entered at the month of 5, which means it took 2 months to get back to ES.
The same individual also happen to be in the TH program. This individual exited the TH program at the month of 2 (happen to be the same as ES 1st exit, but it could have been any number) and entered into ES at the month of 5, which means it took 2 months to get to ES after he/she 1st exited from the TH.
ID 8:
First exited from the ES at the month of 5 and re-entered at the month of 10, so took 4 months go get back to ES.
First exited from the TH at the month of 6 and entered to ES at the month of 10, so took 3 months to get to ES.
The bottom line is I want to know did they get back to ES once they first exited from ES, TH, and so on. If they entered into ES from 1st ES exit, 1st TH exit and so on, how long did it take for them to get to ES.
I hope this is helpful.
Thank you for taking your precious time and responding to my question.
Hi, @HarryB,
Sorry to take so long to reply. It's been a long day at work.
OK! That explanation makes sense. I think I get it now.
Here's what I've come up with in terms of SAS code, and the results are below.
LIBNAME Returns XLSX '.\SampleData\SampleData.xlsx';
DATA Want;
DROP _:;
SET Returns.ES_TH;
ARRAY ES_Array [*] ES1 - ES10;
ARRAY TH_Array [*] TH1 - TH10;
_ES_Cnt = 0;
_ES_True = 0;
_ES_Done = 0;
_TH_Cnt = 0;
_TH_True = 0;
_TH_Done = 0;
_TH_Zero = 0;
DO _i = 1 TO DIM(ES_Array);
IF TH_Array[_i] = 1 THEN
DO;
_TH_True = 1;
END;
ELSE
DO;
IF _TH_True THEN
_TH_Zero = 1;
END;
IF ES_Array[_i] = 0 THEN
DO;
IF _ES_True AND
NOT _ES_Done THEN
DO;
_ES_Cnt + 1;
END;
IF _TH_True AND
_TH_Zero AND
NOT _TH_Done THEN
DO;
_TH_Cnt + 1;
END;
END;
ELSE
IF ES_Array[_i] = 1 THEN
DO;
IF _ES_Cnt >= 1 AND
NOT _ES_Done THEN
DO;
Num_Months_ES = _ES_Cnt;
_ES_Done = 1;
END;
ELSE
DO;
_ES_True = 1;
END;
IF _TH_Cnt >= 1 AND
NOT _TH_Done THEN
DO;
Num_Months_TH = _TH_Cnt;
_TH_Done = 1;
END;
END;
IF _ES_Done AND
_TH_Done THEN
_i = DIM(ES_Array);
END;
RUN;
Results:
Jim
Hi @jimbarbour ,
Sorry, it took me a while to come back to this. Your updated solution perfectly works on my dataset. I truly appreciate your help.
Excellent. Glad I was able to help.
Jim
@HarryB wrote:
Hi @jimbarbour & @mkeintz,
My ultimate goal is to find whether an individual returned to ES after he/she 1st exited from various types of programs and how long did it take to get to ES.
Both of your suggested solution calculate how long does it take for an individual to get to ES from 1st ES exit, how long does it take to get to TH from 1st TH exit.
Did you actually look at the results of my program? It calculates two new variables: NES and NTH. NTH is precisely how long it takes to get to TH from 1st TH exit. There are 8 statements that calclulate NES, and (after resetting the utility variables to missing, in preparation for re-use) a similar set of 8 statements calculating NTH.
Ah. thank you @jimbarbour .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.