BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HarryB
Obsidian | Level 7

Hello SAS Community,

 

I would like to calculate a number of months it takes a person to get back to the previous program from his/her first exit from the program.

In the attached sample data, I have listed two program types and individuals' status for 10 months (in the real data, I have many programs and almost 5 years of data). Two variables in the last columns (NumofMonthtoES from 1st ES exit and NumofMonthtoES from 1st TH exit) are my desired variables. I want to know the Number of months it takes a person to get to ES from first ES/TH exits.

I would truly appreciate the help!

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
jimbarbour
Meteorite | Level 14

Hi, @HarryB,

 

Sorry to take so long to reply.  It's been a long day at work. 

 

OK!  That explanation makes sense.  I think I get it now.  

 

Here's what I've come up with in terms of SAS code, and the results are below.

LIBNAME	Returns	XLSX	'.\SampleData\SampleData.xlsx';

DATA	Want;
	DROP	_:;
	SET	Returns.ES_TH;
	ARRAY	ES_Array	[*]	ES1 - ES10;
	ARRAY	TH_Array	[*]	TH1 - TH10;

	_ES_Cnt								=	0;
	_ES_True							=	0;
	_ES_Done							=	0;
	_TH_Cnt								=	0;
	_TH_True							=	0;
	_TH_Done							=	0;
	_TH_Zero							=	0;

	DO	_i								=	1	TO	DIM(ES_Array);
		IF	TH_Array[_i]				=	1	THEN
			DO;
				_TH_True				=	1;
			END;
		ELSE
			DO;
				IF	_TH_True					THEN
					_TH_Zero			=	1;
			END;

		IF	ES_Array[_i]				=	0	THEN
			DO;
				IF	_ES_True					AND
					NOT	_ES_Done				THEN
					DO;
						_ES_Cnt			+	1;
					END;

				IF	_TH_True					AND
					_TH_Zero					AND
					NOT	_TH_Done				THEN
					DO;
						_TH_Cnt			+	1;
					END;
			END;
		ELSE
		IF	ES_Array[_i]				=	1	THEN
			DO;
				IF	_ES_Cnt				>=	1	AND
					NOT	_ES_Done				THEN
					DO;
						Num_Months_ES	=	_ES_Cnt;
						_ES_Done		=	1;
					END;
				ELSE
					DO;
						_ES_True		=	1;
					END;

				IF	_TH_Cnt				>=	1	AND
					NOT	_TH_Done				THEN
					DO;
						Num_Months_TH	=	_TH_Cnt;
						_TH_Done		=	1;
					END;
			END;

		IF	_ES_Done							AND
			_TH_Done							THEN
			_i							=	DIM(ES_Array);
	END;
RUN;

Results:

jimbarbour_0-1600999588310.png

 

Jim

View solution in original post

14 REPLIES 14
jimbarbour
Meteorite | Level 14

This looks like array processing would be a good approach.  Something like this:

LIBNAME	Returns	XLSX	'.\SampleData\SampleData.xlsx';

DATA	Want;
	DROP	_:;
	SET	Returns.ES_TH;
	ARRAY	ES_Array	[*]	ES1 - ES10;
	ARRAY	TH_Array	[*]	TH1 - TH10;

	_ES_Cnt								=	0;
	_ES_True							=	0;
	DO	_i								=	1	TO	DIM(ES_Array);
		IF	ES_Array[_i]				=	0	AND
			_ES_True							THEN
			DO;
				_ES_Cnt					+	1;
			END;
		ELSE
		IF	ES_Array[_i]				=	1	THEN
			IF	_ES_Cnt					>=	1	THEN
				DO;
					Num_Months_ES		=	_ES_Cnt;
					_i					=	DIM(ES_Array);
				END;
			ELSE
				DO;
					_ES_True			=	1;
				END;
	END;

	_TH_Cnt								=	0;
	_TH_True							=	0;
	DO	_i								=	1	TO	DIM(TH_Array);
		IF	TH_Array[_i]				=	0	AND
			_TH_True							THEN
			DO;
				_TH_Cnt					+	1;
			END;
		ELSE
		IF	TH_Array[_i]				=	1	THEN
			IF	_TH_Cnt					>=	1	THEN
				DO;
					Num_Months_TH		=	_TH_Cnt;
					_i					=	DIM(TH_Array);
				END;
		ELSE
			DO;
				_TH_True				=	1;
			END;
	END;
RUN;

Which would give you the following results:

jimbarbour_0-1600908433754.png

 

I believe you have a bit of an error in your sample results.  I think the 4th TH should be "1" not "2", yes?

jimbarbour_1-1600908560321.png

 

Is that the sort of thing that you were looking for?

 

Jim

mkeintz
PROC Star

Edit:  Now that I understand that the second value (NTH here) is the number of months from quitting TH to restarting ES, I've corrected the code below.  It's trivial:  just change one reference  (in the second 

  if (_firstone<_quit<10) then do ...

statement to refer to the XES array instead of the XTH array.  It's marked with an "** edit here**;" comment.

 

I agree that arrays are the way to structure this problem.  But I think you can take advantage of the DO ... UNTIL syntax to make the code simpler.

 

The conceptual logic is

  1. default number of months to zero.
  2. Find the first 1  (via the whichn function), since you can't quit (encounter a 0) until after there is a 1.
  3. If the first 1 (variable _firstone) lies between 1 and 8 (for this size array), then there is room to quit and restart, so look for a quit between the first 1 and array element 9.
  4. If there is such a quit (variable _quit), then look for a restart (the next 1).
  5. If there is no value of 1 after the quit element, then the _restart variable is 11, so leave the default number of months at zero, otherwise subtract _quit from _restart:

 

data have;
  input id es1-es10 th1-th10;
datalines;
1   	1	1	0	0	0	0	1	1	0	0		1	1	1	0	0	0	1	1	0	0
2   	0	0	1	1	0	0	0	0	1	1		0	0	1	1	1	0	0	0	1	1
3   	0	0	1	1	1	0	0	0	1	0		0	0	1	1	1	1	1	0	1	0
4   	1	1	0	0	1	1	0	0	1	1		1	1	0	1	1	1	0	0	1	1
5   	0	0	0	0	1	1	1	1	1	0		0	0	0	1	1	1	1	1	1	0
6   	0	0	1	1	1	1	0	0	0	0		0	0	1	1	1	1	1	0	0	0
7   	1	1	1	1	0	1	0	0	0	0		1	1	1	1	0	1	1	0	0	0
8   	0	0	0	1	1	0	0	0	0	1		0	0	0	1	1	1	0	0	0	1
run;

data want (drop=_:);
  set have;

  array xes {*} es: ;
  nes=0;
  _firstone=whichn(1,of xes{*});
  if (1<=_firstone<=8) then do _quit=_firstone+1 to 9 until (xes{_quit}=0);
  end;
  if (_firstone<_quit<10) then do _restart=_quit+1 to 10 until(xes{_restart}=1);
  end;
  if (_quit<_restart<=10) then nes=_restart-_quit;

  call missing(of _:);
  array xth {*} th: ;
  nth=0;
  _firstone=whichn(1,of xth{*});
  if (1<=_firstone<=8) then do _quit=_firstone+1 to 9 until (xth{_quit}=0);
  end;  
  if (_firstone<_quit<10) then do _restart=_quit+1 to 10 until(xes{_restart}=1);  **Edit here **;
  end;
  if (_quit<_restart<=10) then nth=_restart-_quit;

run;

This works for array of exactly size 10.  If you want to generalize, then you can use DIM(x) instead of 10, DIM(x)-1 instead of 9, DIM(x)-2 instead of 8, where x is the array name:

 

data want (drop=_:);
  set have;

  array xes {*} es: ;
  nes=0;
  _firstone=whichn(1,of xes{*});
  if (1<=_firstone<=dim(xes)-2) then do _quit=_firstone+1 to dim(xes)-1 until (xes{_quit}=0);
  end;
  if (_firstone<_quit<dim(xes)) then do _restart=_quit+1 to dim(xes) until(xes{_restart}=1);
  end;
  if (_quit<_restart<=dim(xes)) then nes=_restart-_quit;

  call missing(of _:);
  array xth {*} th: ;
  nth=0;
  _firstone=whichn(1,of xth{*});
  if (1<=_firstone<=dim(xth)-2) then do _quit=_firstone+1 to dim(xth)-1 until (xth{_quit}=0);
  end;
  if (_firstone<_quit<dim(xth)) then do _restart=_quit+1 to dim(xth) until(xes{_restart}=1);  ** Edit here**;
  end;
  if (_quit<_restart<=dim(xth)) then nth=_restart-_quit;
run;

The "trick" here is the DO UNTIL construct.  Once the until-condition is met, then the loop index is not incremented, and points directly to the array element of interest (_quit or _restart).  If it is never met then the index is 1 greater than the loop upper limit.

 

Editted note:  I guess the other point to make here is that the expression   (x<y<z) is a logical expression equivalent to (x<y and y<z).

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
HarryB
Obsidian | Level 7

Hi @jimbarbour & @mkeintz,

 

Thank you for the prompt response. Both of your suggestions seem to work; however, there may be a confusion on my question.

I want to know how long does it take for an individual to get to ES from 1st ES exit, from 1st TH exit, and so on.

My ultimate goal is to find whether an individual returned to ES after he/she 1st exited from various types of programs and how long did it take to get to ES.

Both of your suggested solution calculate how long does it take for an individual to get to ES from 1st ES exit, how long does it take to get to TH from 1st TH exit.

I hope this is more clear than the previous post.

 

Thank you again for your help !!

 

jimbarbour
Meteorite | Level 14

@HarryB,

 

Oh. dear.  I am completely lost.  I don't understand.  

 

Can you give me a couple examples?  Perhaps you could walk me through step by step?

 

Jim

HarryB
Obsidian | Level 7

@jimbarbour ,

Sorry, I should have explained it in detail.

Let's see a couple of IDs.

ID 4:

This individual first exited from the ES program at the month of 2 and re-entered at the month of 5, which means it took 2 months to get back to ES.

The same individual also happen to be in the TH program. This individual exited the TH program at the month of 2 (happen to be the same as ES 1st exit, but it could have been any number) and entered into ES at the month of 5, which means it took 2 months to get to ES after he/she 1st exited from the TH.

 

ID 8:

First exited from the ES at the month of 5 and re-entered at the month of 10, so took 4 months go get back to ES.

First exited from the TH at the month of 6 and entered to ES at the month of 10, so took 3 months to get to ES.

 

The bottom line is I want to know did they get back to ES once they first exited from ES, TH, and so on. If they entered into ES from 1st ES exit, 1st TH exit and so on, how long did it take for them to get to ES.

I hope this is helpful.

Thank you for taking your precious time and responding to my question.

jimbarbour
Meteorite | Level 14

Hi, @HarryB,

 

Sorry to take so long to reply.  It's been a long day at work. 

 

OK!  That explanation makes sense.  I think I get it now.  

 

Here's what I've come up with in terms of SAS code, and the results are below.

LIBNAME	Returns	XLSX	'.\SampleData\SampleData.xlsx';

DATA	Want;
	DROP	_:;
	SET	Returns.ES_TH;
	ARRAY	ES_Array	[*]	ES1 - ES10;
	ARRAY	TH_Array	[*]	TH1 - TH10;

	_ES_Cnt								=	0;
	_ES_True							=	0;
	_ES_Done							=	0;
	_TH_Cnt								=	0;
	_TH_True							=	0;
	_TH_Done							=	0;
	_TH_Zero							=	0;

	DO	_i								=	1	TO	DIM(ES_Array);
		IF	TH_Array[_i]				=	1	THEN
			DO;
				_TH_True				=	1;
			END;
		ELSE
			DO;
				IF	_TH_True					THEN
					_TH_Zero			=	1;
			END;

		IF	ES_Array[_i]				=	0	THEN
			DO;
				IF	_ES_True					AND
					NOT	_ES_Done				THEN
					DO;
						_ES_Cnt			+	1;
					END;

				IF	_TH_True					AND
					_TH_Zero					AND
					NOT	_TH_Done				THEN
					DO;
						_TH_Cnt			+	1;
					END;
			END;
		ELSE
		IF	ES_Array[_i]				=	1	THEN
			DO;
				IF	_ES_Cnt				>=	1	AND
					NOT	_ES_Done				THEN
					DO;
						Num_Months_ES	=	_ES_Cnt;
						_ES_Done		=	1;
					END;
				ELSE
					DO;
						_ES_True		=	1;
					END;

				IF	_TH_Cnt				>=	1	AND
					NOT	_TH_Done				THEN
					DO;
						Num_Months_TH	=	_TH_Cnt;
						_TH_Done		=	1;
					END;
			END;

		IF	_ES_Done							AND
			_TH_Done							THEN
			_i							=	DIM(ES_Array);
	END;
RUN;

Results:

jimbarbour_0-1600999588310.png

 

Jim

HarryB
Obsidian | Level 7

Hi @jimbarbour ,

 

Sorry, it took me a while to come back to this. Your updated solution perfectly works on my dataset. I truly appreciate your help.

 

jimbarbour
Meteorite | Level 14

Excellent.  Glad I was able to help.

 

Jim

jimbarbour
Meteorite | Level 14
@Harry
Was my last answer satisfactory? If so, please mark solved otherwise please feel free to ask further questions.

Jim
jimbarbour
Meteorite | Level 14

Whoops, sorry @Harry 

 

@HarryB, how are we doing here?  Was my last post a reasonable solution?  Or do we still need some more work?

 

Jim

mkeintz
PROC Star

@HarryB wrote:

Hi @jimbarbour & @mkeintz,

 

My ultimate goal is to find whether an individual returned to ES after he/she 1st exited from various types of programs and how long did it take to get to ES.

Both of your suggested solution calculate how long does it take for an individual to get to ES from 1st ES exit, how long does it take to get to TH from 1st TH exit.


Did you actually look at the results of my program?  It calculates two new variables: NES and NTH.  NTH is precisely how long it takes to get to TH from 1st TH exit.  There are 8 statements that calclulate NES, and (after resetting the utility variables to missing, in preparation for re-use) a similar set of 8 statements calculating NTH. 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
jimbarbour
Meteorite | Level 14

@mkeintz,

 

NTH is precisely how long it takes to get to TH from 1st TH exit.

I believe you are correct. I thought @HarryB wanted that too, but he clarified that what he wants is how long it takes to get to TH ES from 1st TH exit.  All counts should be incremented until ES resumes.  

 

Jim

 

mkeintz
PROC Star

Ah.  thank you @jimbarbour .

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 2004 views
  • 5 likes
  • 4 in conversation