BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Asileon
Obsidian | Level 7

Hello @art297,

 

Thanks a lot for your code!

That is correct, client 004 met the criteria by the third re-interview; all the clients will continue to have services and re-interviews, but I am interested in the initial interview, the following 3 re-interviews and the services within the initial interview/re-interviews.

 

I will apply your code to my dataset, and let you know how it goes.

 

Thanks!!

art297
Opal | Level 21

One slight (?) discrepancy between your current description and the criteria you originally specified. You said, in this post, that "I am interested in the initial interview, the following 3 re-interviews and the services within the initial interview/re-interviews."

 

However, nothing in your criteria stipulated that the re-interviews had to be done either initially, or concurrent.

 

As such, the following ClientID would meet your original criteria AND would be selected by my code:

 

data have;
  infile datalines dlm=',';
  informat ClientID 8.;
  informat category $20.;
  informat date mmddyy10.;
  input ClientID Category Date;
  format date mmddyy10.;
  datalines;
8,initial interview,2/29/2016
8,service delivered,8/29/2016
8,re-interview,9/3/2016
8,re-interview,10/3/2016
8,re-interview,11/3/2016
8,re-interview,12/3/2016
8,service delivered,12/4/2016
8,re-interview,6/3/2017
8,service delivered,6/5/2017
8,re-interview,12/2/2017
;

Art, CEO, AnalystFinder.com

 

Asileon
Obsidian | Level 7

Thanks again for the feedback!

 

I am sorry if I created confusion with my current description; I was trying to summarize - not a good idea 🙂

 

Client 008 in the example you provided would not me one of the criteria (please see below in bold) and therefore would not be selected.

This is the criteria:

I have a dataset with multiple observations per Client ID and have to select just the clients that meet the following criteria:

  • The client must have an initial interview.
  • The client must have at least one service (cleaning house, delivering groceries, etc.) delivered within 6 months of the initial interview.
  • The client must have at least 3 re-interviews 5-7 months apart from each other.
  • The client must have at least one service delivered between the re-interviews.
  • Get a final dataset with the qualifying Client ID and only the qualifying observations.
data have;
  infile datalines dlm=',';
  informat ClientID 8.;
  informat category $20.;
  informat date mmddyy10.;
  input ClientID Category Date;
  format date mmddyy10.;
  datalines;
8,initial interview,2/29/2016
8,service delivered,8/29/2016
8,re-interview,9/3/2016      
/* Two consecutive re-interviews with no service in between, and the y are not 5-7 months apart */ 8,re-interview,10/3/2016

 

Thanks!

art297
Opal | Level 21

I think that the expanded criteria shown in the following revised code meets your latest criterion:

data have;
  infile datalines dlm=',';
  informat ClientID 8.;
  informat category $20.;
  informat date mmddyy10.;
  input ClientID Category Date;
  format date mmddyy10.;
  datalines;
1,initial interview,2/1/2016
1,service delivered,8/25/2016
1,re-interview,8/26/2016
1,service delivered,10/1/2016
1,re-interview,1/1/2017
1,service delivered,3/1/2017
1,re-interview,8/1/2017
2,initial interview,2/1/2016
2,service delivered,3/1/2016
2,re-interview,8/1/2016
2,service delivered,10/1/2016
2,re-interview,1/1/2017
2,service delivered,3/1/2017
2,re-interview,8/1/2017
3,initial interview,2/1/2016
3,re-interview,8/1/2016
3,service delivered,10/1/2016
3,re-interview,1/1/2017
3,service delivered,3/1/2017
3,re-interview,8/1/2017
4,initial interview,2/1/2016
4,service delivered,3/1/2016
4,re-interview,8/1/2016
4,service delivered,10/1/2016
4,re-interview,1/1/2017
4,service delivered,3/1/2017
4,re-interview,8/1/2017
4,service delivered,9/1/2017
4,re-interview,12/1/2017
5,initial interview,2/1/2016
5,service delivered,3/1/2016
5,re-interview,8/1/2016
5,service delivered,10/1/2016
5,re-interview,1/1/2017
5,service delivered,3/1/2017
5,re-interview,8/1/2017
6,initial interview,2/1/2016
6,service delivered,8/2/2016
6,re-interview,8/3/2016
6,service delivered,10/1/2016
6,re-interview,1/1/2017
6,service delivered,3/1/2017
6,re-interview,8/1/2017
7,initial interview,2/29/2016
7,service delivered,8/29/2016
7,re-interview,9/3/2016
7,service delivered,10/1/2016
7,re-interview,2/3/2017
7,service delivered,3/1/2017
7,re-interview,9/1/2017
8,initial interview,2/29/2016
8,service delivered,8/29/2016
8,re-interview,9/3/2016
8,re-interview,10/3/2016
8,re-interview,11/3/2016
8,re-interview,12/3/2016
8,service delivered,12/4/2016
8,re-interview,6/3/2017
8,service delivered,6/5/2017
8,re-interview,12/2/2017
;

/*
 Criterion1: initial interview (criterion_int)
 Criterion2: service within 6 months of initial interview
 Criterion3: 3 re-interviews within 5-7 months of each other,
              with each of the first two re-interviews
              followed by a service
 Criterion4: All re-interviews, except for the final re-interview,
              must be followed by a service
*/

data want;
  set have;
  by ClientID;
  array reint(10);
  retain criterion1 criterion2 reint:;
  last_category=lag(category);
  last_date=lag(date);
  if first.ClientID then do;
    call missing(criterion1);
    call missing(criterion2);
    call missing(last_category);
    call missing(last_date);
    counter1=0;
    counter2=0;
  end;
  if category eq 'initial interview' then criterion1=date;
  else if not missing(criterion1) then do;
    if missing(criterion2) and category eq 'service delivered'
     and date le intnx('month',criterion1,6,'sameday')
     then criterion2=date;
    else if last_category eq 're-interview' and 
        category eq 'service delivered' then do;
         counter1+1;
         reint(counter1)=last_date;
    end;
    else if category eq 're-interview' and counter1 ge 2 then do;
       counter1+1;
       reint(counter1)=date;
    end;
    else if category eq 're-interview' and last_category eq 're-interview' then do;
      call missing(criterion1);
      call missing(criterion2);
      counter1=0;
    end;
  end;
  if last.ClientID and criterion1 and criterion2 and counter1 ge 3 then do;
    counter2=counter1;
    do i=1 to counter1-1;
      if not(intnx('month',reint(i),5,'sameday')<=
       reint(i+1)<=intnx('month',reint(i),7,'sameday')) then counter2=counter2-1;
    end;
  end;
  if counter2 ge 3;
run;

Art, CEO, AnalystFinder.com

 

Asileon
Obsidian | Level 7

Thanks @art297! Too bad I have to wait until Monday (due to SAS availability) to be able to apply the code to my data. I will let you know as soon as I do on Monday.

novinosrin
Tourmaline | Level 20

Good evening @art297, May i seek a clarification on why client8 doesn't qualify based on the data  please?

I am scratching my head here 

 

8,initial interview,2/29/2016
8,service delivered,8/29/2016 /*Interval is 6 months seems fine here as service delivered with 6 months*/
8,re-interview,9/3/2016
8,re-interview,10/3/2016
8,re-interview,11/3/2016
8,re-interview,12/3/2016  Comparing reinteviews starting from here 
8,service delivered,12/4/2016 
8,re-interview,6/3/2017     /*Interval is 6 months seems fine here as reinterview range seems  within 5-7 months*/
8,service delivered,6/5/2017
8,re-interview,12/2/2017 /*Interval is 6 months seems fine here as reinterview range seems  within 5-7 months*/

 

My eyes isn't spotting something. Thank you for your time

art297
Opal | Level 21

As I understand the rules, all re-interviews other than the last ClientID record must be followed by a service. ClientID 8 doesn't meet that criterion.

 

I don't know if @Asileon is certain of the criteria as they have expanded since the question was first asked and, in my latest post, I offered another test ClientID that may or may not require another expansion of the criteria.

 

However, this is typical, as most don't realize that defining a set of rules is, quite often, far more complex than one initially envisions.

 

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20

My oh My! Ah well, Sorry for the bother.Thank you 

Asileon
Obsidian | Level 7

Hello @art297

 

Looking at the clients in your code (thanks for adding more clients!) and the criteria that I posted, these are the ones that will qualify and the ones that will not qualify:

 

Qualify:

Client 002

Client 004

Client 005

Client 007

 

Do not qualify:

Client 001: the service delivered after the initial interview did not happen within 6 months. This is the first criteria, therefore, there is not need to look for the others.

Client 003: there is not service delivered after the initial interview.

Client 006: the service delivered after the initial interview did not happen within 6 months. 

Client 008: the three re-interviews following the initial interview do not have services delivered in between them. In addition those re-interviews did not happen 5-7 months from each others.

 

I can't wait for tomorrow to apply the code!

 

Thanks

 

 

 

data have;
  infile datalines dlm=',';
  informat ClientID 8.;
  informat category $20.;
  informat date mmddyy10.;
  input ClientID Category Date;
  format date mmddyy10.;
  datalines;
1,initial interview,2/1/2016
1,service delivered,8/25/2016
1,re-interview,8/26/2016
1,service delivered,10/1/2016
1,re-interview,1/1/2017
1,service delivered,3/1/2017
1,re-interview,8/1/2017
2,initial interview,2/1/2016
2,service delivered,3/1/2016
2,re-interview,8/1/2016
2,service delivered,10/1/2016
2,re-interview,1/1/2017
2,service delivered,3/1/2017
2,re-interview,8/1/2017
3,initial interview,2/1/2016
3,re-interview,8/1/2016
3,service delivered,10/1/2016
3,re-interview,1/1/2017
3,service delivered,3/1/2017
3,re-interview,8/1/2017
4,initial interview,2/1/2016
4,service delivered,3/1/2016
4,re-interview,8/1/2016
4,service delivered,10/1/2016
4,re-interview,1/1/2017
4,service delivered,3/1/2017
4,re-interview,8/1/2017
4,service delivered,9/1/2017
4,re-interview,12/1/2017
5,initial interview,2/1/2016
5,service delivered,3/1/2016
5,re-interview,8/1/2016
5,service delivered,10/1/2016
5,re-interview,1/1/2017
5,service delivered,3/1/2017
5,re-interview,8/1/2017
6,initial interview,2/1/2016
6,service delivered,8/2/2016
6,re-interview,8/3/2016
6,service delivered,10/1/2016
6,re-interview,1/1/2017
6,service delivered,3/1/2017
6,re-interview,8/1/2017
7,initial interview,2/29/2016
7,service delivered,8/29/2016
7,re-interview,9/3/2016
7,service delivered,10/1/2016
7,re-interview,2/3/2017
7,service delivered,3/1/2017
7,re-interview,9/1/2017
8,initial interview,2/29/2016
8,service delivered,8/29/2016
8,re-interview,9/3/2016
8,re-interview,10/3/2016
8,re-interview,11/3/2016
8,re-interview,12/3/2016
8,service delivered,12/4/2016
8,re-interview,6/3/2017
8,service delivered,6/5/2017
8,re-interview,12/2/2017
;

/*
 Criterion1: initial interview (criterion_int)
 Criterion2: service within 6 months of initial interview
 Criterion3: 3 re-interviews within 5-7 months of each other,
              with each of the first two re-interviews
              followed by a service
 Criterion4: All re-interviews, except for the final re-interview,
              must be followed by a service
*/

 

Asileon
Obsidian | Level 7

Good evening @novinosrin,

 

Client 008 does not qualify because the criteria:

  • The client must have at least one service delivered between the re-interviews refers to the three re-interviews following the initial interview. Also, the re-interviews are not 5-7 months from each other. I am sorry for not making this more clear. Please feel free to let me know if it is clear now 🙂

 

8,initial interview,2/29/2016
8,service delivered,8/29/2016 /*Interval is 6 months seems fine here as service delivered with 6 months*/
8,re-interview,9/3/2016
8,re-interview,10/3/2016
8,re-interview,11/3/2016

 

Thanks

 

 

novinosrin
Tourmaline | Level 20
data have;
input ClientID Category & $20. Date : mmddyy10.;
format date mmddyy10.;
datalines;
1	initial interview	2/1/2016
1	service delivered	8/25/2016
1	re-interview	8/26/2016
1	service delivered	10/1/2016
1	re-interview	1/1/2017
1	service delivered	3/1/2017
1	re-interview	8/1/2017
2	initial interview	2/1/2016
2	service delivered	3/1/2016
2	re-interview	8/1/2016
2	service delivered	10/1/2016
2	re-interview	1/1/2017
2	service delivered	3/1/2017
2	re-interview	8/1/2017
3	initial interview	2/1/2016
3	re-interview	8/1/2016
3	service delivered	10/1/2016
3	re-interview	1/1/2017
3	service delivered	3/1/2017
3	re-interview	8/1/2017
4	initial interview	2/1/2016
4	service delivered	3/1/2016
4	re-interview	8/1/2016
4	service delivered	10/1/2016
4	re-interview	1/1/2017
4	service delivered	3/1/2017
4	re-interview	8/1/2017
4	service delivered	9/1/2017
4	re-interview	12/1/2017
;


/*I have a dataset with multiple observations per Client ID and have to select just the clients that meet the following criteria:

The client must have an initial interview.
The client must have at least one service 
(cleaning house, delivering groceries, etc.) delivered within 6 months of the initial interview.
The client must have at least 3 re-interviews 5-7 months apart from each other.
The client must have at least one service delivered between the re-interviews.
Get a final dataset with the qualifying Client ID and only the qualifying observations.*/


data want;
if _n_=1 then do; 
if 0 then set have(rename=(date=_date Category=_Category));
if 0 then set have(rename=(date=__date Category=__Category));
dcl hash h(dataset:'have(rename=(date=_date Category=_Category))', multidata: 'y');
 h.definekey('ClientID','_Category');
 h.definedata(all:'y');
 h.definedone();
 dcl hash h2(dataset:'have(rename=(date=__date Category=__Category))', multidata: 'y');
 h2.definekey('ClientID');
 h2.definedata(all:'y');
 h2.definedone();
 end;
 _qual=0;
 call missing(_count,_flag);
 do until(last.clientid);
 	set have;
 	by ClientID;
	if first.clientid and Category='initial interview' then 
		do;
 			do while(h2.do_over(key:ClientID) eq 0);
			if __Category='service delivered' and intck('month',date,__date)<6 then _qual=1;
			end;
		end;
	else if Category='re-interview' and _qual=1 then 
		do;
			if h.find(key:clientid,key:Category)=0 then 
				do;
					if 5<=intck('month',date,_date)<=7  then _count+1;
					rc = h.find_next();
					do while(rc = 0);
            			if 5<=intck('month',date,_date)<=7  then _count+1;
            			rc = h.find_next();
         			end;
					if _count>=3 then _qual=1;
				end;
		end;
	else  if Category='service delivered' and  _qual=1 then
		do;
			do while(h2.do_over(key:ClientID) eq 0);
			if  missing(_flag)and __Category='re-interview' and date>__date  then _holddate1=date;
			else if missing(_flag) and __Category='re-interview' and _holddate1<date then _flag=1;
			end;
			if _flag=1 then _qual=1;
		end;
	else _qual=0;
end;
do until(last.clientid);
	set have;
	by clientid;
	if _qual then output;
end;
drop _: rc;
run;
art297
Opal | Level 21

@novinosrin: Not a good use of the hash object. Your code takes more than four times longer than my straight forward datastep to run! Plus, given the following example, your code fails to select ClientID 7.

data have;
  infile datalines dlm=',';
  informat ClientID 8.;
  informat category $20.;
  informat date mmddyy10.;
  input ClientID Category Date;
  format date mmddyy10.;
  datalines;
1,initial interview,2/1/2016
1,service delivered,8/25/2016
1,re-interview,8/26/2016
1,service delivered,10/1/2016
1,re-interview,1/1/2017
1,service delivered,3/1/2017
1,re-interview,8/1/2017
2,initial interview,2/1/2016
2,service delivered,3/1/2016
2,re-interview,8/1/2016
2,service delivered,10/1/2016
2,re-interview,1/1/2017
2,service delivered,3/1/2017
2,re-interview,8/1/2017
3,initial interview,2/1/2016
3,re-interview,8/1/2016
3,service delivered,10/1/2016
3,re-interview,1/1/2017
3,service delivered,3/1/2017
3,re-interview,8/1/2017
4,initial interview,2/1/2016
4,service delivered,3/1/2016
4,re-interview,8/1/2016
4,service delivered,10/1/2016
4,re-interview,1/1/2017
4,service delivered,3/1/2017
4,re-interview,8/1/2017
4,service delivered,9/1/2017
4,re-interview,12/1/2017
5,initial interview,2/1/2016
5,service delivered,3/1/2016
5,re-interview,8/1/2016
5,service delivered,10/1/2016
5,re-interview,1/1/2017
5,service delivered,3/1/2017
5,re-interview,8/1/2017
6,initial interview,2/1/2016
6,service delivered,8/2/2016
6,re-interview,8/3/2016
6,service delivered,10/1/2016
6,re-interview,1/1/2017
6,service delivered,3/1/2017
6,re-interview,8/1/2017
7,initial interview,2/29/2016
7,service delivered,8/29/2016
7,re-interview,9/3/2016
7,service delivered,10/1/2016
7,re-interview,2/3/2017
7,service delivered,3/1/2017
7,re-interview,9/1/2017
;

Art, CEO, AnalystFinder.com

 

Asileon
Obsidian | Level 7

Hello @novinosrin,

 

Thanks for your code! I will apply it to my dataset also, and see how it goes.

 

Thanks!

novinosrin
Tourmaline | Level 20

When hall of famers + Ksharp participate just go for their solution. However, I will work on an alternative to at least satisfy myself when i get to my college lab tomorrow. This forum is very addictive and yet because of whatever other factors my brain didn't seem to focus well enough today. Hmm, one of those days 😞 lol 

 

I made a mistake in loop (do over associative array) too many times and that caused slow execution. I can make it to zip through a key only once and will find ways to match your requirement. Well, after all at least I try and proud to be a contributor here

art297
Opal | Level 21

@novinosrin: Not good advice! We ALL make mistakes and, sometimes, fresh eyes find the best solution. Don't assume that Hall of Famers, and/or @Ksharp-like experts, always suggest the best solutions.

 

Art, CEO, AnalystFinder.com

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 71 replies
  • 1834 views
  • 12 likes
  • 8 in conversation