Hi @art297,
The reason why client 009 should not be selected is because (please see comments in red below):
9,initial interview,2/29/2016 9,service delivered,8/29/2016 9,re-interview,10/3/2016 /* The first re-interview after the initial interview is out of the "re-interview has to happen in the 5-7 months" criteria*/ 9,service delivered,10/4/2016 9,re-interview,12/3/2016 /* This re-interview takes place approximately 2 months after the previous one*/ 9,service delivered,12/4/2016 9,re-interview,6/3/2017 9,service delivered,6/5/2017 9,re-interview,12/2/2017 9,service delivered,12/4/2017 9,re-interview,6/2/2018 9,service delivered,6/3/2018 9,re-interview,6/4/2018
Thanks!
That is a new criterion! You've never mentioned that the first re-interview has to occur within a certain timeframe. Easy to build into the code but, again, it's a new criterion.
I'd also be interested if there's another new criterion regarding my proposed test case #10.
Art, CEO, AnalystFinder.com
Hi @art297, you are correct, I never specified the timeframe from the initial interview to the first re-interview.
I've added the latest criterion and two new test cases. The code, below, selects ClienIds 2,4,5,7 and 11:
data have; infile datalines dlm=','; informat ClientID 8.; informat category $20.; informat date mmddyy10.; input ClientID Category Date; format date mmddyy10.; datalines; 1,initial interview,2/1/2016 1,service delivered,8/25/2016 1,re-interview,8/26/2016 1,service delivered,10/1/2016 1,re-interview,1/1/2017 1,service delivered,3/1/2017 1,re-interview,8/1/2017 2,initial interview,2/1/2016 2,service delivered,3/1/2016 2,re-interview,8/1/2016 2,service delivered,10/1/2016 2,re-interview,1/1/2017 2,service delivered,3/1/2017 2,re-interview,8/1/2017 3,initial interview,2/1/2016 3,re-interview,8/1/2016 3,service delivered,10/1/2016 3,re-interview,1/1/2017 3,service delivered,3/1/2017 3,re-interview,8/1/2017 4,initial interview,2/1/2016 4,service delivered,3/1/2016 4,re-interview,8/1/2016 4,service delivered,10/1/2016 4,re-interview,1/1/2017 4,service delivered,3/1/2017 4,re-interview,8/1/2017 4,service delivered,9/1/2017 4,re-interview,12/1/2017 5,initial interview,2/1/2016 5,service delivered,3/1/2016 5,re-interview,8/1/2016 5,service delivered,10/1/2016 5,re-interview,1/1/2017 5,service delivered,3/1/2017 5,re-interview,8/1/2017 6,initial interview,2/1/2016 6,service delivered,8/2/2016 6,re-interview,8/3/2016 6,service delivered,10/1/2016 6,re-interview,1/1/2017 6,service delivered,3/1/2017 6,re-interview,8/1/2017 7,initial interview,2/29/2016 7,service delivered,8/29/2016 7,re-interview,9/3/2016 7,service delivered,10/1/2016 7,re-interview,2/3/2017 7,service delivered,3/1/2017 7,re-interview,9/1/2017 8,initial interview,2/29/2016 8,service delivered,8/29/2016 8,re-interview,9/3/2016 8,re-interview,10/3/2016 8,re-interview,11/3/2016 8,re-interview,12/3/2016 8,service delivered,12/4/2016 8,re-interview,6/3/2017 8,service delivered,6/5/2017 8,re-interview,12/2/2017 9,initial interview,2/29/2016 9,service delivered,8/29/2016 9,re-interview,10/3/2016 9,service delivered,10/4/2016 9,re-interview,12/3/2016 9,service delivered,12/4/2016 9,re-interview,6/3/2017 9,service delivered,6/5/2017 9,re-interview,12/2/2017 9,service delivered,12/4/2017 9,re-interview,6/2/2018 9,service delivered,6/3/2018 9,re-interview,6/4/2018 10,initial interview,2/29/2016 10,service delivered,8/29/2016 10,service delivered,10/3/2016 10,service delivered,10/4/2016 10,service delivered,12/3/2016 10,service delivered,12/4/2016 10,service delivered,6/3/2017 10,service delivered,6/5/2017 10,re-interview,12/2/2017 10,service delivered,12/4/2017 10,service delivered,12/5/2017 10,service delivered,12/6/2017 10,re-interview,6/2/2018 10,service delivered,6/3/2018 10,re-interview,12/4/2018 11,initial interview,6/29/2016 11,service delivered,8/29/2016 11,service delivered,10/3/2016 11,service delivered,10/4/2016 11,re-interview,12/2/2016 11,service delivered,12/4/2016 11,service delivered,12/5/2016 11,service delivered,12/6/2016 11,re-interview,6/2/2017 11,service delivered,6/3/2017 11,re-interview,12/4/2017 12,initial interview,4/29/2016 12,service delivered,8/29/2016 12,service delivered,10/3/2016 12,service delivered,10/4/2016 12,re-interview,12/2/2016 12,service delivered,12/4/2016 12,service delivered,12/5/2016 12,service delivered,12/6/2016 12,re-interview,6/2/2017 12,service delivered,6/3/2017 12,re-interview,12/4/2017 ; /* Criterion1: initial interview (criterion_int) Criterion2: service within 6 months of initial interview Criterion3: first re-interview must occur with 5-7 months of initial interview Criterion4: 3 re-interviews within 5-7 months of each other, with each of the first two re-interviews followed by a service Criterion5: All re-interviews, except for the final re-interview, must be followed by a service */ data want; set have; by ClientID; array reint(10); retain criterion1 criterion2 reint:; last_category=lag(category); last_date=lag(date); if first.ClientID then do; call missing(criterion1); call missing(criterion2); call missing(last_category); call missing(last_date); counter1=0; counter2=0; end; if category eq 'initial interview' then criterion1=date; else if not missing(criterion1) then do; if missing(criterion2) and category eq 'service delivered' and date le intnx('month',criterion1,6,'sameday') then criterion2=date; else if last_category eq 're-interview' and category eq 'service delivered' then do; counter1+1; reint(counter1)=last_date; end; else if category eq 're-interview' and counter1 ge 2 then do; counter1+1; reint(counter1)=date; end; else if category eq 're-interview' and last_category eq 're-interview' then do; call missing(criterion1); call missing(criterion2); counter1=0; end; end; if last.ClientID and criterion1 and criterion2 and counter1 ge 3 then do; counter2=counter1; do i=1 to counter1-1; if i eq 1 and not(intnx('month',criterion1,5,'sameday')<= reint(i)<=intnx('month',criterion1,7,'sameday')) then counter2=0; if not(intnx('month',reint(i),5,'sameday')<= reint(i+1)<=intnx('month',reint(i),7,'sameday')) then counter2=counter2-1; end; end; if counter2 ge 3; run;
Art, CEO, AnalystFinder.com
@art297 Thanks a lot for the updated code! The selected clients (2, 4, 5, 7, and 11) follow my criteria. I will apply it tomorrow to my data and let you know as soon as I do.
Thanks
Arthur.T,
9,re-interview,10/3/2016 9,service delivered,10/4/2016 9,re-interview,12/3/2016
.
these two re-interview have only 2 month gap,not like OP said 5-7 months .
Good morning @art297,
I have apply your code to my dataset and for some reason I am getting 0 observations. I have a million of observations in my dataset.
Thanks
Post your log. That should make it easier to identify what the problem might be.
One possibility is that clients have more than ten re-interviews. That can easily be corrected by making the array bigger. i.e., change:
array reint(10);
to
array reint(99);
Art, CEO, AnalystFinder.com
data have; infile datalines dlm=','; informat ClientID 8.; informat category $20.; informat date mmddyy10.; input ClientID Category Date; format date mmddyy10.; datalines; 1,initial interview,2/1/2016 1,service delivered,8/25/2016 1,re-interview,8/26/2016 1,service delivered,10/1/2016 1,re-interview,1/1/2017 1,service delivered,3/1/2017 1,re-interview,8/1/2017 2,initial interview,2/1/2016 2,service delivered,3/1/2016 2,re-interview,8/1/2016 2,service delivered,10/1/2016 2,re-interview,1/1/2017 2,service delivered,3/1/2017 2,re-interview,8/1/2017 3,initial interview,2/1/2016 3,re-interview,8/1/2016 3,service delivered,10/1/2016 3,re-interview,1/1/2017 3,service delivered,3/1/2017 3,re-interview,8/1/2017 4,initial interview,2/1/2016 4,service delivered,3/1/2016 4,re-interview,8/1/2016 4,service delivered,10/1/2016 4,re-interview,1/1/2017 4,service delivered,3/1/2017 4,re-interview,8/1/2017 4,service delivered,9/1/2017 4,re-interview,12/1/2017 5,initial interview,2/1/2016 5,service delivered,3/1/2016 5,re-interview,8/1/2016 5,service delivered,10/1/2016 5,re-interview,1/1/2017 5,service delivered,3/1/2017 5,re-interview,8/1/2017 6,initial interview,2/1/2016 6,service delivered,8/2/2016 6,re-interview,8/3/2016 6,service delivered,10/1/2016 6,re-interview,1/1/2017 6,service delivered,3/1/2017 6,re-interview,8/1/2017 7,initial interview,2/29/2016 7,service delivered,8/29/2016 7,re-interview,9/3/2016 7,service delivered,10/1/2016 7,re-interview,2/3/2017 7,service delivered,3/1/2017 7,re-interview,9/1/2017 8,initial interview,2/29/2016 8,service delivered,8/29/2016 8,re-interview,9/3/2016 8,re-interview,10/3/2016 8,re-interview,11/3/2016 8,re-interview,12/3/2016 8,service delivered,12/4/2016 8,re-interview,6/3/2017 8,service delivered,6/5/2017 8,re-interview,12/2/2017 ; data temp; set have; by ClientID; if first.ClientID then n=0; n+1; run; data temp1; set temp; by ClientID Category notsorted; if first.Category; run; data temp2; set temp1; by ClientID; retain yes 1; if first.ClientID then yes=1; lag_Date=lag(Date); lag2_Date=lag2(Date); if n=1 then do; if Category ne 'initial interview' then yes=0; end; else if n=2 then do; if Category ne 'service delivered' then yes=0; else if intck('month',lag_date,date,'c')>6 then yes=0;; end; else if mod(n,2)=1 then do; if Category ne 're-interview' then yes=0; else do; if n=3 then do; if yes then output; end; else do; if 5<=intck('month',lag2_date,date,'c')<=7 then do; if yes then output; end; else yes=0; end; end; end; else if mod(n,2)=0 then do; if Category ne 'service delivered' then yes=0; else if yes then output; /****<-- Add ****/ end; run; data temp3; set temp2; by ClientID; if last.ClientID; keep ClientID n; run; data temp4; merge temp temp3(rename=(n=_n)); by ClientID; if n<=_n; drop n _n; run; proc sql; create table want as select * from temp4 group by ClientID having sum(Category='re-interview')>2 order by ClientID,date; quit;
While you haven't posted your log yet, I made two small changes to the code I proposed. I increased the array to 999, made the array temporary, and only kept the three variables of interest, namely ClientID category and date.
data want (keep=ClientID category date); set have; by ClientID; array reint(999) _temporary_; retain criterion1 criterion2 reint:; last_category=lag(category); last_date=lag(date); if first.ClientID then do; call missing(criterion1); call missing(criterion2); call missing(last_category); call missing(last_date); counter1=0; counter2=0; end; if category eq 'initial interview' then criterion1=date; else if not missing(criterion1) then do; if missing(criterion2) and category eq 'service delivered' and date le intnx('month',criterion1,6,'sameday') then criterion2=date; else if last_category eq 're-interview' and category eq 'service delivered' then do; counter1+1; reint(counter1)=last_date; end; else if category eq 're-interview' and counter1 ge 2 then do; counter1+1; reint(counter1)=date; end; else if category eq 're-interview' and last_category eq 're-interview' then do; call missing(criterion1); call missing(criterion2); counter1=0; end; end; if last.ClientID and criterion1 and criterion2 and counter1 ge 3 then do; counter2=counter1; do i=1 to counter1-1; if i eq 1 and not(intnx('month',criterion1,5,'sameday')<= reint(i)<=intnx('month',criterion1,7,'sameday')) then counter2=0; if not(intnx('month',reint(i),5,'sameday')<= reint(i+1)<=intnx('month',reint(i),7,'sameday')) then counter2=counter2-1; end; end; if counter2 ge 3; run;
Art, CEO, AnalystFinder.com
Hi @art297
Please find attached the log. I added the changes and still have 0 observations.
Thanks again!
The log doesn't show any errors, thus my first guesses are:
1. the strings (category) in your actual data don't match the ones expected. e.g., they may be upper, lower or mixed case. That would be easy to fix (i.e., making the criterion case insensitive)
2. no ClientIDs actually match the criteria
Can you post the records for one ClientID that you think should have been selected?
Art, CEO, AnalystFinder.com
Hi @art297,
Thanks a lot for the feedback!
I will check your first guess. However, the variable representing the category and other variables are frequently checked for consistency - it is unlikely that there is mixed case.
Also, it is very unlikely that no clients actually match the criteria.
Unfortunately, due to confidentiality rules I can't share the real information - hope you understand 🙂
Totally understand confidentiality, but for one ClientID's records couldn't you simply change the ClientID to something that doesn't match any Client's ID (e.g., 'x~x~x~x~x') and only keep ClientID, category and date? I don't see how that would be releasing confidential information.
Art, CEO, AnalystFinder.com
Hi @art297,
I completely understand your point, and I do want to make the issues that I am having with the code as clear as possible, especially because you are taking from your time to help me out, and I am appreciative of that. We received a training were we were instructed not to share any real information from a client (dates, services or anything at all, even if it does not look confidential), and that is the reason why I do not post it here. I am really sorry that I can't!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.