Hello SAS community,
I have a question on how to consolidate dates to start and end date by client where the dates are continuous
What I have is data on each client and each day they had a visit. (see SAS code below).
What I want is that if there has not been a break in a visit, so a client came in everyday, then to have a start date and end date. if there was a break in a visit then to have a new row that has a new start date and end date (see want table below).
Suggestions on how to accomplish this? I am sure the code is really simple but cannot seem to figure it out
Thank you
data have;
informat client 1. date mmddyy10.;
input client date;
format date mmddyy10.;
datalines;
1 10122017
1 10132017
1 10142017
1 10152017
1 10162017
1 10172017
1 10182017
1 10192017
1 10212017
1 10222017
1 10232017
1 10242017
1 10252017
1 10262017
1 10272017
2 01172017
2 01182017
2 01192017
2 01202017
2 01232017
2 01242017
2 01252017
2 01262017
2 01272017
3 05042017
3 05052017
3 05062017
3 05072017
3 05082017
3 05092017
3 05102017
3 05112017
;
run;
What I need is:
Client | Start_date | end_date |
1 | 10/12/2017 | 10/19/2017 |
1 | 10/21/2017 | 10/27/2017 |
2 | 1/17/2017 | 1/20/2017 |
2 | 1/23/2017 | 1/27/2017 |
3 | 5/4/2017 | 5/11/2017 |
data have2;
set have;
by client;
lagdate = lag(date);
if first.client then episode=1;
else if date-lagdate>1 then episode+1;
run;
proc summary data=have2;
by client episode;
var date;
output out=want(drop=_: episode) min=start_date max=stop_date;
run;
Another approach.
Try the DIF function.
If DIF() > 1 then that's the end and you need to start a new record/entry.
@sas_student1 wrote:
Hello SAS community,
I have a question on how to consolidate dates to start and end date by client where the dates are continuous
What I have is data on each client and each day they had a visit. (see SAS code below).
What I want is that if there has not been a break in a visit, so a client came in everyday, then to have a start date and end date. if there was a break in a visit then to have a new row that has a new start date and end date (see want table below).
Suggestions on how to accomplish this? I am sure the code is really simple but cannot seem to figure it out
Thank you
data have; informat client 1. date mmddyy10.; input client date; format date mmddyy10.; datalines; 1 10122017 1 10132017 1 10142017 1 10152017 1 10162017 1 10172017 1 10182017 1 10192017 1 10212017 1 10222017 1 10232017 1 10242017 1 10252017 1 10262017 1 10272017 2 01172017 2 01182017 2 01192017 2 01202017 2 01232017 2 01242017 2 01252017 2 01262017 2 01272017 3 05042017 3 05052017 3 05062017 3 05072017 3 05082017 3 05092017 3 05102017 3 05112017 ; run;
What I need is:
Client Start_date end_date 1 10/12/2017 10/19/2017 1 10/21/2017 10/27/2017 2 1/17/2017 1/20/2017 2 1/23/2017 1/27/2017 3 5/4/2017 5/11/2017
data have;
informat client 1. date mmddyy10.;
input client date;
format date mmddyy10.;
datalines;
1 10122017
1 10132017
1 10142017
1 10152017
1 10162017
1 10172017
1 10182017
1 10192017
1 10212017
1 10222017
1 10232017
1 10242017
1 10252017
1 10262017
1 10272017
2 01172017
2 01182017
2 01192017
2 01202017
2 01232017
2 01242017
2 01252017
2 01262017
2 01272017
3 05042017
3 05052017
3 05062017
3 05072017
3 05082017
3 05092017
3 05102017
3 05112017
;
run;
data want;
set have;
by client;
k=dif(date);
retain startdate enddate;
if first.client then startdate=date;
else if k=1 then enddate=date;
if not first.client and k>1 or last.client and k=1 then do;
output;startdate=date;
call missing(enddate);
end;
format startdate enddate mmddyy10.;
keep client startdate enddate;
run;
Hi @Reeza just a quick question please. I tend to write the code in the PC SAS at my college lab and paste here using the running man icon as it pops a small window. I was told the same as what you said by an OP in another thread. Is something wrong at my end or am i doing not right?
Sorry for the trivial stuff, I just wanna get this right. Thank you!
@novinosrin wrote:
Hi @Reeza just a quick question please. I tend to write the code in the PC SAS at my college lab and paste here using the running man icon as it pops a small window. I was told the same as what you said by an OP in another thread. Is something wrong at my end or am i doing not right?
Sorry for the trivial stuff, I just wanna get this right. Thank you!
There seems to be some intermittent behavior with things pasted into the "running man" and I have no idea what causes it. Sometimes the result will show as a "spoiler" with a right arrow instead of inline displayed text, sometimes text gets scrambled in other ways, possibly the result of characters that you do not see if you copy code from online (message windows here for example), edit and paste back.
I tend to use the {I} as it seems more of a "pure text" format, no color highlighting for example. But even there I've seen some odd things happen.
Just check the appearance of your post after it posts and edit if needed by clicking on the wheel to the upper right of the posted message and selecting "edit".
@ballardw Thank you Sir. Hmm sounds like a remedy ticket needs to be raised with community 🙂 I will be more aware. Thank you for your time. Have a nice evening!
assigns missing values exactly as documented. I googled it too.
Determining whether the record in hand is the start of a new time-span is relatively easy. But verifying that the record-in-hand is the END of a time-span requires looking ahead at the next record. This program achieves that task via a data step with a self-merge and a "firstobs=2" parameter:
data want (keep=client start_date end_date);
merge have
have (firstobs=2 keep=client date rename=(client=nxt_client date=nxt_date));
retain start_date;
if client^=nxt_client or nxt_date>date+1 then do;
end_date=date;
output;
start_date=nxt_date;
end;
format start_date end_date mmddyy10.;
run;
thank you @mkeintz this almost worked!
I am missing the first start date for the first client .
so the table looks like:
client | start_date | end_date |
1 | . | 10/19/2017 |
1 | 10/21/2017 | 10/27/2017 |
2 | 1/17/2017 | 1/20/2017 |
2 | 1/23/2017 | 1/27/2017 |
3 | 5/4/2017 | 5/11/2017
|
how do I get the 10/12/17 in the . ?
@sas_student1: You have observed that only one case of START_DATE is not as desired, then first time span.
As a student, this is a good opportunity to implement some thing you probably know, or at least something worthwhile for every student to learn. Namely:
That is, a statement such as
if (some condition) then start_date = (some value);
So, what's the condition, and what's the value?
@mkeintz thank you!
okay so to your question "whats the condition and whats the value" I would think it would be :
if first.client and first.date then start_date= date?
But that is not exactly right, is it? I tried it and it didn't work.
the dif(date) by @novinosrin and the fix by @Reeza did work, but trying to figure out (how/why) and how do I alter @mkeintz code to get the first clients date not missing.
Thank you
data have2;
set have;
by client;
lagdate = lag(date);
if first.client then episode=1;
else if date-lagdate>1 then episode+1;
run;
proc summary data=have2;
by client episode;
var date;
output out=want(drop=_: episode) min=start_date max=stop_date;
run;
Another approach.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.