BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Whitlea
Obsidian | Level 7

Using the FirstDtSrvc and Lastdtsrvc,   I want to create 3 new variables in this data set.

Stay Begin 

Stay End

Stay Count= number of stays per clients

I specifically want these new date variables for Clinic_Type=1 (see sas data set below) . 

FirstDtSrcv-Lastdtsrvc is the client's stay duration, and most clients have multiple stays. I want to roll up all stays that are <=7 days apart into 1 stay for that client. The stay begins variable will equal the first data of service (FirstDtSrcv)for that new stay duration and the stay end will equal the last date of service (Lastdtsrvc) of that new stay duration. 

Here is an Example :

Capture.PNGCapture.PNG

 

 

Here is my data set:

Data have20;

informat FirstDtSrvc;
informat Lastdtsrvc;
format FirstDtSrvc mmddyy10. ;
format Lastdtsrvc mmddyy10. ;

input Client_ID$ FirstDtSrvc  Lastdtsrvc	Clinic_Type	Clinic_name$ ;

datalines;
A	43412	43415	1	AA
A	43416	43422	1	AA
A	43423	43429	1	AA
A	43430	43434	1	AA
A	43435	43436	1	AA
A	43437	43438	1	AA
B	43031	43039	1	AC
B	43040	43044	1	AC
B	43049	43065	1	AC
B	43066	43069	1	AC
C	42717	42717	1	AB
C	42718	42718	1	AB
C	42719	42735	1	AB
C	42736	42750	1	AB
C	42751	42766	1	AB
C	42767	42781	1	AB
C	42782	42794	1	AB
C	42795	42809	1	AB
C	42810	42825	1	AB
C	42826	42840	1	AG
C	42841	42855	1	AG
C	42856	42870	1	AG
C	42871	42886	1	AG
C	42887	42901	1	AG
C	42902	42916	1	AG
C	42917	42924	1	AG
D	45117	45120	2	AP
E	45012	45033	2	AR
F	45058	45059	3	AF
F	45065	45065	3	AD


;
run;

Thank you!!!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You data step is apparently using some other source of  numeric values instead of your actual SAS dates shown in the picture the first line has dates of 11/20/2078 and 11/26/2078. Plus there is no gap large enough for the C client to generate two sequences.

 

And what ever offset is applied is not consistent. So you may want use this code to generate a new data step:

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

What is supposed to happen with clinic type other than 1? No mention is made. You also so do not provide any rule for if a client has multiple (interleaved) clinic types.

 

Does the clinic_name have any role in this? As in what would happen with something like

A	43412	43415	1	AA
A	43416	43422	1	AB
A	43423	43429	1	AA
A	43430	43434	1	AB

I would start with something like this.

/* assumes data is sorted  by
Client id FirstDtSrvc
*/

Data temp;
   set have20 ;
   by client_id;
   retain seqstart;
   l_last = lag(Lastdtsrvc);
   if first.client_id then Seqstart=FirstDtSrvc;
   else if FirstDtSrvc - l_last >7 then do;
      seqstart=firstdtsrvc;
   end;
run;

proc summary data=temp nway;
   class client_id seqstart;
   var  FirstDtSrvc Lastdtsrvc;
   output out=want (drop=_type_ _freq_) 
          min(FirstDtSrvc)=Span_Begin
          max(Lastdtsrvc)=Span_end
   ;
run;

It is very confusing, potentially quite fatal in terms of code, to discuss a problem in one term and use variables of different names. Your picture shows FirstDOS as does your narrative. But the example data step code uses FirstDtSrvc. A couple more similar themed variables and we might not have a clear understanding.

View solution in original post

6 REPLIES 6
ballardw
Super User

You data step is apparently using some other source of  numeric values instead of your actual SAS dates shown in the picture the first line has dates of 11/20/2078 and 11/26/2078. Plus there is no gap large enough for the C client to generate two sequences.

 

And what ever offset is applied is not consistent. So you may want use this code to generate a new data step:

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

What is supposed to happen with clinic type other than 1? No mention is made. You also so do not provide any rule for if a client has multiple (interleaved) clinic types.

 

Does the clinic_name have any role in this? As in what would happen with something like

A	43412	43415	1	AA
A	43416	43422	1	AB
A	43423	43429	1	AA
A	43430	43434	1	AB

I would start with something like this.

/* assumes data is sorted  by
Client id FirstDtSrvc
*/

Data temp;
   set have20 ;
   by client_id;
   retain seqstart;
   l_last = lag(Lastdtsrvc);
   if first.client_id then Seqstart=FirstDtSrvc;
   else if FirstDtSrvc - l_last >7 then do;
      seqstart=firstdtsrvc;
   end;
run;

proc summary data=temp nway;
   class client_id seqstart;
   var  FirstDtSrvc Lastdtsrvc;
   output out=want (drop=_type_ _freq_) 
          min(FirstDtSrvc)=Span_Begin
          max(Lastdtsrvc)=Span_end
   ;
run;

It is very confusing, potentially quite fatal in terms of code, to discuss a problem in one term and use variables of different names. Your picture shows FirstDOS as does your narrative. But the example data step code uses FirstDtSrvc. A couple more similar themed variables and we might not have a clear understanding.

Whitlea
Obsidian | Level 7

Thank you!! This rolls my dates up exactly how I wanted them.  Is there any way I can keep all the variables and only have the Span_Begin and Span_End populate for Clinic_Type =1? Span_Begin and Span_End can be  .  or blank for all other clinic types. 

Thank you!

mkeintz
PROC Star

@Whitlea wrote:

Thank you!! This rolls my dates up exactly how I wanted them.  Is there any way I can keep all the variables and only have the Span_Begin and Span_End populate for Clinic_Type =1? Span_Begin and Span_End can be  .  or blank for all other clinic types. 

Thank you!


You still have not answered @ballardw's question namely what if clinic types are interleaved?  Do you just ignore the interspersed records for other clinic types?  Or does a change in clinic type force a boundary on spans for clinic type 1?

 

And if there are two spans for clinic type 1, but separated by more than 7 days, should they be output as two separate output records?

 

What if you have a series of clinic type 2 followed by a series of clinic type 3?  Is that two output records with missing span_begin and span_end?

 

Help us help you.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Whitlea
Obsidian | Level 7

I need all variables (Client_ID ,FirstDtSrvc,Lastdtsrvc, Clinic_Type, Clinic_name, Span Begin, Span End, Span count) to remain in the data set that will also include the span variables.  I only need these span variables for clinic type 1, the span variables can be blank for other clinic types as actual date variables that already exists in the data will be used for clinic types <> 1.  If there are 2 spans (stays separated by 7 days), that would be 2 separate spans for that client or clinic name. A client and clinic can have more than 1 span. 

Ideally, when I am finished adding these new variables, I would like to be able to export the data to Excel and be able to get span and client count by clinic name with a pivot table. That might help understand how I want this data set up.

Hope this answers all of your questions.

Thank you for your help.

ballardw
Super User

@Whitlea wrote:

Thank you!! This rolls my dates up exactly how I wanted them.  Is there any way I can keep all the variables and only have the Span_Begin and Span_End populate for Clinic_Type =1? Span_Begin and Span_End can be  .  or blank for all other clinic types. 

Thank you!


If you only want clinic_type=1 you have a couple of choices:

1) if you don't want the clinics in the output at all then use a Where Clinic_type=1; which would typically be either a data set option or a statement shortly after the SET in the data set or in the Proc Summary code.

2) If the clinic_type is not 1 then skip the calculations in the data step (and use the Missing option on the class statement in Summary )or

3) set the variables to missing in a data step that uses the output from Proc Summary as the input.

The last may be the easiest depending on what you actually want the see the clinic types but not the span ranges:

Data final;
   set want;
   if clinic_type ne 1 then call missing (Span_begin, Span_end);
run;

Call missing will take a list of variables of any type and set them all to missing as appropriate for the variable type.

 

 

Tom
Super User Tom
Super User

Those are not date values, unless you a planning something that is 60 years in the future.

147  data _null_;
148    date1=43412;
149    date2=43415;
150    put date1= date9. date2= date9.;
151  run;

date1=09NOV2078 date2=12NOV2078

From the first photograph you appear to have an XSLX worksheet.  You probably incorrectly imported it such that the date columns ended up with mixed numeric and character values.  In that case SAS would store the numbers that EXCEL would use to represent the dates as digit strings like you show in your data step.

 

To fix EXCEL dates into SAS dates you need to add '30DEC1899'd to adjust for the different offset date used (and other issues with EXCEL date numbering).

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 955 views
  • 1 like
  • 4 in conversation