BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
axescot78
Quartz | Level 8

I have monthly data and am trying to calculate the amount of time spent in visit with gaps in between visits. So in the example below, rows 1 and 5 should calculate 59 days and row 2 should calculate 62 (the number of days in each month where there's a visit). How can I calculate it and account for gaps?

 

Capture.PNG

 

Here is the code that produces that table. It has more data (couldn't get it to work right) but still the same idea.

data fake_data;
input User_ID $ Vst_dt1 date9. Vst_dt2 date9. Vst_dt3 date9. Vst_dt4 date9. Vst_dt5 date9. Vst_dt6 date9. Vst_dt7 date9. Vst_dt8 date9. Vst_dt9 date9. Vst_dt10 date9.;
format User_ID $2. Vst_dt1 date9. Vst_dt2 date9. Vst_dt3 date9. Vst_dt4 date9. Vst_dt5 date9. Vst_dt6 date9. Vst_dt7 date9. Vst_dt8 date9. Vst_dt9 date9. Vst_dt10 date9.;
datalines;
01 01Feb2023 01Mar2023 01Apr2023 01May2023 01Jun2023 01Jul2023 01Aug2023 01Sep2023 01Oct2023 01Nov2023
02 . 01Mar2023 01Apr2023 01May2023 . . 01Aug2023 01Sep2023 01Oct2023 .
03 . . 01Apr2023 01May2023 . . . . . .
04 . . . 01May2023 01Jun2023 01Jul2023 . . 01Oct2023 01Nov2023
05 01Feb2023 01Mar2023 01Apr2023 . . . 01Aug2023 01Sep2023 . .
;

 

1 ACCEPTED SOLUTION

Accepted Solutions
axescot78
Quartz | Level 8

@Kurt_Bremser I don't completely understand the difference between the 2 methods that you're asking about. I ended up counting the number of months that's not missing and subtracting by the total number of months to get the missing months. I left it as months but easy enough to convert to days, especially when it's over a span of years.

View solution in original post

7 REPLIES 7
ballardw
Super User

Before proceeding you may want to clearly describe how Row 2 in the picture gets 62 days with the values shown are 01MAR2020 and 01AUG2023  which is more than 3 years.

And how you get 59 days for row 1 when the dates shown 8 months apart and row 5 where there are 6 months difference.

 


@axescot78 wrote:

I have monthly data and am trying to calculate the amount of time spent in visit with gaps in between visits. So in the example below, rows 1 and 5 should calculate 59 days and row 2 should calculate 62 (the number of days in each month where there's a visit). How can I calculate it and account for gaps?

 

Capture.PNG

What should appear if there are no dates as in row 3? Or only one date as with Row 4? What if there are two or more "gaps"?

I think that you may need to provide some clarification/definition of what a "gap" means.

This may also be a case of describing what the resulting data set will be used for as often requests like this are masking a more complex problem that might be amenable to other tools.

 

FWIW, I suspect that if your data set was rearranged to have Id VisitNumber Date per actual date it may be more amenable to analysis or even calculating this request (and handles the multiple gap per Id nicer).

PaigeMiller
Diamond | Level 26

Your screen capture of the data, and the code, do not agree, and so how can we proceed? We need further explanation.

 

Please explain what you mean by "gaps" with respect to the data provided. Once we identify a "gap", how would we "account" for it?

--
Paige Miller
axescot78
Quartz | Level 8

Sorry about that. I couldn't get the code to work right so I tried another approach. Here is the data which can be pasted in a txt file and saved as a csv, and then there is code to import the file. This won't match the screenshot in my original post. Please ignore that screenshot. The data below is what was intended. Thank you.

 

User_ID,Vst_dt1,Vst_dt2,Vst_dt3,Vst_dt4,Vst_dt5,Vst_dt6,Vst_dt7,Vst_dt8,Vst_dt9,Vst_dt10
1,1-Feb-23,1-Mar-23,1-Apr-23,1-May-23,1-Jun-23,1-Jul-23,1-Aug-23,1-Sep-23,1-Oct-23,1-Nov-23
2,.,1-Mar-23,1-Apr-23,1-May-23,.,.,1-Aug-23,1-Sep-23,1-Oct-23,.
3,.,.,1-Apr-23,1-May-23,.,.,.,.,.,.
4,.,.,.,1-May-23,1-Jun-23,1-Jul-23,.,.,1-Oct-23,1-Nov-23
5,1-Feb-23,1-Mar-23,1-Apr-23,.,.,.,1-Aug-23,1-Sep-23,.,.

 

proc import 
  datafile="your_path/Book1.csv" 
  dbms=csv 
  out=fake_data
  replace;
run;

 

 

 

PaigeMiller
Diamond | Level 26

@axescot78 wrote:

Sorry about that. I couldn't get the code to work right so I tried another approach. Here is the data which can be pasted in a txt file and saved as a csv, and then there is code to import the file. This won't match the screenshot in my original post. Please ignore that screenshot. The data below is what was intended. Thank you.

 

User_ID,Vst_dt1,Vst_dt2,Vst_dt3,Vst_dt4,Vst_dt5,Vst_dt6,Vst_dt7,Vst_dt8,Vst_dt9,Vst_dt10
1,1-Feb-23,1-Mar-23,1-Apr-23,1-May-23,1-Jun-23,1-Jul-23,1-Aug-23,1-Sep-23,1-Oct-23,1-Nov-23
2,.,1-Mar-23,1-Apr-23,1-May-23,.,.,1-Aug-23,1-Sep-23,1-Oct-23,.
3,.,.,1-Apr-23,1-May-23,.,.,.,.,.,.
4,.,.,.,1-May-23,1-Jun-23,1-Jul-23,.,.,1-Oct-23,1-Nov-23
5,1-Feb-23,1-Mar-23,1-Apr-23,.,.,.,1-Aug-23,1-Sep-23,.,.

 

proc import 
  datafile="your_path/Book1.csv" 
  dbms=csv 
  out=fake_data
  replace;
run;

 


Even if we were to use this data, how do we "account for gaps"? Explain in words, please.

--
Paige Miller
axescot78
Quartz | Level 8

@Kurt_Bremser I don't completely understand the difference between the 2 methods that you're asking about. I ended up counting the number of months that's not missing and subtracting by the total number of months to get the missing months. I left it as months but easy enough to convert to days, especially when it's over a span of years.

Kurt_Bremser
Super User

From what I see, you want to count the days of each whole month in which you find a visit date, and have the sum of that. I take it that the same method be used for the "gaps".

But how should the overall "gap" sum be calculated? From a fixed start and end month, same for all users, or between the earliest and latest visit date of a given user?

Patrick
Opal | Level 21

The sample data you shared and the screenshot with the expected count of 59 for row 1 don't match. I've amended the sample data for row one to match your screenshot.

Based on the discussions I'm still not sure what you really want/need. Below code provides 3 calculation options. Have a look at sample data row 2 and the different results when the visit date doesn't start on the first of the month.

data have;
  infile datalines truncover;
  input User_ID $ (Vst_dt1 - Vst_dt10) (:date9.);
  format User_ID $2. Vst_dt: date9.;
  datalines;
01 01Feb2023 . . . . . 01Aug2023
99 05Feb2023 . . . . . 14Aug2023
02 . 01Mar2023 01Apr2023 01May2023 . . 01Aug2023 01Sep2023 01Oct2023 .
03 . . 01Apr2023 01May2023 . . . . . .
04 . . . 01May2023 01Jun2023 01Jul2023 . . 01Oct2023 01Nov2023
05 01Feb2023 01Mar2023 01Apr2023 . . . 01Aug2023 01Sep2023 . .
;

data want;
  set have;
  array mth{*} Vst_dt:;
  do i=1 to dim(mth);
    if not missing(mth[i]) then 
      do;
        mth_count   =sum(mth_count ,1);
        day_count_1 =sum(day_count_1 ,day(intnx('month',mth[i],0,'e')));
        day_count_2 =sum(day_count_2 ,intnx('month',mth[i],0,'e')-mth[i]+1);
      end;
  end;
  drop Vst_dt:;
run;

proc print data=want;
run;

Patrick_0-1706495332669.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 499 views
  • 0 likes
  • 5 in conversation