DATA Step, Macro, Functions and more

Extracting id's satisfying time period conditions

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 87
Accepted Solution

Extracting id's satisfying time period conditions

Hi All,

I have the data in following format 

 

data have;

infile cards;

input id start_dt mmddyy10. end_dt mmddyy10.;

cards;

1   01/01/2013  01/03/2014

2   08/01/2013  01/01/2015

2    03/01/2015  04/03/2015

3   12/12/2014    04/01/2016 ; run;

 

Start_dt is always more than 01/01/2013. My dates in original dataset are stored and displayed in MMDDYY10. format (numeric). Here is what I need-it seemed easy at first but i got stuck in first step itself!

I need to get all those id's which were members for atleast 9 months during both fiscal periods. Ist fiscal period is between july 2013 and June 2014 and second fiscal period is between july2014 and june 2015. Even if a member was  member for 1 day in a month, it counts as 1 month. Many members have more than one record where he/she could end the membership and then restart at later date. Besides, in some rare cases, the end month date of first time membership coincides with start month date of second time membership (for ex, a member could have end_dt on 04/13/2014 and then could restart membership with start_dt on 04/20/2014). This is important to keep in mind so that months of membership are not over estimated.

My first instinct was to extract two tables with each table satisfying 2014 or 2015 fiscal period membership requirement and then join the tables with id to get members who were present in both tables.

Now the first problem i'm having is that i could not get members satisfying the start_dt requirement . When I put the condition below, its returning 0 rows but not an error. My guess is this has to do with date format or something.

 

proc sql;
create table test from source where start_dt between 07/01/2013 and 06/31/2015;
quit;
run;

 

In any case, i believe to accurately extract the informaton im looking for needs more than simple joins and where conditions. 

Thanks in advance for your help. 


Accepted Solutions
Solution
‎10-13-2016 08:55 AM
Respected Advisor
Posts: 4,640

Re: Extracting id's satisfying time period conditions

Read your data with modified list input:

 

data have1;
infile cards;
input id start_dt :mmddyy10. end_dt :mmddyy10. ;
cards;
1   01/01/2013  01/03/2014
2   08/01/2013  01/01/2015
2   03/01/2015  04/03/2015
3   12/12/2014  04/01/2016
;

instead. Without the colon before the informat name SAS tries to read data from the current cursor position. With the colon, SAS reads the next field (skipping any spaces) and then transforms it using the informat.

PG

View solution in original post


All Replies
Frequent Contributor
Posts: 123

Re: Extracting id's satisfying time period conditions

For your join, use sas date literals instead of the format they are currently in:

 

proc sql;
create table test from source where start_dt between '01jul2013'd and '31jun2015'd;
quit;
run;
Frequent Contributor
Posts: 87

Re: Extracting id's satisfying time period conditions

[ Edited ]

EDIT: Actually it worked on my original file but dont work on the test data i provided to you guys. Not sure why. 

 

 

Thanks! But it still gives me 0 rows. Also, this part is just the first step, im not sure how to correctly solve the main problem as described above.

Super User
Posts: 10,460

Re: Extracting id's satisfying time period conditions

Please post the log of the code actually run and possibly the results of Proc Contents. Things do not work correctly when you thing you have a date value but it is actually string or vice versa.

Frequent Contributor
Posts: 87

Re: Extracting id's satisfying time period conditions

This is the code I run and below is the log. But, again, as I said earlier, the date formats u guys suggested earlier did work for my real data. Also, can you help in finding the id's with time conditions mentioned in my first post? Thanks so much!

data have1;
infile cards;
input id start_dt mmddyy10. end_dt mmddyy10. ;
cards;
1   01/01/2013  01/03/2014
2   08/01/2013  01/01/2015
2   03/01/2015  04/03/2015
3   12/12/2014  04/01/2016
;
run;

proc sql;
create table a as select* from have1 where start_dt between '01jul2013'd and '30jun2015'd;
quit;
run;

data have1;
147 infile cards;
148 input id start_dt mmddyy10. end_dt mmddyy10. ;
149 cards;

NOTE: Invalid data for end_dt in line 150 13-22.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
150 1 01/01/2013 01/03/2014
id=1 start_dt=21915 end_dt=. _ERROR_=1 _N_=1
NOTE: Invalid data for end_dt in line 151 13-22.
151 2 08/01/2013 01/01/2015
id=2 start_dt=22128 end_dt=. _ERROR_=1 _N_=2
NOTE: Invalid data for end_dt in line 152 13-22.
152 2 03/01/2015 04/03/2015
id=2 start_dt=21975 end_dt=. _ERROR_=1 _N_=3
NOTE: Invalid data for end_dt in line 153 13-22.
153 3 12/12/2014 04/01/2016
id=3 start_dt=22261 end_dt=. _ERROR_=1 _N_=4
NOTE: The data set WORK.HAVE1 has 4 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds


154 ;
155 run;

156 proc sql;
157 create table a as select* from have1 where start_dt between '01jul2013'd and '30jun2015'd;
NOTE: Table WORK.A created, with 0 rows and 3 columns.

158 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds


159 run;

 

Solution
‎10-13-2016 08:55 AM
Respected Advisor
Posts: 4,640

Re: Extracting id's satisfying time period conditions

Read your data with modified list input:

 

data have1;
infile cards;
input id start_dt :mmddyy10. end_dt :mmddyy10. ;
cards;
1   01/01/2013  01/03/2014
2   08/01/2013  01/01/2015
2   03/01/2015  04/03/2015
3   12/12/2014  04/01/2016
;

instead. Without the colon before the informat name SAS tries to read data from the current cursor position. With the colon, SAS reads the next field (skipping any spaces) and then transforms it using the informat.

PG
Frequent Contributor
Posts: 87

Re: Extracting id's satisfying time period conditions

Thanks. But, I'm still strugging with my original problem, although i made quite a progress getting to the point where i have 4 variables, id, start_dt, end_dt and total months of membership. I have to now combine total membership months for each id but subtracting by 1 those cases where the first end_dt of a member was the same as start_dt of next membership period (to avoid counting that month twice). So, what im left with is first extracting mmYYYY from start_dt and end_dt and then using combination of retain statement and first. function perhaps to get what i want. Here is what my code looks so far.

ata hh_2014_2;
set  hh_ ;
if end_dt >= '01jul2013'd and start_dt < '01aug2013'd then july_2013 = 1;
if end_dt >= '01aug2013'd and start_dt < '01sep2013'd then aug_2013 = 1; if end_dt >= '01sep2013'd and start_dt <'01oct2013'd then sep_2013 = 1;
if end_dt >= '01oct2013'd and start_dt < '01nov2013'd  then oct_2013 = 1; if end_dt >= '01nov2013'd and start_dt <'01dec2013'd then nov_2013 = 1;
if end_dt >= '01dec2013'd and start_dt <'01jan2014'd then dec_2013 = 1; if end_dt >= '01jan2014'd and start_dt < '01feb2014'd then jan_2014 = 1;
if end_dt >= '01feb2014'd and start_dt <'01mar2014'd then feb_2014 = 1; if end_dt >= '01mar2014'd and start_dt < '01apr2014'd then mar_2014 = 1;
if end_dt >= '01apr2014'd and start_dt < '01may2014'd then apr_2014 = 1; if end_dt >= '01may2014'd and start_dt < '01jun2014'd then may_2014 = 1;
if end_dt >= '01jun2014'd and start_dt < '01jul2014'd then jun_2014 = 1;
run;

proc sort data = hh_2014_2;
by medicaid_id start_dt;
run;

data hh_2014_3;
set hh_2014_2 ;
total_months = sum(july_2013, aug_2013, sep_2013, oct_2013, nov_2013, dec_2013, jan_2014, feb_2014, mar_2014, apr_2014, may_2014,jun_2014);
run;

proc sql;
create table hh_2014_4 as select medicaid_id, start_dt, end_dt, total_months from hh_2014_3 where total_months > 0;
quit;
run;
Frequent Contributor
Posts: 87

Re: Extracting id's satisfying time period conditions

Super User
Posts: 10,460

Re: Extracting id's satisfying time period conditions

To expand slightly on @JediApprentice, SAS will only accept date literals in the form of 'dd MONyy'd or 'ddMONyyyy'd where dd is days of the month, yy or yyyy is year and MON is the three-letter abbreviation of the month's name and the D at the end is required to tell SAS it is a Date value not a string. Similar things may be done with Time literals and datetime literals used T and DT with different appearance.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 434 views
  • 3 likes
  • 4 in conversation