How to loop through observations with same id number for program begin date in a data step

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

How to loop through observations with same id number for program begin date in a data step

[ Edited ]

I have a table of students enrolled on November 10, 2015 who qualify for a certain program by student id num (program bit is 1 already for all observations in the table).  This particular program has different levels (represented by different program codes).  There may be multiplebegin and end dates for a student id number for different dates of enrollment in the program.  I want the first date of identification in this program (and grade level identified, but just the first date to start) for students who were enrolled in the school and the program on November 10, 2015 (first "BeginDate" must be on or before November 10, 2015, and last "EndDate" must be on or after November 10, 2015 for each unique student ID #).  Data is shown only for students who were enrolled in the school on 11/10/15, and thus the enrollment date will be the same for multiple instances of a student IDnum.  It doesn't matter when the enrollment began or ended so long as they were enrolled in the school on 11.10.15.  Also the grade level and school number will remain the same for multiple observations of the same student (ID number), as the table shows the grade level/school enrolled in on the particular day 11/10/15 and the particular program enrollments each ID has had/has now.  Some students are infinitely enrolled in this program (EndDate 01Jan3000).  Grade level represent the student's grade level last school year, 2015-16.

 

What I tried...

 

data Check;
merge program (in = a) enrollment (in = b); *program is the program enrollment data and enrollment is the school enrollment data;
by StudentNum;
if a = 1 and b = 1;
FirstInstance = first.IdNum;
LastInstance = last.IdNum;

do IdNum = 1 to n;
if LastInstance = 1 and EndDate <= '10NOV2015:00:00:00.000'dt then notapplicable = 1;
if FirstInstance = 1 and BeginDate >= '10NOV2015:00:00:00.000'dt then notapplicable2 = 1;
end;

run;

 

I'm having a hard time figuring out the do loop.

 

Thanks.

 

IDNum BeginDate EndDate InPrgmBit PrgmCd SchNum Gradelvl EnrollmentBeginDate EnrollmentEndDate
555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000 1 600 6 0 28Jul2015:00:00:00.000 20May2016:00:00:00.000
556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000 1 400 1 2 13Oct2015:00:00:00.000 03Mar2016:00:00:00.000

 

I use SAS 9.4


Accepted Solutions
Solution
‎12-21-2016 01:55 PM
Super User
Posts: 1,230

Re: How to loop through observations with same id number for program begin date in a data step

Now it is much more clear what you are looking for.

I changed 2nd step only but left dates as sas dates and not as datetime.

 

%macro dt2d(v);
&v=datepart(&v); format &v date9.;
%mend dt2d;
options mprint;

data have(rename=(
EnrollmentBeginDate = EnBeginDate
EnrollmentEndDate = EnEndDate
));
infile datalines truncover;
informat BeginDate EndDate
EnrollmentBeginDate EnrollmentEndDate datetime25.;
input IDNum BeginDate EndDate InPrgmBit PrgmCd SchNum Gradelvl
EnrollmentBeginDate EnrollmentEndDate;

%dt2d(BeginDate);
%dt2d(EndDate);
%dt2d(EnrollmentBeginDate);
%dt2d(EnrollmentEndDate);

datalines;
555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000 1 600 6 0 28Jul2015:00:00:00.000 20May2016:00:00:00.000
556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000 1 400 1 2 13Oct2015:00:00:00.000 03Mar2016:00:00:00.000
; run;

%let check_date = '10NOV2015'd;
data want;
format IDNum BeginDate EndDate;
set have;
by IDNum;
retain BeginDT EndDT;
if first.IDNum then do;
BeginDT = BeginDate;
EndDT = .;
end;

if BeginDate le &check_date
then BeginDT = min(BeginDT, BeginDate);
if EndDate ge &check_date
then EndDT = max(EndDT, EndDate);

If last.IDNum and
BeginDT le &check_date and
EndDT ge &check_date then do;
BeginDate = BeginDT;
EndDate = EndDT;
output;
end;
KEEP IDNum BeginDate EndDate;
run;

If you wnt datetime instead sas dates add next lines, before the output statemnt:

   BeginDate = dhms(BeginDate,0,0,0);

   EndDtae = dhms(EndDate,0,0,0);

   format BeginDate EndDate datetime15.;

 

View solution in original post


All Replies
Grand Advisor
Posts: 10,210

Re: How to loop through observations with same id number for program begin date in a data step

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show you how to generate datastep code that you can post here that will allow us to duplicate your example data and to test code with your data.

 

Note: You might also want to consider are your dates of concern actually dates or the apparent datetimes as posted. They behave differently and generally datetime is somewhat a problem to work with when what you actually want are dates.

 

It also will help if you can show what the desired output for your example data would look like.

Occasional Contributor
Posts: 10

Re: How to loop through observations with same id number for program begin date in a data step

IDNum BeginDate EndDate InPrgmBit PrgmCd SchNum Gradelvl EnrollmentBeginDate EnrollmentEndDate
555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000 1 600 6 0 28Jul2015:00:00:00.000 20May2016:00:00:00.000
556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000 1 400 1 2 13Oct2015:00:00:00.000 03Mar2016:00:00:00.000

 

I want actual dates.  These files are from SQL tables hence the date formats.

 

Desired output is the grade level identified for this program, but for now I just want to be able to pull the first date identified for the program using the conditions I previously described.  Let me know if you need clarification. 

 

 

Super User
Posts: 1,230

Re: How to loop through observations with same id number for program begin date in a data step

[ Edited ]

It is not clear enough what is your target. Anyhow:

 

-  input datetime using INFORMAT datetime25.

- convert datetime variables into date variables by:

 datetime_var = datepart(datetime_var);
 format datetime_var  date9.;  /* rename is not a must */

- you can short your code and do:

%let check_date = '07OCT2015'd;
data Check;
    merge program (in = a) 
    enrollment          (in = b);
 by IDNum ;   /* = StudentNum ???? */
       if a = 1 and b = 1;

      if  first.idnum and begindate ge &check_date then notapplicable2 = 1;
      if  last.idnum and enddate    le &check_date then notapplicable   = 1;
run;

- why did you loop IDNum 1 to n ? - you haven't define value to n ? 

Super User
Posts: 1,230

Re: How to loop through observations with same id number for program begin date in a data step

[ Edited ]

my full test program code is attched:

 

%macro dt2d(v);
      &v=datepart(&v); format &v date9.;
%mend dt2d;

data have;
   infile datalines truncover;
   informat BeginDate EndDate
                 EnrollmentBeginDate EnrollmentEndDate datetime25.;
   input IDNum BeginDate EndDate InPrgmBit PrgmCd SchNum Gradelvl
               EnrollmentBeginDate EnrollmentEndDate;

%dt2d(BeginDate);
%dt2d(EndDate);
%dt2d(EnrollmentBeginDate);
%dt2d(EnrollmentEndDate);

 

datalines;
555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000 1 600 6 0 28Jul2015:00:00:00.000 20May2016:00:00:00.000
556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000 1 400 1 2 13Oct2015:00:00:00.000 03Mar2016:00:00:00.000
; run;

 

data want;
   set have;
    by IDNum;
         if first.IDNum and BeginDate le '07oct2015'd then notapplicable2=1;
         if last.IDNum and EndDate   ge '07oct2015'd then notapplicable=1;
run;

Occasional Contributor
Posts: 10

Re: How to loop through observations with same id number for program begin date in a data step

[ Edited ]

Basically all I want is the BeginDate for each student ID that was in the program on November 10, 2015. So all we really need is IDNum, BeginDate, and EndDate to start. So the first BeginDate must be before November 10, 2015 and the last EndDate must be after November 10, 2015 for a specific ID number for their first date in the program to be counted in this. I would like the final dataset to be IDNums and a new variable called DateIdent with the students who meet the conditions I described. I put example output I would like to see.

 

DATASET

IDNum BeginDate EndDate

555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 

555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 

555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 

555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 

555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 

555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 

555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000

555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000

555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 

555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 

555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000

555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000

555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 

556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000

556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000

 

DESIRED OUTPUT

IDNum DateIdentified
555007 21Apr2012:00:00:00.000
555822 09Jan2004:00:00:00.000
555999 25Feb2014:00:00:00.000
556220 05Mar2014:00:00:00.000
Solution
‎12-21-2016 01:55 PM
Super User
Posts: 1,230

Re: How to loop through observations with same id number for program begin date in a data step

Now it is much more clear what you are looking for.

I changed 2nd step only but left dates as sas dates and not as datetime.

 

%macro dt2d(v);
&v=datepart(&v); format &v date9.;
%mend dt2d;
options mprint;

data have(rename=(
EnrollmentBeginDate = EnBeginDate
EnrollmentEndDate = EnEndDate
));
infile datalines truncover;
informat BeginDate EndDate
EnrollmentBeginDate EnrollmentEndDate datetime25.;
input IDNum BeginDate EndDate InPrgmBit PrgmCd SchNum Gradelvl
EnrollmentBeginDate EnrollmentEndDate;

%dt2d(BeginDate);
%dt2d(EndDate);
%dt2d(EnrollmentBeginDate);
%dt2d(EnrollmentEndDate);

datalines;
555000 05Mar2004:00:00:00.000 19Jul2005:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 12Aug2006:00:00:00.000 15Aug2007:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 09Sep2008:00:00:00.000 02Mar2010:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555000 10Apr2011:00:00:00.000 16Oct2013:00:00:00.000 1 540 2 12 12Aug2015:00:00:00.000 26May2016:00:00:00.000
555007 21Apr2012:00:00:00.000 22Jun2013:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555007 27Dec2013:00:00:00.000 05Dec2016:00:00:00.000 1 250 6 4 05Sep2015:00:00:00.000 11Dec2015:00:00:00.000
555822 09Jan2004:00:00:00.000 14Dec2006:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2007:00:00:00.000 05Mar2008:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2010:00:00:00.000 05Mar2012:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 05Mar2013:00:00:00.000 05Mar2014:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555822 10May2015:00:00:00.000 01Jan3000:00:00:00.000 1 330 8 10 21Sep2015:00:00:00.000 15Jun2016:00:00:00.000
555999 25Feb2014:00:00:00.000 05Mar2015:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
555999 10Aug2015:00:00:00.000 08May2016:00:00:00.000 1 250 4 3 05Jul2015:00:00:00.000 12Jun2016:00:00:00.000
556111 05Mar2016:00:00:00.000 01Jan3000:00:00:00.000 1 600 6 0 28Jul2015:00:00:00.000 20May2016:00:00:00.000
556220 05Mar2014:00:00:00.000 01Jan2016:00:00:00.000 1 400 1 2 13Oct2015:00:00:00.000 03Mar2016:00:00:00.000
; run;

%let check_date = '10NOV2015'd;
data want;
format IDNum BeginDate EndDate;
set have;
by IDNum;
retain BeginDT EndDT;
if first.IDNum then do;
BeginDT = BeginDate;
EndDT = .;
end;

if BeginDate le &check_date
then BeginDT = min(BeginDT, BeginDate);
if EndDate ge &check_date
then EndDT = max(EndDT, EndDate);

If last.IDNum and
BeginDT le &check_date and
EndDT ge &check_date then do;
BeginDate = BeginDT;
EndDate = EndDT;
output;
end;
KEEP IDNum BeginDate EndDate;
run;

If you wnt datetime instead sas dates add next lines, before the output statemnt:

   BeginDate = dhms(BeginDate,0,0,0);

   EndDtae = dhms(EndDate,0,0,0);

   format BeginDate EndDate datetime15.;

 

Occasional Contributor
Posts: 10

Re: How to loop through observations with same id number for program begin date in a data step

[ Edited ]

Thank you so much! I'm just getting back into SAS after three years, and this was very helpful.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 448 views
  • 0 likes
  • 3 in conversation