Hi...I am trying to include another condition for Intake. The conditions are if the student has enrolled in the program for the first time based on AcadYear then Intake is 'Initial' otherwise Intake is 'Continued'. However, I am trying to include another condition and that is if a student has withdrew and has re-enrolled in the following AcadYear, then that student is to be considered 'Initial'. This second condition is the one I am having problem including in the code I have so far. Thanks
data Have; length ID 8 Program $ 13 AcadYear $ 5 StartDate $ 10 WithdrawDate $ 10; infile datalines dlm='' missover dsd; input ID : best32. Program : $char13. AcadYear : $char5. StartDate : $char10. WithdrawDate : $char10.; datalines; 1 AutoMechanics 19/20 2019-09-05 2020-01-15 1 AutoMechanics 20/21 2020-09-05 2 Carpentry 19/20 2019-09-05 2 Carpentry 20/21 2020-09-05 2021-01-15 3 Drafting 19/20 2019-09-05 3 Drafting 20/21 2020-09-05 4 Hairstyling 17/18 2017-09-05 4 Hairstyling 18/19 2018-09-05 4 Hairstyling 19/20 2019-09-05 2020-01-15 4 Hairstyling 20/21 2020-09-05 ; data Want; set Have; by ID Program notsorted; length Intake $ 10; format Intake $char10.; retain AcdYr Witdraw; if first.Program then do; AcdYr = AcadYear; Intake='Initial'; Withdraw=WithdrawDate; end; if first.Program and AcadYear=AcdYr then Intake='Initial'; else if not first.Program and AcadYear=AcdYr then Intake='Initial'; else Intake='Continued'; output; drop AcdYr Withdraw; run;
data Want;
set Want;
by ID Program;
if not missing(lag(WithdrawDate)) and Intake = 'Continued' then
Intake = 'Initial';
if missing(lag(WithdrawDate)) and not missing(WithdrawDate) and lag(Intake) = 'Initial' then
Intake = 'Initial';
run;
Want: ID Program AcadYear StartDate WithdrawDate Intake 1 AutoMechanics 19/20 2019-09-05 2020-01-15 Initial 1 AutoMechanics 20/21 2020-09-05 Initial 2 Carpentry 19/20 2019-09-05 Initial 2 Carpentry 20/21 2020-09-05 2021-01-15 Initial 3 Drafting 19/20 2019-09-05 Initial 3 Drafting 20/21 2020-09-05 Continued 4 Hairstyling 17/18 2017-09-05 Initial 4 Hairstyling 18/19 2018-09-05 Continued 4 Hairstyling 19/20 2019-09-05 2020-01-15 Continued 4 Hairstyling 20/21 2020-09-05 Initial
This is a special class of problem that I handle this way. I create a "journey" string whereby chronological events are added to the end of a string, each different event type indicated by single, unique character. Being that they are added to the string they should be ordered on the string as they occurred. The resulting string can then be used in logic to answer questions. If done simply enough, one can use simple string comparison functions. Perl regular expressions can be employed for more complicated situations.
To use a 'Bounded String Compare' via SAS colon syntax, I'm programming this in terms of a "chronologically reversed journey".
data Have (SortedBy=ID Program AcadYear );
length ID 8 Program $ 13 AcadYear $ 5 StartDate $ 10 WithdrawDate $ 10;
infile datalines dlm='' missover dsd;
input ID : best32. Program : $char13. AcadYear : $char5. StartDate : $char10. WithdrawDate : $char10.;
datalines;
1 AutoMechanics 19/20 2019-09-05 2020-01-15
1 AutoMechanics 20/21 2020-09-05
2 Carpentry 19/20 2019-09-05
2 Carpentry 20/21 2020-09-05 2021-01-15
3 Drafting 19/20 2019-09-05
3 Drafting 20/21 2020-09-05
4 Hairstyling 17/18 2017-09-05
4 Hairstyling 18/19 2018-09-05
4 Hairstyling 19/20 2019-09-05 2020-01-15
4 Hairstyling 20/21 2020-09-05
;
data want_ish;
if 0 then set &syslast;
length Rev_journey $10;
Rev_journey=" "; /*"reversed Journey"*/
do until (Last.Program);
set &syslast;
by ID Program;
Rev_journey=catt("E",Rev_journey);
if not missing(WithdrawDate )
then Rev_journey=catt("W",Rev_journey);
select ;
when (Rev_journey=:"EE") Intake='Continued';
when (Rev_journey=:"WEC") Intake='Continued';
When (Rev_journey=:"EWE") Intake='Initial';
when (Rev_journey=:"E") Intake='Initial';
when (Rev_journey=:"WEE") Intake='Initial';
when (Rev_journey=:"WE") Intake='Initial';
otherwise Intake=' ';/*?*/
end;
if Intake='Continued'
then Rev_journey=catt("C",Rev_journey);
output;
end;
run;
ID | Program | AcadYear | StartDate | WithdrawDate | Rev_journey | Intake |
1 | AutoMechanics | 19/20 | 9/5/2019 | 1/15/2020 | WE | Initial |
1 | AutoMechanics | 20/21 | 9/5/2020 | EWE | Initial | |
2 | Carpentry | 19/20 | 9/5/2019 | E | Initial | |
2 | Carpentry | 20/21 | 9/5/2020 | 1/15/2021 | WEE | Initial |
3 | Drafting | 19/20 | 9/5/2019 | E | Initial | |
3 | Drafting | 20/21 | 9/5/2020 | CEE | Continued | |
4 | Hairstyling | 17/18 | 9/5/2017 | E | Initial | |
4 | Hairstyling | 18/19 | 9/5/2018 | CEE | Continued | |
4 | Hairstyling | 19/20 | 9/5/2019 | 1/15/2020 | CWECEE | Continued |
4 | Hairstyling | 20/21 | 9/5/2020 | ECWECEE | Initial |
This is a special class of problem that I handle this way. I create a "journey" string whereby chronological events are added to the end of a string, each different event type indicated by single, unique character. Being that they are added to the string they should be ordered on the string as they occurred. The resulting string can then be used in logic to answer questions. If done simply enough, one can use simple string comparison functions. Perl regular expressions can be employed for more complicated situations.
To use a 'Bounded String Compare' via SAS colon syntax, I'm programming this in terms of a "chronologically reversed journey".
data Have (SortedBy=ID Program AcadYear );
length ID 8 Program $ 13 AcadYear $ 5 StartDate $ 10 WithdrawDate $ 10;
infile datalines dlm='' missover dsd;
input ID : best32. Program : $char13. AcadYear : $char5. StartDate : $char10. WithdrawDate : $char10.;
datalines;
1 AutoMechanics 19/20 2019-09-05 2020-01-15
1 AutoMechanics 20/21 2020-09-05
2 Carpentry 19/20 2019-09-05
2 Carpentry 20/21 2020-09-05 2021-01-15
3 Drafting 19/20 2019-09-05
3 Drafting 20/21 2020-09-05
4 Hairstyling 17/18 2017-09-05
4 Hairstyling 18/19 2018-09-05
4 Hairstyling 19/20 2019-09-05 2020-01-15
4 Hairstyling 20/21 2020-09-05
;
data want_ish;
if 0 then set &syslast;
length Rev_journey $10;
Rev_journey=" "; /*"reversed Journey"*/
do until (Last.Program);
set &syslast;
by ID Program;
Rev_journey=catt("E",Rev_journey);
if not missing(WithdrawDate )
then Rev_journey=catt("W",Rev_journey);
select ;
when (Rev_journey=:"EE") Intake='Continued';
when (Rev_journey=:"WEC") Intake='Continued';
When (Rev_journey=:"EWE") Intake='Initial';
when (Rev_journey=:"E") Intake='Initial';
when (Rev_journey=:"WEE") Intake='Initial';
when (Rev_journey=:"WE") Intake='Initial';
otherwise Intake=' ';/*?*/
end;
if Intake='Continued'
then Rev_journey=catt("C",Rev_journey);
output;
end;
run;
ID | Program | AcadYear | StartDate | WithdrawDate | Rev_journey | Intake |
1 | AutoMechanics | 19/20 | 9/5/2019 | 1/15/2020 | WE | Initial |
1 | AutoMechanics | 20/21 | 9/5/2020 | EWE | Initial | |
2 | Carpentry | 19/20 | 9/5/2019 | E | Initial | |
2 | Carpentry | 20/21 | 9/5/2020 | 1/15/2021 | WEE | Initial |
3 | Drafting | 19/20 | 9/5/2019 | E | Initial | |
3 | Drafting | 20/21 | 9/5/2020 | CEE | Continued | |
4 | Hairstyling | 17/18 | 9/5/2017 | E | Initial | |
4 | Hairstyling | 18/19 | 9/5/2018 | CEE | Continued | |
4 | Hairstyling | 19/20 | 9/5/2019 | 1/15/2020 | CWECEE | Continued |
4 | Hairstyling | 20/21 | 9/5/2020 | ECWECEE | Initial |
Thanks PhilC
Why is observation 2 of ID 2 considered "Initial", when the student did not withdraw in the previous year?
According to your description (NOT your "want" data as posted), see this code:
data Have;
length
ID $4
Program $ 13
AcadYear $ 5
StartDate 4
WithdrawDate 4
;
infile datalines dlm='' missover dsd;
input
ID $
Program : $char13.
AcadYear : $char5.
StartDate : yymmdd10.
WithdrawDate : yymmdd10.
;
format StartDate WithdrawDate yymmdd10.;
datalines;
1 AutoMechanics 19/20 2019-09-05 2020-01-15
1 AutoMechanics 20/21 2020-09-05
2 Carpentry 19/20 2019-09-05
2 Carpentry 20/21 2020-09-05 2021-01-15
3 Drafting 19/20 2019-09-05
3 Drafting 20/21 2020-09-05
4 Hairstyling 17/18 2017-09-05
4 Hairstyling 18/19 2018-09-05
4 Hairstyling 19/20 2019-09-05 2020-01-15
4 Hairstyling 20/21 2020-09-05
;
data want;
set have;
by id;
length Intake $ 10;
l_year = lag(acadyear);
if first.id or lag(withdrawdate) ne .
then intake = "Initial";
else intake = "Continued";
run;
Note that the initial data step reads the ID as character (you won't do calculations with it), and the dates as SAS date values. Storing dates as strings is BAD because it deprives you of the use of all the nice tools SAS provides for working with dates.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.