Hi....I am trying to identify records that are either "Initial" or "Continued' based on the by grouping variables. From the code I am using below, I am getting only the first row as 'Initial' and because it is possible to have duplicates based on the by grouping variables, I would like these duplicates to indicate 'Initial'. Not sure how to do this...thanks.
data List100; set List100; length Intake $10; by StudentUID Programs Level AcademicYearBeg2; if (first.StudentUID and first.Programs and first.Level and first.AcademicYearBeg2) then Intake = 'Initial'; else Intake = 'Continued'; run;
Hi...Continued refers to when the student will continue to take the same program in the following academic year/s whereas Initial refers to the first time the student is enrolled in the program. Thanks.
data have; length StudentUID 8 Programs $ 18 Level $ 1 Term $ 6 AcademicYearBeg2 $ 7 Intake $ 9; format StudentUID best12. Programs $char18. Level $char1. Term $char6. AcademicYearBeg2 $char7. Intake $char9.; informat StudentUID best12. Programs $char18. Level $char1. Term $char6. AcademicYearBeg2 $char7. Intake $char9.; infile datalines4 dlm='7f'x missover dsd; input StudentUID : best32. Programs : $char18. Level : $char1. Term : $char6. AcademicYearBeg2 : $char7. Intake : $char9.; datalines4; 67684Industrial Welding Winter2016/17Initial 67684Industrial Welding Spring2016/17Continued 67684Industrial Welding Fall2017/18Continued 67685AutoMechanics Fall2016/17Initial 67685AutoMechanics Winter2016/17Continued 67685AutoMechanics Spring2016/17Continued 67685AutoMechanics Fall2017/18Continued ;;;; Want: StudentUID Programs Level Term AcademicYearBeg2 Intake 67684 Industrial Welding Winter 2016/17 Initial 67684 Industrial Welding Spring 2016/17 Initial 67684 Industrial Welding Fall 2017/18 Continued 67685 AutoMechanics Fall 2016/17 Initial 67685 AutoMechanics Winter 2016/17 Initial 67685 AutoMechanics Spring 2016/17 Initial 67685 AutoMechanics Fall 2017/18 Continued
Example data please.
And what the expected output should be for that example.
Hint: only include variables needed and enough records to show different types of use cases.
What 'continues'? I might think the year would be a candidate. Which, from your BY statement might mean that you actually don't want First.AcademicBegYear2 and almost certainly do not want First.StudentUID
Hi...Continued refers to when the student will continue to take the same program in the following academic year/s whereas Initial refers to the first time the student is enrolled in the program. Thanks.
data have; length StudentUID 8 Programs $ 18 Level $ 1 Term $ 6 AcademicYearBeg2 $ 7 Intake $ 9; format StudentUID best12. Programs $char18. Level $char1. Term $char6. AcademicYearBeg2 $char7. Intake $char9.; informat StudentUID best12. Programs $char18. Level $char1. Term $char6. AcademicYearBeg2 $char7. Intake $char9.; infile datalines4 dlm='7f'x missover dsd; input StudentUID : best32. Programs : $char18. Level : $char1. Term : $char6. AcademicYearBeg2 : $char7. Intake : $char9.; datalines4; 67684Industrial Welding Winter2016/17Initial 67684Industrial Welding Spring2016/17Continued 67684Industrial Welding Fall2017/18Continued 67685AutoMechanics Fall2016/17Initial 67685AutoMechanics Winter2016/17Continued 67685AutoMechanics Spring2016/17Continued 67685AutoMechanics Fall2017/18Continued ;;;; Want: StudentUID Programs Level Term AcademicYearBeg2 Intake 67684 Industrial Welding Winter 2016/17 Initial 67684 Industrial Welding Spring 2016/17 Initial 67684 Industrial Welding Fall 2017/18 Continued 67685 AutoMechanics Fall 2016/17 Initial 67685 AutoMechanics Winter 2016/17 Initial 67685 AutoMechanics Spring 2016/17 Initial 67685 AutoMechanics Fall 2017/18 Continued
Some more questions.
None of your example data has any value for LEVEL. So what actual role does this play in the process? If it has one you need to include data that shows it. Otherwise I would say that your need for using LEVEL in the BY process is not needed.
For the example you show you also do not want TERM in the by process though the data likely needs to be in the correct order.
For the example data this seems to do what is requested. Note that I added "newintake" as the calculated value and left in your example data set intake.
data want; set have ; by StudentUID Programs AcademicYearBeg2; length newintake $ 10.; comp= lag(AcademicYearBeg2); if first.Programs then newintake='Initial'; else if AcademicYearBeg2=comp then newintake='Initial'; else newintake='Continued'; drop comp; run;
Since you comparison is based on the AcademicYearBeg2 using the Lag function to get the previous for comparison with the current.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.