Edited to add additional requirements. Sorry for the confusion.
I have a data set that looks like below:
Sample data:
Id | Term | Audit_type | Prog | Level | Load |
1 | Fall 2015 | 1 | ABCD | 01 | F |
1 | Fall 2015 | 2 | ABCD | 01 | F |
1 | Winter 2016 | 1 | ABCD | 02 | F |
1 | Winter 2016 | 2 | ABCD | 02 | F |
1 | Fall 2016 | 1 | ABCD | 03 | F |
1 | Fall 2016 | 2 | ABCD | 03 | F |
1 | Winter 2017 | 1 | ABCD | 04 | F |
1 | Winter 2017 | 2 | ABCD | 04 | F |
2 | Fall 2012 | 1 | LMNO | 01 | F |
2 | Fall 2012 | 2 | LMNO | 01 | F |
2 | Winter 2013 | 1 | LMNO | 02 | F |
2 | Winter 2013 | 2 | LMNO | 02 | F |
2 | Spring 2013 | 1 | LMNO | 03 | P |
2 | Fall 2013 | 1 | LMNO | 03 | F |
2 | Fall 2013 | 2 | LMNO | 03 | F |
2 | Winter2014 | 1 | WXYZ | 01 | F |
2 | Spring 2014 | 1 | PQRS | 01 | P |
2 | Spring 2014 | 2 | PQRS | 01 | P |
2 | Fall 2014 | 1 | PQRS | 01 | P |
2 | Winter 2015 | 1 | PQRS | 01 | F |
2 | Winter 2015 | 2 | PQRS | 01 | F |
3 | Fall 2016 | 1 | EFGH | 01 | F |
3 | Fall 2016 | 2 | EFGH | 01 | F |
3 | Winter 2017 | 1 | EFGH | 02 | F |
3 | Spring 2017 | 1 | EFGH | 02 | P |
3 | Spring 2017 | 2 | EFGH | 02 | P |
3 | Fall 2017 | 1 | EFGH | 02 | F |
3 | Fall 2017 | 2 | EFGH | 02 | F |
What I need is to select the next two records, for when the student comes back after level '01' in the first term, such that the audit_type is always '2'. And this has to be for every id, term, program combination.
For id #2, the student continues and moves forward in the first program. The second program is discontinued and s/he changes it to a third program, which is continued. I need to bring in information for the first and the third programs both.
So basically, my selection needs to be as below:
Desired Solution:
Id | Term | Audit_type | Prog | Level | Load |
1 | Winter 2016 | 2 | ABCD | 02 | F |
1 | Fall 2016 | 2 | ABCD | 03 | F |
2 | Winter 2013 | 2 | LMNO | 02 | F |
2 | Fall 2013 | 2 | LMNO | 03 | F |
2 | Spring 2014 | 2 | PQRS | 01 | P |
2 | Winter 2015 | 2 | PQRS | 01 | F |
3 | Spring 2017 | 2 | EFGH | 02 | P |
3 | Fall 2017 | 2 | EFGH | 02 | F |
I would appreciate it if someone could please help. Thanks in advance.
This might look easier in the morning, but it should work:
data want;
set have;
by id term notsorted;
if first.id then do;
term_count=1;
output_count=0;
end;
else if first.term then term_count + 1;
if term_count > 1 and output_count < 2 and audit_type=2 then do;
output;
output_count + 1;
end;
drop term_count output_count;
run;
@Astounding It does work, thank you very much. But I have an added layer of requirement that I had forgotten to include in the sample data. When I tested your code, that's when I realized my mistake. The sample data and the desired outcome have been edited. I am sorry for the over sight.
If I understand the new requirements properly, a very similar program can do the trick.
proc sort data=have;
by id prog;
run;
data want;
set have;
by id prog term notsorted;
if first.prog then do;
term_count=1;
output_count=0;
end;
else if first.term then term_count + 1;
if term_count > 1 and output_count < 2 and audit_type=2 then do;
output;
output_count + 1;
end;
drop term_count output_count;
run;
It's untested at this point, so see what you get from it.
Like this?
data WANT;
set HAVE;
by ID;
if first.ID then call missing(AFTER_FIRST,OUTPUT_NO);
if ID=lag(ID) and TERM ne lag(TERM) then AFTER_FIRST+1;
if AUDIT_TYPE=2 & AFTER_FIRST then do;
OUTPUT_NO+1;
if OUTPUT_NO in (1,2) then output;
end;
run;
id | Term | y | Audit_type | Prog | Level | Load |
1 | Winter | 2016 | 2 | ABCD | 2 | F |
1 | Fall | 2016 | 2 | ABCD | 3 | F |
2 | Spring | 2014 | 2 | PQRS | 1 | P |
2 | Winter | 2015 | 2 | PQRS | 1 | F |
3 | Spring | 2017 | 2 | EFGH | 2 | P |
3 | Fall | 2017 | 2 | EFGH | 2 | F |
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.