I'd like to automate my SAS code to calculate what percentile (Q1,Q2,Q3,Q4) my teachers spent in the classroom by school year.
My dataset looks like this:
teacher school_year hours_spent_in_classroom
A 2012 350
A 2013 150
B 2014 250
C 2014 350
ALL_TEACHERS 2012 600
I can use proc univariate to know what percentiles are for each year. I could hard code those percentile values into my dataset, and then do "if hours_spent_in_classroom GE X then quartile = 1". But this is a lot of manual work, and it would change by school_year.
I'd like to have a final dataset that looks like this:
teacher school_year hours_spent_in_classroom Quartile
A 2012 350 1
A 2013 150 2
B 2014 250 3
C 2014 350 4
ALL_TEACHERS 2012 600 4
What's the best way to go about this?
Do you actually have individual records for each teacher each year or is that "Total all teachers" actually the value you have in your data?
And are the 350 hours of teacher A included in the 600? Are you attempting to get a common value for the quartiles across years or use a different one per year?
If you have individual records then Proc Rank may do this.You didn't provide enough records to do much with quartiles so I made a few up.
data have;
input teacher $ school_year hours_spent_in_classroom;
datalines;
A 2012 350
B 2012 100
C 2012 150
D 2012 50
A 2013 150
B 2013 300
C 2013 100
D 2013 175
A 2014 50
B 2014 250
C 2014 350
D 2014 200
;
run;
proc sort data=have;
by school_year teacher;
run;
proc rank data=have out=want groups=4 descending;
by school_year;
var hours_spent_in_classroom;
ranks HourRanking;
run;
However if you do not individual hours for all of the teachers then this won't work.
Note that PROC rank has the first rank as 0. If you really want 1 then pass the want data set through a data step to add one to the ranks.
PROC RANK GROUPS=4
Do you actually have individual records for each teacher each year or is that "Total all teachers" actually the value you have in your data?
And are the 350 hours of teacher A included in the 600? Are you attempting to get a common value for the quartiles across years or use a different one per year?
If you have individual records then Proc Rank may do this.You didn't provide enough records to do much with quartiles so I made a few up.
data have;
input teacher $ school_year hours_spent_in_classroom;
datalines;
A 2012 350
B 2012 100
C 2012 150
D 2012 50
A 2013 150
B 2013 300
C 2013 100
D 2013 175
A 2014 50
B 2014 250
C 2014 350
D 2014 200
;
run;
proc sort data=have;
by school_year teacher;
run;
proc rank data=have out=want groups=4 descending;
by school_year;
var hours_spent_in_classroom;
ranks HourRanking;
run;
However if you do not individual hours for all of the teachers then this won't work.
Note that PROC rank has the first rank as 0. If you really want 1 then pass the want data set through a data step to add one to the ranks.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.