How would I run an ANOVA for lab usage by calendar year and academic year and get true variances if many cells are empty? My dataset is sorted by graduation year in a 3-year academic program, with lab hours recorded in the following fashion (the actual dataset is much larger: 70 observations and 29 variables):
Student   Grad  2016  2017  2018  2019  2020
student1  2018    58    83   108
student2  2018    60    86   103
student3  2018    63    80   110
student4  2019          57    84   113
student5  2019          61    79   117
student6  2019          64    82   109
student7  2020                55    80   117
student8  2020                61    82   121
student9  2020                62    87   116
Any and all help is appreciated. Thanks!
ANOVA compares observations, not variables, so you need to transpose your data first. Something like:
data have;
input Student $ Grad y2016 y2017 y2018 y2019 y2020;
datalines;
student1 2018 58 83 108 . .
student2 2018 60 86 103 . .
student3 2018 63 80 110 . .
studenta 2018 59 82 106 . .
student4 2019 . 57 84 113 .
student5 2019 . 61 79 117 .
student6 2019 . 64 82 109 .
studentb 2019 . 60 81 115 .
student7 2020 . . 55 80 117
student8 2020 . . 61 82 121
student9 2020 . . 62 87 116
studentc 2020 . . 57 85 118
;
/* One row per student and calendar year; values land in yr1 */
proc transpose data=have out=temp1 prefix=yr;
by student grad notsorted;
var y2016-y2020;
run;
data temp2;
set temp1;
if not missing(yr1); /* keep enrolled years only; "if yr1" would also drop zero hours */
year = input(substr(_name_,2), 4.);
aYear = year - grad + 3; /* student academic year */
keep student grad year aYear yr1;
rename yr1 = labHours;
run;
proc glm data=temp2;
class aYear year;
model labHours = aYear year / solution; /* main effects only */
lsmeans aYear;
lsmeans year;
run;
quit; /* PROC GLM is interactive and keeps running until QUIT */
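Note: this tests for main effects only. Interactions between year and academic year are not estimable, because many combinations of the two never occur in the data (for example, no student is in academic year 3 in 2016).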
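To see which combinations are actually populated, a quick crosstab of the transposed data can help. This is only a sketch and assumes the temp2 dataset built above, with its aYear and year variables:

```sas
/* Count observations in each academic year by calendar year cell */
proc freq data=temp2;
    tables aYear*year / norow nocol nopercent;
run;
```

With the layout above, only 9 of the 15 aYear-by-year cells should be non-empty (academic year 1 in 2016-2018, year 2 in 2017-2019, year 3 in 2018-2020), which is why the model is limited to main effects.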