Hello all! I’m a student and I’m having a lot of trouble even beginning this project. Not sure how to even start here. I have thousands of data points.

Essentially I have 5 variables: 3 categorical variables with multiple categories (4+ each), 1 continuous variable, and 1 two-category variable (e.g. something like gender). How do I use ANCOVA to analyze these? I’m so confused. I don’t even know where to begin. Please help! I’m using SAS.
A simple place to start is PROC GLM or PROC GLIMMIX, depending on what assumptions you want to make about the error distribution of the Y (response) variable, which you haven't mentioned. What is your Y variable? What distribution do its errors follow?


Anyway, if the variable Y has normally distributed errors, then use PROC GLM like this:


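proc glm data=have;
     /* categorical predictors go in CLASS; the continuous covariate does not */
     class categoryvariablename1 categoryvariablename2 
         categoryvariablename3 binaryvariablename;
     /* main-effects ANCOVA model; SOLUTION prints the parameter estimates */
     model y=continuousvariable categoryvariablename1 
         categoryvariablename2 categoryvariablename3 binaryvariablename/solution;
     /* covariate-adjusted (least-squares) means for each level */
     lsmeans categoryvariablename1 categoryvariablename2
         categoryvariablename3 binaryvariablename;
run;
quit;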
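
If the errors are not normal, the same model can be moved to PROC GLIMMIX. The sketch below is only an illustration and assumes Y is a count, so it uses a Poisson distribution with a log link. That choice is my assumption, not something stated in the question, so swap DIST= and LINK= for whatever fits your actual Y. The data set and variable names are the same placeholders as in the PROC GLM example.

```sas
proc glimmix data=have;
     /* same categorical predictors as in the PROC GLM example */
     class categoryvariablename1 categoryvariablename2
         categoryvariablename3 binaryvariablename;
     /* DIST=POISSON and LINK=LOG are assumed here purely for illustration */
     model y=continuousvariable categoryvariablename1
         categoryvariablename2 categoryvariablename3 binaryvariablename
         /dist=poisson link=log solution;
     /* ILINK reports the least-squares means back on the original data scale */
     lsmeans categoryvariablename1 categoryvariablename2
         categoryvariablename3 binaryvariablename/ilink;
run;
```

For a yes/no response you would use dist=binary with link=logit instead. The CLASS and LSMEANS statements stay the same.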
