
# how to determine relationship between two dependent variables

I want to model the relationship between two dependent variables. I can do that easily using PROC REG, but I also want to test this relationship at different levels of the categorical variables involved. I would also like to plot the relationship for each individual level of the categorical variable, including the R2 value and significance level. Please help me build a model for this.

## Re: how to determine relationship between two dependent variables

Posted in reply to Bhupinder
Since you are talking about two DEPENDENT variables, you need to test the linear models in the GLM procedure. You can set up your model much as you would in REG. To actually look at the relationship between the two dependent variables, you will need to use CONTRAST statements. That will also allow you to get the R-square by level of the categorical variable. There are several examples in the documentation similar to the situation you described.

## Re: how to determine relationship between two dependent variables

The dependent variables are plant_height1 and yield. The categorical variables are tillplc, Prate, and Krate with their different levels.

I am trying to do something like this:

proc glm data = plant_height1;
model yield = pl_height1;
by tillplc prate krate;
run;

My concern is to test the relationship at each level of tillplc. I don't know if I can do it this way. Once I have these models, I want to plot the relationship for each individual level of the categorical variable, including the R2 value and significance level.

PROC GPLOT DATA=plant_height1;
PLOT Pl_height1*Yield; by tillplc;
RUN;
QUIT;

But then how do I get R2 on this plot?

Thanks
Bhupinder

## Re: how to determine relationship between two dependent variables

Posted in reply to Bhupinder
This is much more of a statistical modelling question, and the topic of an entire course in linear models. You are treating pl_height1 as an INDEPENDENT variable in your model example, rather than as the dependent variable you initially said it was. By using the BY statement, you are not adjusting for the effect of the categorical variables, merely testing the linear relationship within each combination.

You can do this with REG; you do not have to use GLM (GLM would be required with two dependent variables, but with one of each you can use REG). Look at ODS Statistical Graphics; one of the plots there may give you a graph you would find useful (Figure 21.3):
http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_odsgraph_sec... .
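
As a rough sketch of the REG approach (not a tested program), something along these lines could fit one simple regression per level of tillplc and request the ODS Graphics fit plot, which for a single-regressor model normally carries an inset with R-square. The data set and variable names are taken from the posts above; `ph_sorted` is a hypothetical name for the sorted copy, and the sort is included because BY-group processing expects the data to be ordered by the BY variable.

```sas
/* Sketch only: names come from the thread; ph_sorted is a hypothetical output data set */
ods graphics on;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=plant_height1 out=ph_sorted;
  by tillplc;
run;

/* One simple linear regression and one fit plot per level of tillplc */
proc reg data=ph_sorted plots(only)=fitplot;
  by tillplc;
  model yield = pl_height1;
run;
quit;
```

The significance of the slope for each level appears in the Parameter Estimates table that REG prints for every BY group.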

If you want to control for the categorical variables, you are either back to GLM (where you can use a CLASS statement), or you need to recode the categorical variables as a series of binary variables. However, in GLM you lose the R-squared part of the plot. Either way, you now have a host of other modelling issues (which gets to my "entire course" comment).
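
A minimal sketch of the GLM route, assuming that "testing at each level" means asking whether the slope of yield on pl_height1 differs across levels of tillplc. The interaction term is an assumption added for illustration and is not something stated in the posts above; Prate and Krate are left out only to keep the example short.

```sas
/* Sketch only: the interaction term is an illustrative assumption */
proc glm data=plant_height1;
  class tillplc;                        /* treat tillplc as categorical */
  model yield = tillplc                 /* separate intercept per level */
                pl_height1              /* common slope */
                pl_height1*tillplc      /* does the slope differ by level? */
                / solution;             /* print parameter estimates */
run;
quit;
```

The F test for `pl_height1*tillplc` is the usual check for unequal slopes. The R-square that GLM reports applies to the whole model, not to each level separately.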

You can use the approach you described for a first look at your data, but you will likely want to consult a statistician before diving too deeply into the modelling. You may benefit from talking to a statistician who specializes in agriculture, as they use some techniques that I (a biostatistician) haven't used since grad school.

Doc Muhlbaier
Duke