Olanike
Fluorite | Level 6

Hi All,

I would be glad if someone could help me out. I am trying to find the effect of climate covariability on yield in a multi-year field study. Specifically, I want to know how maximum temperature, solar radiation, and rainfall influence crop yields, jointly and in isolation, across the years.

I have already performed a simple linear regression that included climate covariability. Now I would like to perform a partial least squares regression to exclude the effects of climate covariability; that is, I want to isolate the effect of a single climate factor by statistically removing the effects of the other controlling factors.

 

Here my purpose is to know the response of the yield to my target climate variable (rainfall). In summary,

 

  1. Remove the influence of the other controlling climate factors (maximum temperature and solar radiation), that is, the crop yield variability that can be explained by temperature and solar radiation.
  2. Calculate the residuals (r1) from regressing crop yield on temperature and solar radiation.
  3. Compute the residuals (r2) from regressing the target variable (rainfall) on the controlling variables (temperature and solar radiation).
  4. Fit a linear regression of r1 on r2.
  5. Take the sensitivity of crop yield to the target variable (rainfall) as the slope of this partial regression.
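
For readers who want to see the arithmetic, the five steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the data are synthetic, the helper names (`solve`, `ols`, `residuals`) are made up for this sketch, and it is not the SAS code attached to the post. It also shows a known property of this procedure: the slope of r1 on r2 is the same number as the rainfall coefficient in a multiple regression of yield on all three variables.

```python
# Sketch of the residual-on-residual (partial regression) steps in plain Python.
# The data are SYNTHETIC placeholders, not the poster's field measurements.

def mean(v):
    return sum(v) / len(v)

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(columns, y):
    """Least-squares fit of y on the predictor columns plus an intercept.
    Returns [intercept, slope_1, slope_2, ...]."""
    mx = [mean(c) for c in columns]
    my = mean(y)
    n, p = len(y), len(columns)
    # Work with centered data so the normal equations are better conditioned.
    Xc = [[columns[j][i] - mx[j] for j in range(p)] for i in range(n)]
    yc = [v - my for v in y]
    XtX = [[sum(row[a] * row[b] for row in Xc) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * v for row, v in zip(Xc, yc)) for a in range(p)]
    slopes = solve(XtX, Xty)
    return [my - sum(s * m for s, m in zip(slopes, mx))] + slopes

def residuals(columns, y):
    """Observed y minus the OLS fitted values."""
    b = ols(columns, y)
    return [y[i] - b[0] - sum(b[j + 1] * columns[j][i] for j in range(len(columns)))
            for i in range(len(y))]

# Synthetic yearly values (hypothetical units: deg C, MJ/m2/day, mm, t/ha).
tmax = [30.1, 31.4, 29.8, 32.0, 30.7, 31.9, 29.5, 30.9]
srad = [18.2, 19.5, 17.9, 20.1, 18.0, 19.0, 18.8, 19.9]
rain = [620.0, 540.0, 700.0, 480.0, 610.0, 500.0, 690.0, 580.0]
yld = [3.9, 3.4, 4.3, 3.0, 3.8, 3.2, 4.2, 3.6]

r1 = residuals([tmax, srad], yld)    # step 2: yield with the controls removed
r2 = residuals([tmax, srad], rain)   # step 3: rainfall with the controls removed
partial_slope = ols([r2], r1)[1]     # steps 4-5: slope of r1 on r2

# The same number is the rainfall coefficient in the full multiple regression.
full_coefs = ols([tmax, srad, rain], yld)
print(partial_slope, full_coefs[3])
```

Because the two printed numbers agree, the residual procedure gives nothing beyond what the full OLS model already reports for rainfall, and it does not remove the correlation among the climate variables.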

Please find attached my code and the results. I feel I am missing some vital outputs based on my syntax. I am using SAS version 9.3.

8 REPLIES
PaigeMiller
Diamond | Level 26

@Olanike wrote:

Now I would like to perform a partial least squares regression to exclude the effects of climate covariability; that is, I want to isolate the effect of a single climate factor by statistically removing the effects of the other controlling factors.


Either I'm not understanding what you want to do, or you don't understand what partial least squares regression does, and I think it is the latter. The general usage of PLS is to allow you to put all of the independent variables into the model, and the fitted model will give you "better" model fits and predicted values in the presence of correlation between the independent variables than an ordinary least squares regression will. There is no concept in PLS of removing the effects of other controlling factors. By "better", I mean that PLS will give smaller mean square error of the regression coefficients and predicted values than OLS will give when fitting the same model, as was shown in this article: http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033
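
To make the idea of a PLS factor concrete, here is a minimal sketch of PLS with a single response (PLS1), using textbook NIPALS-style steps in plain Python. It is an illustration under assumptions, not a reproduction of PROC PLS: the data are synthetic, the function name `pls1` and the dictionary keys are invented for this sketch, and the predictors are standardized as a modelling choice. It shows that each factor is a weighted combination of all the predictors, and how the share of response variance explained by each factor can be computed.

```python
# Plain-Python sketch of PLS with one response (PLS1), NIPALS-style steps.
# SYNTHETIC data; names are illustrative and not taken from PROC PLS output.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def standardize(v):
    """Center to mean 0 and scale to unit sample standard deviation."""
    m = sum(v) / len(v)
    sd = (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5
    return [(x - m) / sd for x in v]

def pls1(columns, y, n_factors):
    """Return, per factor: weights, scores, and share of y-variance explained."""
    X = [standardize(c) for c in columns]   # X[j][i]: predictor j, observation i
    yc = standardize(y)
    n, p = len(yc), len(X)
    yy = dot(yc, yc)
    factors = []
    for _ in range(n_factors):
        # Weights: covariance of each (deflated) predictor with the response.
        w = [dot(col, yc) for col in X]
        norm = dot(w, w) ** 0.5
        w = [wj / norm for wj in w]
        # Scores: one weighted combination of ALL predictors per observation.
        t = [sum(w[j] * X[j][i] for j in range(p)) for i in range(n)]
        tt = dot(t, t)
        # Regress y on the scores, then remove this factor from X (deflation).
        q = dot(yc, t) / tt
        load = [dot(col, t) / tt for col in X]
        X = [[X[j][i] - t[i] * load[j] for i in range(n)] for j in range(p)]
        factors.append({"weights": w, "scores": t, "y_explained": q * q * tt / yy})
    return factors

# Synthetic yearly values (hypothetical units: deg C, MJ/m2/day, mm, t/ha).
tmax = [30.1, 31.4, 29.8, 32.0, 30.7, 31.9, 29.5, 30.9]
srad = [18.2, 19.5, 17.9, 20.1, 18.0, 19.0, 18.8, 19.9]
rain = [620.0, 540.0, 700.0, 480.0, 610.0, 500.0, 690.0, 580.0]
yld = [3.9, 3.4, 4.3, 3.0, 3.8, 3.2, 4.2, 3.6]

factors = pls1([tmax, srad, rain], yld, n_factors=2)
for k, f in enumerate(factors, start=1):
    weights = [round(x, 3) for x in f["weights"]]
    pct = 100 * f["y_explained"]
    print("factor", k, "weights (tmax, srad, rain):", weights,
          "y-variance explained (%):", round(pct, 1))
```

Note that every predictor enters every factor; nothing in these steps sets one variable aside as the "target" and removes the others.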

 

 

--
Paige Miller
PaigeMiller
Diamond | Level 26

Adding ...

 

I do understand the idea of removing the effects of other controlling factors, and you can certainly perform calculations to do this; but this will not eliminate the problem of having independent variables correlated with one another. If the correlations are high, you may think you have eliminated the effects of other controlling factors, but I don't think the math really accomplishes this.

--
Paige Miller
Olanike
Fluorite | Level 6

Thank you so much PaigeMiller. I went through the material you sent and got lots of information.

 

1. Do you have any idea how I can eliminate the effects of the other controlling factors? I mean the actual calculations. My target is rainfall, and I want to eliminate the effects of temperature and solar radiation.

 

2. How do I interpret the results in my last attached file? I have 3 factors, but I don't know which is which.

 

 

PaigeMiller
Diamond | Level 26

@Olanike wrote:

Thank you so much PaigeMiller. I went through the material you sent and got lots of information.

 

1. Do you have any idea how I can eliminate the effects of the other controlling factors? I mean the actual calculations. My target is rainfall, and I want to eliminate the effects of temperature and solar radiation.

 

2. How do I interpret the results in my last attached file? I have 3 factors, but I don't know which is which.

 

 


I have a very different philosophy for analyzing this data. I would not even try to eliminate the effects of the other factors.

 

The results show that the first factor (or dimension) explains 45.9303% of the response variability, and the remaining factors explain very little. If you look at the loadings in dimension 1, all of the five input variables are roughly equal in importance.

--
Paige Miller
Olanike
Fluorite | Level 6

Thanks, PaigeMiller. Your comments are highly valued. I assume that the loadings in the dimensions are my R-squared values, irrespective of the sign (+ or -).

 

In OLS, I was able to find the effect of one parameter (model Yield = rainfall) and of multiple parameters (model Yield = rainfall Temperature), and I got the R-squared for the two models. Do you think I can try this in PLS?

 

Most importantly, do you think I have used the best approach for this data? If not, what would you suggest?

 

 

Thanks.

PaigeMiller
Diamond | Level 26

The loadings are not R-squared. They are the "importance" of each of the independent variables in that specific PLS dimension.

 

There is no such thing as an R-squared for each independent variable, there is only such a thing as an R-squared for the entire model.
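
As a rough illustration of the model-level definition, here is a minimal sketch in plain Python that computes R-squared for the two OLS models mentioned earlier in the thread (Yield = rainfall, and Yield = rainfall Temperature). The data are synthetic and the function names `r2_one` and `r2_two` are invented for this sketch. In both cases R-squared is the explained sum of squares of the whole fitted model divided by the total sum of squares of the response.

```python
# Sketch: R-squared is computed from a whole fitted model.
# SYNTHETIC data; function and variable names are illustrative only.

def centered(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def r2_one(x, y):
    """R-squared of the OLS model y = b0 + b1*x."""
    xc, yc = centered(x), centered(y)
    b1 = dot(xc, yc) / dot(xc, xc)
    # Explained sum of squares divided by total sum of squares.
    return b1 * dot(xc, yc) / dot(yc, yc)

def r2_two(x1, x2, y):
    """R-squared of the OLS model y = b0 + b1*x1 + b2*x2 (closed-form 2x2 solve)."""
    a, b, yc = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # Explained sum of squares divided by total sum of squares.
    return (b1 * s1y + b2 * s2y) / dot(yc, yc)

# Synthetic yearly values (hypothetical units: deg C, mm, t/ha).
tmax = [30.1, 31.4, 29.8, 32.0, 30.7, 31.9, 29.5, 30.9]
rain = [620.0, 540.0, 700.0, 480.0, 610.0, 500.0, 690.0, 580.0]
yld = [3.9, 3.4, 4.3, 3.0, 3.8, 3.2, 4.2, 3.6]

r2_rain = r2_one(rain, yld)            # model Yield = rainfall
r2_both = r2_two(rain, tmax, yld)      # model Yield = rainfall Temperature
print(r2_rain, r2_both)
```

Each call returns one number for one model. When the predictors are correlated, as they are in this synthetic example, the change in R-squared from adding a variable depends on which other variables are already in the model.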

 

It simply doesn't make sense to me to speak of the effect of a variable "controlling for other variables" in this case. Because the independent variables are correlated with one another, they don't have an effect that is independent of the other variables. You can't get that even with some tricky math that you think will get you there; the effect of a variable "controlling for other variables" does not exist in this case.

 

I would fit a PLS model involving all of your independent variables, and if that model fits well enough, that is the model I would use. If it doesn't fit well, then you need to try additional things, depending on your data, which I don't have, so I can't really advise.

--
Paige Miller
Olanike
Fluorite | Level 6

Hi PaigeMiller, thanks for your response. Please find my data attached for further suggestions. In the meantime, I am working on using the PLS results. Thanks.

PaigeMiller
Diamond | Level 26

Well, no thank you. I'm not here to do your modeling for you.

 

 

--
Paige Miller

