Proc PLS syntax and interpretation of results
03-23-2017 12:38 AM

Hi All,

I would be glad if someone could help me out. I am trying to find the effect of climate covariability on yield in multi-year field work. To be specific, I want to know how maximum temperature, solar radiation and rainfall influence crop yields, jointly and in isolation, across the years.

I have already performed a simple linear regression that included climate covariability. Now I would like to perform a **partial least regression** to exclude the effects of climate covariability; that is, I want to isolate the effect of a single climate factor by statistically removing the effects of the other controlling factors.

Here my purpose is to know the response of the yield to my target climate variable (rainfall). In summary,

- I want to remove the influence of the other controlling climate factors (maximum temperature and solar radiation), that is, the crop yield variability that can be explained by temperature and solar radiation.
- Calculate the residuals (r1) from regressing crop yield against temperature and solar radiation.
- Compute the residuals (r2) from regressing the target variable (rainfall) against the controlling variables (i.e. temperature and solar radiation).
- Calculate the linear regression of r1 on r2.
- Then compute the sensitivity of crop yield to the target variable (rainfall) as the slope of this partial regression.
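The residual steps listed above can be sketched in a few lines. This is only an illustration in plain Python rather than SAS, and it is not the poster's attached code: the eight data rows are made-up placeholder values, and the helper names (`solve`, `ols`, `residuals`) are hypothetical. It also fits the full multiple regression, because the slope of r1 on r2 is algebraically the same number as the rainfall coefficient in a regression of yield on all three variables.

```python
# Made-up illustrative data (NOT the poster's data): 8 seasons.
tmax = [30.1, 31.4, 29.8, 32.0, 30.7, 31.9, 29.5, 30.9]   # max temperature
srad = [18.2, 19.5, 17.9, 20.1, 18.8, 19.0, 18.5, 19.9]   # solar radiation
rain = [8.20, 7.60, 9.05, 7.00, 8.40, 7.35, 8.80, 7.90]   # rainfall, units of 100 mm
yield_ = [2.9, 2.6, 3.3, 2.2, 3.0, 2.5, 3.1, 2.7]         # crop yield

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] of y on the columns."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

def residuals(columns, y):
    """Residuals of y after regressing it on the columns (with intercept)."""
    beta = ols(columns, y)
    return [yi - (beta[0] + sum(b * v for b, v in zip(beta[1:], vals)))
            for vals, yi in zip(zip(*columns), y)]

controls = [tmax, srad]
r1 = residuals(controls, yield_)   # yield with temperature and radiation removed
r2 = residuals(controls, rain)     # rainfall with temperature and radiation removed
sensitivity = ols([r2], r1)[1]     # slope of r1 on r2

# Same quantity from the full multiple regression (rainfall is the 3rd predictor).
full = ols([tmax, srad, rain], yield_)
print("partial-regression slope:", sensitivity)
print("rainfall coefficient in full model:", full[3])
```

Because the two printed numbers agree, the residual procedure does not remove collinearity among the climate variables; it reproduces the ordinary multiple-regression coefficient for rainfall.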

Please find attached my code and the results. I feel I am missing some vital outputs based on my syntax. I am using SAS version 9.3.


Posted in reply to Olanike

03-23-2017 08:33 AM - edited 03-23-2017 09:05 AM

Olanike wrote:

Now I will like to perform a **partial least regression** to exclude the effects of climate coverability, which means I want to isolate the effects of a single climate factor by removing statistically the effects of other controlling factors.

Either I'm not understanding what you want to do, or you don't understand what **partial least squares regression** does, and I think it is the latter. The general usage of PLS is to allow you to put *all* of the independent variables into the model, and the fitted model will give you "better" model fits and predicted values in the presence of correlation between the independent variables than an ordinary least squares regression will. There is no concept in PLS of removing the effects of other controlling factors. By "better", I mean that PLS will give smaller mean square error of the regression coefficients and predicted values than OLS will give when fitting the same model, as was shown in this article: http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033

--

Paige Miller



Posted in reply to Olanike

03-23-2017 09:03 AM

Adding ...

I do understand the idea of removing the effects of other controlling factors, and you can certainly perform calculations to do this; but this will not eliminate the problem of having independent variables correlated with one another. If the correlations are high, you may think you have eliminated the effects of other controlling factors, but I don't think the math really accomplishes this.

--

Paige Miller



Posted in reply to PaigeMiller

03-23-2017 11:04 AM

Thank you so much PaigeMiller. I went through the material you sent and got lots of information.

1. Do you have any idea how I can eliminate the effects of the other controlling factors? I mean the calculations. My target is rainfall, and I want to eliminate the effects of temperature and solar radiation.

2. How do I interpret the results in my last attached file? I have 3 factors but I don't know which is which.


Posted in reply to Olanike

03-23-2017 12:56 PM

Olanike wrote:

Thank you so much PaigeMiller. I went through the material you sent and got lots of information.

1. Do you have any idea how I can eliminate the effects of the other controlling factors? I mean the calculations. My target is rainfall, and I want to eliminate the effects of temperature and solar radiation.

2. How do I interpret the results in my last attached file? I have 3 factors but I don't know which is which.

I have a very different philosophy for analyzing this data. I would not even try to eliminate the effects of the other factors.

The results show that the first factor (or dimension) explains 45.9303% of the response variability, and the remaining factors explain very little. If you look at the loadings in dimension 1, all of the five input variables are roughly equal in importance.
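To make "the first factor explains a percentage of the response variability" concrete, here is a minimal sketch of a single PLS factor for one response, written in plain Python rather than SAS. The data are made-up placeholders (three climate variables, not the five inputs in the attached results), every variable name is hypothetical, and the variables are centered and scaled on the assumption that this matches what the procedure did. The normalization of weights and loadings may differ from what PROC PLS prints, so treat this as an illustration of the idea, not a reproduction of the output.

```python
# Made-up illustrative data (NOT the poster's data): 8 seasons.
tmax = [30.1, 31.4, 29.8, 32.0, 30.7, 31.9, 29.5, 30.9]
srad = [18.2, 19.5, 17.9, 20.1, 18.8, 19.0, 18.5, 19.9]
rain = [8.20, 7.60, 9.05, 7.00, 8.40, 7.35, 8.80, 7.90]
yield_ = [2.9, 2.6, 3.3, 2.2, 3.0, 2.5, 3.1, 2.7]

def center_scale(col):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in col]

X = [center_scale(c) for c in (tmax, srad, rain)]   # list of predictor columns
y = center_scale(yield_)
n = len(y)

# Weights: direction in predictor space with maximal covariance with y.
w = [sum(xi * yi for xi, yi in zip(col, y)) for col in X]
norm = sum(v * v for v in w) ** 0.5
w = [v / norm for v in w]

# Scores: each observation's value on the first factor.
t = [sum(wj * col[i] for wj, col in zip(w, X)) for i in range(n)]
tt = sum(v * v for v in t)

# Loadings: regression of each predictor (p) and of the response (q) on t.
p = [sum(xi * ti for xi, ti in zip(col, t)) / tt for col in X]
q = sum(yi * ti for yi, ti in zip(y, t)) / tt

# Share of the response variation reproduced by this one factor.
pct_y = 100.0 * q * q * tt / sum(yi * yi for yi in y)

for name, wj, pj in zip(("tmax", "srad", "rain"), w, p):
    print(f"{name}: weight={wj:+.3f} loading={pj:+.3f}")
print(f"response variation explained by factor 1: {pct_y:.1f}%")
```

Loadings of similar magnitude across the predictors mean that the factor is a roughly even blend of them, which is the sense in which the inputs are "roughly equal in importance" in that dimension.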

--

Paige Miller



Posted in reply to PaigeMiller

03-23-2017 03:22 PM

Thanks, PaigeMiller. Your comments are highly valued. I assume that the loadings in the dimensions are my R-squared values, irrespective of the sign (+ or -).

In OLS, I was able to find the effect of one parameter (model Yield = rainfall) and of multiple parameters (model Yield = rainfall Temperature), and I got the R-squared for the two models. Do you think I can try this in PLS?

Most importantly, do you think I have used the best approach for this data? If not, what would you suggest?

Thanks.


Posted in reply to Olanike

03-23-2017 04:26 PM

The loadings are not R-squared. They are the "importance" of each of the independent variables in that specific PLS dimension.

There is no such thing as an R-squared for each independent variable, there is only such a thing as an R-squared for the entire model.

It simply doesn't make sense to me to speak of the effect of a variable "controlling for other variables" in this case. Because the independent variables are correlated with one another, they don't have an effect that is independent of the other variables. You can't get that even if you do some tricky math that you think will get you there; the effect of a variable "controlling for other variables" does not exist in this case.

I would fit a PLS model involving all of your independent variables, and if that model fits well enough, then that's what I would do. If it doesn't fit well, then you need to try additional things, depending on your data, which I don't have, so I can't really advise.

--

Paige Miller



Posted in reply to PaigeMiller

03-24-2017 12:04 AM

Hi PaigeMiller, thanks for your response. Please find attached my data for further suggestions. In the meantime, I am working on using the PLS results. Thanks.