Proc PLS syntax and interpretation of results


Posted 03-23-2017 12:38 AM

Hi All,

I would be glad if someone could help me out. I am trying to find the effect of climate covariability on yield in a multi-year field study. Specifically, I want to know how maximum temperature, solar radiation, and rainfall influence crop yields, jointly and in isolation, across the years.

I have already performed a simple linear regression that included climate covariability. Now I would like to perform a **partial least squares regression** to exclude the effects of climate covariability; that is, I want to isolate the effect of a single climate factor by statistically removing the effects of the other controlling factors.

Here my purpose is to know the response of the yield to my target climate variable (rainfall). In summary,

- Remove the influence of the other controlling climate factors (maximum temperature and solar radiation), that is, the crop yield variability that can be explained by temperature and solar radiation.
- Calculate the residuals (r1) from regressing crop yield on temperature and solar radiation.
- Compute the residuals (r2) from regressing the target variable (rainfall) on the controlling variables (temperature and solar radiation).
- Fit a linear regression of r1 on r2.
- Take the sensitivity of crop yield to the target variable (rainfall) as the slope of this partial regression.
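The residual-on-residual steps listed above can be sketched as follows. This is only a minimal illustration in Python with NumPy, not SAS and not the poster's code; the data are synthetic, and the names `tmax`, `srad`, `rain`, and `crop_yield` are assumptions made up for the example. By the Frisch-Waugh-Lovell theorem, the slope from the last step should equal the rainfall coefficient from an ordinary multiple regression of yield on all three variables, which the final lines compare.

```python
import numpy as np

# Synthetic, made-up data for illustration only (not the poster's data).
rng = np.random.default_rng(0)
n = 40
tmax = rng.normal(30.0, 2.0, n)                          # maximum temperature
srad = 0.8 * tmax + rng.normal(0.0, 1.0, n)              # solar radiation, correlated with tmax
rain = 500.0 - 10.0 * tmax + rng.normal(0.0, 20.0, n)    # rainfall, correlated with tmax
crop_yield = (2.0 + 0.01 * rain - 0.05 * tmax + 0.03 * srad
              + rng.normal(0.0, 0.2, n))

def residuals(y, X):
    """Residuals from an OLS fit of y on X (X must already hold an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

# Controlling variables plus an intercept.
controls = np.column_stack([np.ones(n), tmax, srad])

r1 = residuals(crop_yield, controls)   # yield with temperature and radiation removed
r2 = residuals(rain, controls)         # rainfall with temperature and radiation removed

# Slope of r1 on r2; both residual vectors have mean zero, so no intercept is needed.
partial_slope = (r2 @ r1) / (r2 @ r2)

# Rainfall coefficient from the full multiple regression, for comparison.
full = np.column_stack([np.ones(n), rain, tmax, srad])
beta_full, *_ = np.linalg.lstsq(full, crop_yield, rcond=None)

print("partial-regression slope:      ", partial_slope)
print("multiple-regression rain coef.:", beta_full[1])
```

The two printed numbers agree up to floating-point error, so the partial-regression slope carries the same information as the rainfall coefficient in a multiple regression and inherits the same sensitivity to correlated predictors.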

Please find attached my code and the results. I feel I am missing some vital outputs based on my syntax. I am using SAS version 9.3.


@Olanike wrote:

Now I would like to perform a **partial least squares regression** to exclude the effects of climate covariability; that is, I want to isolate the effect of a single climate factor by statistically removing the effects of the other controlling factors.

Either I'm not understanding what you want to do, or you don't understand what **partial least squares regression** does, and I think it is the latter. The general usage of PLS is to allow you to put *all* of the independent variables into the model, and the fitted model will give you "better" model fits and predicted values in the presence of correlation between the independent variables than an ordinary least squares regression will. There is no concept in PLS of removing the effects of other controlling factors. By "better", I mean that PLS will give smaller mean square error of the regression coefficients and predicted values than OLS will give when fitting the same model, as was shown in this article: http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033

--

Paige Miller


Adding ...

I do understand the idea of removing the effects of other controlling factors, and you can certainly perform calculations to do this; but this will not eliminate the problem of having independent variables correlated with one another. If the correlations are high, you may think you have eliminated the effects of other controlling factors, but I don't think the math really accomplishes this.

--

Paige Miller


Thank you so much PaigeMiller. I went through the material you sent and got lots of information.

1. Do you have any idea how I can eliminate the effects of the other controlling factors (I mean the calculations)? My target is rainfall, and I want to eliminate the effects of temperature and solar radiation.

2. How do I interpret the results in my last attached file? I have 3 factors, but I don't know which is which.



I have a very different philosophy for analyzing this data. I would not even try to eliminate the effects of the other factors.

The results show that the first factor (or dimension) explains 45.9303% of the response variability, and the remaining factors explain very little. If you look at the loadings in dimension 1, all of the five input variables are roughly equal in importance.
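To make "percent of variability explained" and "loadings" concrete, here is a rough sketch of the first factor of a textbook single-response PLS fit (the NIPALS-style PLS1 step) in Python with NumPy. It is not claimed to reproduce PROC PLS output; the data are synthetic, only three hypothetical predictors (`tmax`, `srad`, `rain`) are used rather than the five in the attached results, and centering and scaling of all variables is an assumption of the sketch.

```python
import numpy as np

# Synthetic, made-up data for illustration only.
rng = np.random.default_rng(1)
n = 60
tmax = rng.normal(30.0, 2.0, n)
srad = 0.8 * tmax + rng.normal(0.0, 1.0, n)
rain = 500.0 - 10.0 * tmax + rng.normal(0.0, 20.0, n)
crop_yield = 2.0 + 0.01 * rain - 0.05 * tmax + rng.normal(0.0, 0.3, n)

def standardize(a):
    """Center each column to mean 0 and scale it to unit sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

names = ["tmax", "srad", "rain"]
X = standardize(np.column_stack([tmax, srad, rain]))
y = standardize(crop_yield)

# First PLS factor for a single response:
w = X.T @ y                   # weights: covariance direction between predictors and y
w /= np.linalg.norm(w)        # normalize the weights to unit length
t = X @ w                     # scores: one value of the factor per observation
p = X.T @ t / (t @ t)         # X loadings: regression of each predictor on the scores
q = (y @ t) / (t @ t)         # y loading: regression of the response on the scores

# Share of variation captured by this one factor.
pct_y = 100.0 * q**2 * (t @ t) / (y @ y)
pct_x = 100.0 * (t @ t) * (p @ p) / np.sum(X**2)

for name, weight, loading in zip(names, w, p):
    print(f"{name}: weight={weight:+.3f}, loading={loading:+.3f}")
print(f"factor 1 explains {pct_x:.1f}% of X and {pct_y:.1f}% of y")
```

In this sketch a factor is a weighted combination of all the predictors, not a stand-in for any single one of them, which is why the factors cannot be matched one-to-one with rainfall, temperature, or solar radiation.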

--

Paige Miller


Thanks PaigeMiller, your comments are highly valued. I assume that the loadings in the dimensions are my R-squared values, irrespective of the sign (+ or -).

In OLS, I was able to find the effect of one parameter (model Yield = rainfall) and of multiple parameters (model Yield = rainfall temperature), and I got the R-squared for the two models. Do you think I can try this in PLS?

Most importantly, do you think I have used the best approach for this data? If not, what would you suggest?

Thanks.


The loadings are not R-squared. They are the "importance" of each of the independent variables in that specific PLS dimension.

There is no such thing as an R-squared for each independent variable, there is only such a thing as an R-squared for the entire model.

It simply doesn't make sense to me to speak of the effect of a variable "controlling for other variables" in this case. Because the independent variables are correlated with one another, they don't have an effect that is independent of the other variables. You can't get that even if you do some tricky math that you think will get you there; the effect of a variable "controlling for other variables" does not exist in this case.
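A small numerical sketch may help show why R-squared belongs to a whole model rather than to individual variables when the predictors are correlated. The example below uses Python with NumPy and synthetic data; the variable names and all numbers are assumptions for illustration. It fits the two kinds of OLS models mentioned earlier in the thread (yield on rainfall alone, and yield on rainfall plus temperature), along with temperature alone.

```python
import numpy as np

# Synthetic, made-up data: rainfall and temperature are deliberately correlated.
rng = np.random.default_rng(2)
n = 200
tmax = rng.normal(30.0, 2.0, n)
rain = 500.0 - 10.0 * tmax + rng.normal(0.0, 20.0, n)
crop_yield = 2.0 + 0.01 * rain - 0.05 * tmax + rng.normal(0.0, 0.2, n)

def r_squared(y, *predictors):
    """R-squared of an OLS fit of y on the given predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

r2_rain = r_squared(crop_yield, rain)
r2_tmax = r_squared(crop_yield, tmax)
r2_both = r_squared(crop_yield, rain, tmax)

print(f"R-squared, rain only:     {r2_rain:.3f}")
print(f"R-squared, tmax only:     {r2_tmax:.3f}")
print(f"R-squared, rain and tmax: {r2_both:.3f}")
print(f"sum of single-variable R-squared values: {r2_rain + r2_tmax:.3f}")
```

With correlated predictors the single-variable values overlap, so they do not add up to the R-squared of the joint model, and none of them can be read as that variable's own share of the explained variation.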

I would fit a PLS model involving all of your independent variables, and if that model fits well enough, then that's what I would do. If it doesn't fit well, then you need to try additional things, depending on your data, which I don't have, so I can't really advise.

--

Paige Miller


Well, no thank you. I'm not here to do your modeling for you.

--

Paige Miller
