I am trying to run a quadratic regression in SAS Studio. I specify in the Model tab that I want a polynomial of degree 2. The output gives no parameter estimates for the squared term and the test of the model has only 1 degree of freedom.
If I specify a polynomial of degree 3, I get parameter estimates for the first and third degree terms but not for the second degree term.
If I use data transformation to create a squared variable, I can get a parameter estimate for the squared term in the regression.
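The data-transformation workaround described above might look roughly like the following. This is only a sketch: the data set name sasuser.co2 and the variable names year and co2 are assumptions borrowed from code that appears later in this thread, and work.co2_sq and year_sq are hypothetical names chosen for illustration.

```sas
/* Sketch only: sasuser.co2, year, and co2 are assumed names;      */
/* work.co2_sq and year_sq are hypothetical names for illustration. */
data work.co2_sq;
   set sasuser.co2;
   year_sq = year**2;   /* explicit squared term built in a DATA step */
run;

/* Quadratic regression using the precomputed squared variable */
proc reg data=work.co2_sq;
   model co2 = year year_sq;
run;
quit;
```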
Assume that the software is correct, and plot your data to see why it is correct.
Thanks for the response, but I think there is something wrong with the program. The data are atmospheric CO2 levels measured at Mauna Loa each December from 1958 through 2018 inclusive. If I do a simple linear regression of CO2 concentration versus year, there is an obvious lack of fit.
If I use data transformation to create a new variable that is the square of year and I enter both in a regression on CO2 concentrations there is a significant improvement in fit.
I get the following parameter estimates and test of model:
However, if I start only with Year and ask SAS to do a polynomial regression of degree 2, I get the following parameter estimates and test of model:
One could argue that the program simply did not include the years-squared term into the model, but the following plot suggests otherwise as there is improved fit when compared to the model with years alone.
I suspect that there is faulty communication between proc glmselect and proc reg in SAS Studio.
Is there a way to refer this problem to SAS technical support?
My students request an answer.
Thanks.
My suggestion was to plot your data to see why SAS is producing correct results.
Post the plot. I don't have your data, and I have retired, so I don't have SAS at the moment. However, I can say with a great deal of confidence, after almost 35 years as a SAS/STAT developer, including writing one of the regression procedures, that SAS does not get problems like this wrong. Users, on the other hand, often do not understand least squares, floating point arithmetic, collinearity, sequential sweeps, sweeps with rational pivoting, type two tests, and so on. If I were a betting man, I would bet that SAS has it right, and the problems lie with your expectations. If you want to contact technical support, you can certainly do that, but SAS Communities questions are not routed to them.
You may already know this, but now that you are retired you can download and use SAS University Edition for non-commercial purposes (https://www.sas.com/en_us/software/university-edition/download-software.html). It includes SAS/STAT 15.1.
I hope you are enjoying your well-deserved retirement!
Best,
-Brian
not done it,
Is the data set you are using publicly available, or would you be willing to share it? If so, I can take a closer look at it to see if I can figure out what is going on.
Thanks,
-Brian
For the sake of closure, I post the response I received from SAS technical support:
William,
As I suspected, there is what we call "near" singularity in the data and model, which tends to happen with polynomial models. The condition number of the X`X matrix is huge, as shown by the following PROC IML program:
proc iml;
use sasuser.co2;
read all var {year};
x = j(nrow(year),1) || year || year##2;
print (max(eigval(x`*x))/min(eigval(x`*x)));
quit;
3.1771E21
In this case the GLM and REG algorithms handle this particular near singularity better than GLMSELECT. The general recommendation is not to use raw Year values in polynomial models; center and scale the values of year first. For example, I center the polynomial term and now I reproduce the same results as REG and GLM.
proc means data=sasuser.co2;
var year;
run;
data new; set sasuser.co2;
centeryr = year - 1988;
run;
proc glmselect data=new;
model co2 = year centeryr*centeryr / selection=none;
quit;
I hope the above information is helpful to you. Please let me know if you have further questions on this particular matter.
Thank you for using SAS and for your patience in my reply.
Kathleen Kiernan
Senior Principal Technical Support Statistician
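As a rough way to check the centering-and-scaling recommendation in the reply above, the PROC IML program from that reply can be extended to compare the eigenvalue ratio of X`X for raw and standardized year values. This is a minimal sketch, assuming the same sasuser.co2 data set and year variable; the names z, x_raw, x_std, cond_raw, and cond_std are hypothetical and introduced only for illustration. One would expect the standardized ratio to be many orders of magnitude smaller than the raw one.

```sas
proc iml;
   use sasuser.co2;               /* assumed data set, as in the reply above */
   read all var {year};
   close sasuser.co2;

   /* center at the mean and scale by the standard deviation */
   z = (year - mean(year)) / std(year);

   /* design matrices: intercept, linear term, squared term */
   x_raw = j(nrow(year), 1) || year || year##2;
   x_std = j(nrow(z), 1)    || z    || z##2;

   /* ratio of largest to smallest eigenvalue of X`X */
   ev_raw = eigval(x_raw` * x_raw);
   ev_std = eigval(x_std` * x_std);
   cond_raw = max(ev_raw) / min(ev_raw);
   cond_std = max(ev_std) / min(ev_std);

   print cond_raw cond_std;
quit;
```

Because the raw matrix is so close to singular, its smallest eigenvalue is itself subject to rounding error, so cond_raw is best read as an order-of-magnitude indication rather than an exact value.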