Proc GLMSelect and confidence intervals

01-23-2017 10:01 AM

Hi,

Am I correct that PROC GLMSELECT cannot produce confidence intervals in the results output when I use it for a regression? I know it is customary to post a question with data and code, but I don't really know where to start with this one. The only workaround I have thought of is to use PROC GLMSELECT to capture the selected variable names in a data set, then build a macro variable from them and pass it to PROC GLM with the CLM option on the MODEL statement.

Any help would be great thanks.

Accepted Solutions

Solution

01-26-2017
08:26 AM

01-23-2017 10:43 AM

As stated in the documentation, "PROC GLMSELECT provides results (displayed tables, output data sets, and macro variables) that make it easy to take the selected model and explore it in more detail in a subsequent procedure such as REG or GLM." To get inferential statistics and hypothesis tests, you should therefore select a model with PROC GLMSELECT and then fit it with a procedure such as PROC GLM.

You don't need to create your own macro variables because PROC GLMSELECT already provides one.

See the section "Macro variables containing selected models" for an example of using macro variables for post-selection analysis. In particular, after you run PROC GLMSELECT, you can use the _GLSIND macro variable to run the selected model in PROC GLM, as shown in this example:

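```
/* Step 1: select a model; PROC GLMSELECT stores the selected effects in _GLSIND */
proc glmselect data=sashelp.cars;
   class origin;
   model mpg_city = MSRP|wheelbase|weight|cylinders|origin @2;
run;

%put &=_GLSIND;   /* write the selected effects to the log */

/* Step 2: refit the selected model in PROC GLM to get confidence limits */
proc glm data=sashelp.cars;
   class origin;
   model mpg_city = &_GLSIND / solution CLPARM;   /* or CLB for PROC REG */
run;
quit;
```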
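The original question mentioned the CLM option, which gives confidence limits for the mean predicted value of each observation rather than for the parameter estimates. A minimal sketch of both variants follows, assuming the same Sashelp.Cars example. The output data set name `pred` and the variable names `predicted`, `lower`, and `upper` are made up for illustration. Because PROC REG has no CLASS statement, the PROC REG part assumes a candidate list of continuous main effects only.

```
/* Select a model; _GLSIND holds the selected effects */
proc glmselect data=sashelp.cars;
   class origin;
   model mpg_city = MSRP|wheelbase|weight|cylinders|origin @2;
run;

/* CLPARM: confidence limits for the parameter estimates.            */
/* LCLM=/UCLM= on the OUTPUT statement: limits for the mean          */
/* prediction of each observation (what CLM prints in the listing).  */
proc glm data=sashelp.cars;
   class origin;
   model mpg_city = &_GLSIND / solution clparm;
   output out=pred p=predicted lclm=lower uclm=upper;  /* hypothetical names */
run;
quit;

/* PROC REG variant: continuous candidate effects only (assumption), */
/* so that _GLSIND is a plain variable list that PROC REG accepts.   */
proc glmselect data=sashelp.cars;
   model mpg_city = MSRP wheelbase weight cylinders;
run;

proc reg data=sashelp.cars;
   model mpg_city = &_GLSIND / clb;   /* CLB: limits for the estimates */
run;
quit;
```

Writing the limits to a data set with the OUTPUT statement avoids printing one row per observation in the results window, which is what CLM on the MODEL statement would do.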

All Replies

01-26-2017 08:26 AM

Great, thanks. This is more or less what I was looking for.

01-25-2017 04:54 PM

We use GLMSelect along with GLM in a 3 step process where we want to quickly do some budget planning across a couple hundred different transaction types:

1) GLMSelect is used so that SAS can take our list of potential independent variables and do some model specification for us (we use a holdout sample, and the error against the holdout sample is the selection criterion). We then put these specifications in a spreadsheet by writing the listing output to a text file, and an Excel macro parses this to build a table with the specification for each transaction (e.g., the CLASS and MODEL statements).

2) We use GLMSelect again, pulling in the spreadsheet and looping through each specification so that we get all of our holdout sample results in one place (we produce charts and some tables with the accuracy against the out-of-sample data).

3) Finally, we use GLM. We again pull in the model specification from the spreadsheet, as in step 2. This time, however, we use all of our data (up to today) rather than keeping a holdout sample, and we are able to generate the confidence interval bounds along with the final forecast.
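The holdout-based selection in step 1 and the full-data refit in step 3 can be sketched without the spreadsheet round trip. This is not the poster's actual pipeline: it assumes the Sashelp.Cars example from the accepted solution, a random 30% holdout created by the PARTITION statement, an arbitrary seed, and hypothetical output names (`forecast`, `predicted`, `lower`, `upper`).

```
/* Step 1 (sketch): hold out 30% of the rows and choose the model   */
/* with the smallest average squared error on the validation rows.  */
proc glmselect data=sashelp.cars seed=12345;   /* seed is arbitrary */
   class origin;
   model mpg_city = MSRP|wheelbase|weight|cylinders|origin @2
         / selection=stepwise(choose=validate);
   partition fraction(validate=0.3);
run;

%put &=_GLSIND;   /* selected effects, reused below */

/* Step 3 (sketch): refit the selected model on all of the data and */
/* save predictions with limits. LCL=/UCL= are limits for an        */
/* individual prediction; use LCLM=/UCLM= for the mean instead.     */
proc glm data=sashelp.cars;
   class origin;
   model mpg_city = &_GLSIND / solution clparm;
   output out=forecast p=predicted lcl=lower ucl=upper;
run;
quit;
```

Manual adjustments to a specification, such as dropping an effect whose sign is implausible, would still have to be applied to the effect list before the PROC GLM step.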

01-25-2017 05:16 PM

@cau83 : I may not fully understand the complexity of your process, but I wonder whether any of your steps could be simplified by using the STORE statement in PROC GLMSELECT? PROC PLM produces a 'Store Information' table that contains the class variable(s) and model effects:

```
data one(drop=i j);
   array x{5} x1-x5;
   do i=1 to 1000;
      classVar = mod(i,4)+1;
      do j=1 to 5;
         x{j} = ranuni(1);
      end;
      y = 3*classVar+7*x2+5*x2*x5+rannor(1);
      output;
   end;
run;

proc glmselect data=one;
   class classVar;
   model y = classVar x1|x2|x3|x4|x5 @2 /
         selection=stepwise(stop=aicc);
   store out=glmselectStore;   /* STORE the model */
run;

proc plm restore=glmselectStore;
   show ClassLevels Parameters;
   score data=one(obs=3)       /* use PROC PLM to score the model */
         out=Score;
run;
```

01-26-2017 08:24 AM - edited 01-26-2017 08:29 AM

Thanks, Rick. I was generally aware that this kind of functionality exists, and perhaps it's something to explore as we prepare this year. That said, we do not always use the GLMSELECT specifications without manual changes. For instance, certain parameters (such as the number of business days in a week, 4 or 5) should only ever be positive or only ever be negative, and we would remove such a parameter if its sign is wrong, even if that increases the holdout sample error. Using GLMSELECT together with the listing output and Excel helps remove friction from a process that we want to be fast, because it serves as the starting point for management rather than the end point (and we spend a lot more time on the business side of things after it's done).

Your original answer may be an easier way to generate the list than using the listing output and Excel. Do the automatic macro variables work with a BY statement? Or are they at least separated by a delimiter so that we could match them back up?

01-26-2017 09:43 AM

> do the automatic macro variables work with a BY statement?

Yes, and there is an example in the doc.
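A rough sketch of the BY-group case follows. It assumes, as I recall from the documentation section cited earlier, that PROC GLMSELECT sets `_GLSNUMBYS` to the number of BY groups and creates one indexed macro variable per group (`_GLSIND1`, `_GLSIND2`, ...); check the names against the documentation for your release. The work data set `cars`, the macro `fit_by_group`, and the `org1`, `org2`, ... macro variables are hypothetical names introduced here.

```
/* BY processing requires sorted input */
proc sort data=sashelp.cars out=cars;
   by origin;
run;

/* One selection per BY group; results are indexed by BY-group number */
proc glmselect data=cars;
   by origin;
   model mpg_city = MSRP|wheelbase|weight|cylinders @2;
run;

/* Map the BY-group number back to its Origin value (same sort order) */
proc sql noprint;
   select distinct origin into :org1- from cars order by origin;
quit;

/* Refit each group's selected model in PROC GLM with CLPARM */
%macro fit_by_group;
   %do i = 1 %to &_GLSNUMBYS;
      title "BY group &i (Origin=&&org&i): &&_GLSIND&i";
      proc glm data=cars(where=(origin="&&org&i"));
         model mpg_city = &&_GLSIND&i / solution clparm;
      run;
      quit;
   %end;
   title;
%mend fit_by_group;

%fit_by_group
```

The double ampersand in `&&_GLSIND&i` delays resolution so that `&i` is substituted first, which is how the group index is matched back to its selected model.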