Programming the statistical procedures from SAS

Proc GLMSelect and confidence intervals

Contributor
Posts: 39

Hi,

I was wondering whether I am correct that I can't produce confidence intervals in the results output when I use PROC GLMSELECT for a regression. I know it is standard practice to post a question with data and code, but I don't really know where to start with this one. The only workaround I have thought of is to use PROC GLMSELECT to grab the selected variable names and output them to a data set, then build a macro variable that feeds into a PROC GLM step with the CLM option specified on the MODEL statement.

Any help would be great, thanks.

Accepted Solutions
Solution
‎01-26-2017 08:26 AM
SAS Super FREQ
Posts: 3,547

Re: Proc GLMSelect and confidence intervals

As stated in the documentation, "PROC GLMSELECT provides results (displayed tables, output data sets, and macro variables) that make it easy to take the selected model and explore it in more detail in a subsequent procedure such as REG or GLM." In other words, to get inferential statistics and hypothesis tests, you should select a model and then refit it in a procedure such as PROC GLM.

You don't need to create your own macro variables because PROC GLMSELECT already provides one.

See the section "Macro variables containing selected models" for an example of using macro variables for post-selection analysis.  In particular, after you run PROC GLMSELECT, you can use the _GLSIND macro variable to run the selected model in PROC GLM, as shown in this example:

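/* Step 1: select a model; the selected effects are stored in &_GLSIND */
proc glmselect data=sashelp.cars;
   class origin;
   model mpg_city = MSRP|wheelbase|weight|cylinders|origin @2;
run;

%put &=_GLSIND;   /* write the selected effects to the log */

/* Step 2: refit the selected model to get confidence limits for the parameters */
proc glm data=sashelp.cars;
   class origin;
   model mpg_city = &_GLSIND / solution CLPARM; /* or CLB for PROC REG */
run;
quit;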
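
The comment about CLB can be made concrete with a minimal sketch. It assumes a candidate list of continuous variables only (the list below is chosen for illustration and is not from the original example), because PROC REG has no CLASS statement and cannot refit a model that contains classification effects:

```sas
/* Assumption: continuous candidates only, so that PROC REG can refit the model */
proc glmselect data=sashelp.cars;
   model mpg_city = MSRP wheelbase weight cylinders horsepower
         / selection=stepwise;
run;

/* CLB requests confidence limits for the parameter estimates */
proc reg data=sashelp.cars;
   model mpg_city = &_GLSIND / clb;
run;
quit;
```

If the selected model contains CLASS effects or interactions, stay with PROC GLM and CLPARM as in the example above.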

All Replies
Contributor
Posts: 39

Re: Proc GLMSelect and confidence intervals

Great, thanks. This is more or less what I was looking for.
Contributor
Posts: 69

Re: Proc GLMSelect and confidence intervals

We use GLMSELECT along with GLM in a three-step process for quick budget planning across a couple hundred different transaction types:

1) We use GLMSELECT so that SAS can take our list of potential independent variables and do some of the model specification for us (we use a holdout sample, and the error against the holdout sample is the selection criterion). We then put these specifications in a spreadsheet by writing the listing output to a text file, and an Excel macro parses it to build a table with the specification for each transaction type (e.g., the CLASS and MODEL statements).

2) We use GLMSELECT again, pulling in the spreadsheet and looping through each specification so that we get all of our holdout sample results in one place (we produce charts and some tables showing the accuracy against the out-of-sample data).

3) Finally, we use GLM. We again pull in the model specification from the spreadsheet, as in step 2. This time, however, we use all of our data (up to today) rather than setting aside a holdout sample, and we are able to generate the confidence interval bounds along with the final forecast.
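
A rough sketch of steps 1 and 3 for a single model, using sashelp.cars as a hypothetical stand-in for one transaction type. The 30% validation fraction, the seed, the candidate effects, and the output data set and variable names are all assumptions made for illustration; CHOOSE=VALIDATE is one way to let holdout error pick the model and may differ from the criterion described above:

```sas
/* Step 1 (sketch): let validation error on a random holdout choose the model */
proc glmselect data=sashelp.cars seed=1;
   class origin;
   partition fraction(validate=0.3);   /* hold out about 30% of the rows */
   model mpg_city = MSRP wheelbase weight cylinders origin
         / selection=stepwise(choose=validate);
run;

/* Step 3 (sketch): refit on all rows and save predictions with their limits.
   LCLM/UCLM bound the mean prediction; LCL/UCL bound an individual prediction. */
proc glm data=sashelp.cars;
   class origin;
   model mpg_city = &_GLSIND / solution clparm;
   output out=work.forecast p=pred
          lclm=lower_mean uclm=upper_mean
          lcl=lower_ind   ucl=upper_ind;
run;
quit;
```

The OUTPUT statement writes one row per input observation, so the bounds can be charted or exported alongside the forecast.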

SAS Super FREQ
Posts: 3,547

Re: Proc GLMSelect and confidence intervals

@cau83: I may not fully understand the complexity of your process, but I wonder whether any of your steps could be simplified by using the STORE statement in PROC GLMSELECT? PROC PLM produces a 'Store Information' table that contains the class variable(s) and model effects:

data one(drop=i j);
   array x{5} x1-x5;
   do i=1 to 1000;
      classVar = mod(i,4)+1;  
      do j=1 to 5;
         x{j} = ranuni(1);
      end;   
      y     = 3*classVar+7*x2+5*x2*x5+rannor(1);
      output;
   end;
run;

proc glmselect data=one;
   class  classVar;
   model  y = classVar x1|x2|x3|x4|x5 @2 /
              selection=stepwise(stop=aicc);
   store out=glmselectStore;  /* STORE the model */
run;

proc plm restore=glmselectStore;
   show ClassLevels Parameters;
   score data=one(obs=3)  /* use PROC PLM to score the model */
         out=Score;
run;
Contributor
Posts: 69

Re: Proc GLMSelect and confidence intervals

Thanks Rick. I was generally aware that this kind of functionality exists, and perhaps it's something to explore as we prepare this year. That being said, we do not always use the GLMSELECT specifications without manual changes. For instance, certain parameters (like the number of business days in a week, 4 or 5) should only ever have one sign, and we would remove such a parameter if its sign is wrong, even if removing it increases the holdout sample error. Using GLMSELECT together with the listing output and Excel helps to remove friction from a process that we want to be fast, because it serves as the starting point for management rather than the end point (and we spend a lot more time on the business side of things after it's done).

Your original answer may be an easier way to generate the list than using the listing output and Excel. Do the automatic macro variables work with a BY statement? Or are they at least separated by a delimiter so that we could match them back up?

SAS Super FREQ
Posts: 3,547

Re: Proc GLMSelect and confidence intervals

"Do the automatic macro variables work with a BY statement?"

Yes, and there is an example in the doc.
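
A minimal sketch of the BY-group case, based on my reading of the "Macro variables containing selected models" section: with a BY statement, PROC GLMSELECT creates one indexed macro variable per BY group (_GLSIND1, _GLSIND2, ...) and stores the number of groups in _GLSNUMBYS. The data set, the BY variable, the candidate variables, and the macro name show_selected are chosen here for illustration only:

```sas
/* BY processing needs sorted input */
proc sort data=sashelp.cars out=work.cars;
   by origin;
run;

proc glmselect data=work.cars;
   by origin;
   model mpg_city = MSRP wheelbase weight cylinders horsepower;
run;

/* Print the selected effects for each BY group to the log */
%macro show_selected;
   %local i;
   %do i = 1 %to &_GLSNUMBYS;
      %put BY group &i: &&_GLSIND&i;
   %end;
%mend show_selected;
%show_selected
```

The index follows the order of the BY groups in the sorted data, which is how the selected models can be matched back to each group.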
