<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC GLMSELECT Model option when dataset has a large number of variables in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436442#M23035</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/82850"&gt;@OskarE&lt;/a&gt;&amp;nbsp;has written an article on this very topic in the post &lt;A href="https://communities.sas.com/t5/SAS-Nordic-Users-Group/Juletip-15-Automatic-modeling-with-thousands-millions-of-inputs/gpm-p/422986#M152" target="_self"&gt;Automatic modeling with thousands/millions of inputs but only a few lines of code!&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can use the PROC CONTENTS approach as in the post, or retrieve the variables from dictionary.columns like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
	select name into :GLMVars separated by ' ' from dictionary.columns
	where libname="SASHELP" and memname="BASEBALL" and upcase(name) not contains "SALARY";
	select name into :ClassVars separated by ' ' from dictionary.columns
	where libname="SASHELP" and memname="BASEBALL" and upcase(type)="CHAR" and upcase(name) not contains "SALARY";
quit;

%put &amp;amp;GLMVars.;
%put &amp;amp;ClassVars.;

proc glmselect data=sashelp.baseball;
	class &amp;amp;ClassVars.;
	model salary= &amp;amp;GLMVars. / selection=lasso;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 12 Feb 2018 21:00:24 GMT</pubDate>
    <dc:creator>PeterClemmensen</dc:creator>
    <dc:date>2018-02-12T21:00:24Z</dc:date>
    <item>
      <title>PROC GLMSELECT Model option when dataset has a large number of variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436431#M23034</link>
      <description>&lt;P&gt;I have a dataset with a very large number of variables, and I am trying to use PROC GLMSELECT with the LASSO option to select the most important variables. However, I am having trouble specifying the model. I do not want to type out the names of all 1000 variables. Is there a more compact way to write the model, something like the program below?&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC GLMSELECT DATA=PCData PLOTS=COEFFICIENTS;
MODEL y=(All Vars in Dataset)/ SELECTION=LASSO;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I am using SAS 9.4.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Feb 2018 19:34:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436431#M23034</guid>
      <dc:creator>Babinetos</dc:creator>
      <dc:date>2018-02-12T19:34:40Z</dc:date>
    </item>
    <item>
      <title>Re: PROC GLMSELECT Model option when dataset has a large number of variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436442#M23035</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/82850"&gt;@OskarE&lt;/a&gt;&amp;nbsp;has written an article on this very topic in the post &lt;A href="https://communities.sas.com/t5/SAS-Nordic-Users-Group/Juletip-15-Automatic-modeling-with-thousands-millions-of-inputs/gpm-p/422986#M152" target="_self"&gt;Automatic modeling with thousands/millions of inputs but only a few lines of code!&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can use the PROC CONTENTS approach as in the post, or retrieve the variables from dictionary.columns like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
	select name into :GLMVars separated by ' ' from dictionary.columns
	where libname="SASHELP" and memname="BASEBALL" and upcase(name) not contains "SALARY";
	select name into :ClassVars separated by ' ' from dictionary.columns
	where libname="SASHELP" and memname="BASEBALL" and upcase(type)="CHAR" and upcase(name) not contains "SALARY";
quit;

%put &amp;amp;GLMVars.;
%put &amp;amp;ClassVars.;

proc glmselect data=sashelp.baseball;
	class &amp;amp;ClassVars.;
	model salary= &amp;amp;GLMVars. / selection=lasso;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 12 Feb 2018 21:00:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436442#M23035</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2018-02-12T21:00:24Z</dc:date>
    </item>
    <item>
      <title>Re: PROC GLMSELECT Model option when dataset has a large number of variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436443#M23036</link>
      <description>&lt;P&gt;The usual&amp;nbsp;way to do this is to use the _NUMERIC_ keyword, which means "use all numeric variables":&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;model y =&amp;nbsp;_NUMERIC_ / selection=lasso;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Unfortunately, this won't work here because _NUMERIC_ includes the response variable (Y), and of course that variable explains all the variation so the procedure will select Y and stop!&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If all the variables begin with the same letter (such as X1, X2, X3, ...), you can use a colon as a wildcard:&lt;/P&gt;
&lt;P&gt;model y = x: / selection=lasso;&lt;/P&gt;
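&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a minimal sketch of the wildcard approach, assuming a simulated data set (the name Sim, the seed, and the coefficients are made up for illustration), the following creates 20 predictors named x1-x20 and lets the x: prefix list pick them all up. Keep in mind that the prefix matches every variable whose name begins with x, so it only suits data where no unwanted variable shares that prefix:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical example data: 200 rows, predictors x1-x20, response y */
data Sim;
   call streaminit(12345);          /* arbitrary seed for reproducibility */
   array x[20];                     /* creates numeric variables x1-x20 */
   do i = 1 to 200;
      do j = 1 to 20;
         x[j] = rand("Normal");
      end;
      y = 2*x1 - 3*x5 + rand("Normal");   /* only x1 and x5 affect y */
      output;
   end;
   drop i j;
run;

/* x: expands to every variable whose name starts with x; y is not matched */
proc glmselect data=Sim;
   model y = x: / selection=lasso;
run;&lt;/CODE&gt;&lt;/PRE&gt;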
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Otherwise, I suggest you select all numeric variables that are NOT the response into a macro variable. You can use PROC CONTENTS to get the variables and PROC SQL to create the macro variable, as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc contents data=sashelp.cars(drop=_CHARACTER_ mpg_city) /* Y */
     out=varnames(keep = varnum name) noprint;
run;

proc sql noprint;
   select name into :XVars separated by ' '
   from varnames;
quit; 

%put &amp;amp;=XVars;   /* writes XVARS=value to the log */

proc glmselect data=sashelp.cars;
model mpg_city= &amp;amp;XVars / selection=lasso;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 12 Feb 2018 20:39:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436443#M23036</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-02-12T20:39:17Z</dc:date>
    </item>
    <item>
      <title>Re: PROC GLMSELECT Model option when dataset has a large number of variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436531#M23044</link>
      <description>Thanks for the answer! I liked the wildcard option as well!</description>
      <pubDate>Tue, 13 Feb 2018 02:09:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436531#M23044</guid>
      <dc:creator>Babinetos</dc:creator>
      <dc:date>2018-02-13T02:09:51Z</dc:date>
    </item>
    <item>
      <title>Re: PROC GLMSELECT Model option when dataset has a large number of variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436532#M23045</link>
      <description>Thanks for the link as well! It was very useful!</description>
      <pubDate>Tue, 13 Feb 2018 02:09:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-GLMSELECT-Model-option-when-dataset-has-a-large-number-of/m-p/436532#M23045</guid>
      <dc:creator>Babinetos</dc:creator>
      <dc:date>2018-02-13T02:09:54Z</dc:date>
    </item>
  </channel>
</rss>

