<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Whittle down possible predictor variables in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/744872#M36247</link>
    <description>Agreed, &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;.  &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;'s blog posts are very informative.  The rub is always that the problem at hand is a bit different from the example given.  Or there's an option that who-the-hell knows what to do with.&lt;BR /&gt;&lt;BR /&gt;If only the world were cookie-cutter clear....</description>
    <pubDate>Tue, 01 Jun 2021 05:36:09 GMT</pubDate>
    <dc:creator>NKormanik</dc:creator>
    <dc:date>2021-06-01T05:36:09Z</dc:date>
    <item>
      <title>Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743448#M36177</link>
      <description>&lt;P&gt;1000 possible predictor variables; one target variable -- PROFIT.&lt;/P&gt;
&lt;P&gt;Looking for a way to whittle down the possible predictor variables.&lt;/P&gt;
&lt;P&gt;Ideally I'd like to end up with the single best one, or a handful.&lt;/P&gt;
&lt;P&gt;Last year someone suggested using PROC HPSPLIT. The results obtained seemed inconclusive. There are many, many parameters to guess at in that procedure, and I might have gotten most of my guesses wrong.&lt;/P&gt;
&lt;P&gt;Lately I've been coming across lots of mentions of XGBoost -- a new whiz kid on the algorithms block.&lt;/P&gt;
&lt;P&gt;Would you all recommend using that?&lt;/P&gt;
&lt;P&gt;Since XGBoost is not natively included in SAS 9.4, however, a built-in procedure would be preferred.&lt;/P&gt;
&lt;P&gt;Any thoughts greatly appreciated.&lt;/P&gt;
&lt;P&gt;Nicholas Kormanik&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 00:37:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743448#M36177</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2021-05-25T00:37:00Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743449#M36178</link>
      <description>Have you already explored the "statistical basics", such as principal components, variable clustering, or partial least squares regression?&lt;BR /&gt;&lt;BR /&gt;AFAIK XGBoost is a predictive algorithm (one that has given good results IME, along with LightGBM), not a variable selection methodology.</description>
      <pubDate>Tue, 25 May 2021 00:43:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743449#M36178</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-05-25T00:43:44Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743529#M36182</link>
      <description>As Reeza said, XGBoost and decision trees are predictive models, and they are not well suited to your data because your dependent variable, PROFIT, is continuous rather than categorical.&lt;BR /&gt;Try PROC PLS or PROC HPGENSELECT:&lt;BR /&gt;&lt;BR /&gt;ods output VariableImportancePlot=VariableImportancePlot;&lt;BR /&gt;proc pls data=sashelp.class missing=em nfac=3 plots=(ParmProfiles VIP) details; * cv=split cvtest(seed=12345);&lt;BR /&gt; class sex;&lt;BR /&gt; model age=weight height sex;&lt;BR /&gt; output out=x predicted=p;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;proc hpgenselect data=have;&lt;BR /&gt; class birth_province sex shop_province;&lt;BR /&gt; model profit = ..............</description>
      <pubDate>Tue, 25 May 2021 12:20:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743529#M36182</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2021-05-25T12:20:59Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743530#M36183</link>
      <description>&lt;P&gt;I strongly recommend PROC PLS, which does not require you to "whittle down" the number of predictor variables. In &lt;A href="https://support.sas.com/rnd/app/stat/papers/pls.pdf" target="_self"&gt;this paper&lt;/A&gt;, the author takes 1000 predictor variables, many of which are highly correlated with one another, and creates a useful predictive model without a variable selection step. Please note: the syntax for PROC PLS has changed since that paper was written.&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 12:45:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743530#M36183</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-05-25T12:45:56Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743531#M36184</link>
      <description>&lt;P&gt;You might want to listen in on &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/253176"&gt;@sasmlp&lt;/a&gt;'s webinar (see the announcement in this community), where high-dimensional variable selection in SAS will be covered (it will likely emphasize HPGENSELECT).&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 12:25:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743531#M36184</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2021-05-25T12:25:54Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743545#M36186</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I strongly recommend PROC PLS, which does not require you to "whittle down" the number of predictor variables. In &lt;A href="https://support.sas.com/rnd/app/stat/papers/pls.pdf" target="_self"&gt;this paper&lt;/A&gt;, the author takes 1000 predictor variables, many of which are highly correlated with one another, and creates a useful predictive model without a variable selection step. Please note: the syntax for PROC PLS has changed since that paper was written.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Adding to the above: PLS is surprisingly robust against multicollinearity among the X variables, which is what enables the author to skip the variable selection step. PLS also deserves more attention and more use. Although there are probably a thousand published papers by now in which PLS has been used successfully, it is not widely known among data practitioners, and it should be!&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 13:19:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743545#M36186</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-05-25T13:19:19Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743690#M36189</link>
      <description>&lt;P&gt;I greatly appreciate all your suggestions and insights.&lt;/P&gt;
&lt;P&gt;Seemingly a straightforward problem. Yet if the proper tool for the job is not known, one can be stumped forever.&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 20:14:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743690#M36189</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2021-05-25T20:14:44Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743824#M36190</link>
      <description>So learn statistics and probability theory to know the proper tool for the job.&lt;BR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;'s blog is a good place to start, as is the SAS documentation.</description>
      <pubDate>Wed, 26 May 2021 11:53:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/743824#M36190</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2021-05-26T11:53:35Z</dc:date>
    </item>
    <item>
      <title>Re: Whittle down possible predictor variables</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/744872#M36247</link>
      <description>Agreed, &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;.  &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;'s blog posts are very informative.  The rub is always that the problem at hand is a bit different from the example given.  Or there's an option that who-the-hell knows what to do with.&lt;BR /&gt;&lt;BR /&gt;If only the world were cookie-cutter clear....</description>
      <pubDate>Tue, 01 Jun 2021 05:36:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Whittle-down-possible-predictor-variables/m-p/744872#M36247</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2021-06-01T05:36:09Z</dc:date>
    </item>
  </channel>
</rss>

