<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to LASSO with wide data in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697610#M33659</link>
    <description>&lt;P&gt;Well, I think it's a bad idea, but that's just my opinion. What you are describing is a "throw everything at the data, see what shows up, and try to make sense of it" approach. However, I will wager a fair amount that you (or the literature) have some expert knowledge about the variables and their relative importance. I would start there. Then, rather than regression, I would consider approaches like decision trees and variable clustering. Check out the SAS Data Mining and Machine Learning community for info on these approaches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
    <pubDate>Mon, 09 Nov 2020 13:43:59 GMT</pubDate>
    <dc:creator>SteveDenham</dc:creator>
    <dc:date>2020-11-09T13:43:59Z</dc:date>
    <item>
      <title>How to LASSO with wide data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697249#M33634</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset with ~170 binary dummy variables (clinical indicators) and close to 1,000,000 rows of data. I'm interested in modeling the 170 dummy variables as well as the second-order interactions between them.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;With n = 170, I believe there should be n*(n-1)/2 = 14,365 second-order interactions, which makes the dataset very wide (as well as long).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried generating the HPGENSELECT model call, with all variables listed, in a text file, but even with a small subset of observations (1,000), the code ran for a very long time before I finally killed it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do you have any suggestions for getting something like this to run? A valid answer is "that's a bad idea, why would you do that" &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;SAS Version: 9.04M5&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Fri, 06 Nov 2020 21:10:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697249#M33634</guid>
      <dc:creator>dschmidt</dc:creator>
      <dc:date>2020-11-06T21:10:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to LASSO with wide data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697610#M33659</link>
      <description>&lt;P&gt;Well, I think it's a bad idea, but that's just my opinion. What you are describing is a "throw everything at the data, see what shows up, and try to make sense of it" approach. However, I will wager a fair amount that you (or the literature) have some expert knowledge about the variables and their relative importance. I would start there. Then, rather than regression, I would consider approaches like decision trees and variable clustering. Check out the SAS Data Mining and Machine Learning community for info on these approaches.&lt;/P&gt;
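&lt;P&gt;For a rough sense of scale (a back-of-the-envelope sketch, not something taken from the thread's SAS code): the Python snippet below counts the two-way interaction columns for 170 indicators and estimates the size of a fully expanded dense design matrix. The function name design_matrix_size, the 8-byte cell size, and the dense-storage assumption are illustrative only; nothing here assumes how HPGENSELECT actually stores the design matrix.&lt;/P&gt;
&lt;PRE&gt;
```python
from math import comb


def design_matrix_size(n_main, n_rows, bytes_per_cell=8):
    """Return (pairwise interaction count, total columns, dense size in GiB).

    Assumption: every main effect and every two-way interaction is stored
    as its own dense numeric column of bytes_per_cell bytes per row.
    """
    n_pairs = comb(n_main, 2)    # n*(n-1)/2 distinct variable pairs
    n_cols = n_main + n_pairs    # main effects plus two-way interactions
    gib = n_rows * n_cols * bytes_per_cell / 2**30
    return n_pairs, n_cols, gib


# Figures from the original question: ~170 indicators, ~1,000,000 rows.
pairs, cols, gib = design_matrix_size(n_main=170, n_rows=1_000_000)
print(f"{pairs:,} interactions, {cols:,} columns, about {gib:,.0f} GiB dense")
```
&lt;/PRE&gt;
&lt;P&gt;Under these assumptions the full data set expands to 14,535 columns and roughly 108 GiB, and even a 1,000-row subset has far more columns than rows, which may be part of why the selection step runs so long.&lt;/P&gt;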
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 13:43:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697610#M33659</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2020-11-09T13:43:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to LASSO with wide data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697633#M33661</link>
      <description>&lt;P&gt;Thanks, appreciate the feedback! I was planning on using the lasso as an exploratory tool, but I suppose something like a decision tree might do the same job better &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 15:02:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-LASSO-with-wide-data/m-p/697633#M33661</guid>
      <dc:creator>dschmidt</dc:creator>
      <dc:date>2020-11-09T15:02:34Z</dc:date>
    </item>
  </channel>
</rss>

