<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to deal with missing values for categorical variables? in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932768#M46511</link>
    <description>Yes, you should check the missing-data patterns first and then try to impute the missing values.&lt;BR /&gt;See the blog posts by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;:&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2017/11/29/visualize-patterns-missing-values.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2017/11/29/visualize-patterns-missing-values.html&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2016/04/20/visualize-missing-data-sas.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2016/04/20/visualize-missing-data-sas.html&lt;/A&gt;</description>
    <pubDate>Tue, 18 Jun 2024 02:15:40 GMT</pubDate>
    <dc:creator>Ksharp</dc:creator>
    <dc:date>2024-06-18T02:15:40Z</dc:date>
    <item>
      <title>How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932744#M46503</link>
      <description>&lt;P&gt;I have about 20 categorical variables (X1 to X20) and one binary outcome variable (Y). First, I used chi-square tests to check the association between each of X1 to X20 and Y, one at a time. Then I used PROC LOGISTIC with stepwise selection to pick the important predictors and examine the effect of X1-X20 on Y in a multivariable model. However, when I ran the logistic regression, SAS reported that there were no valid observations. Because of the large number of missing values, no observation is complete once X1-X20 are all in the model. The missing rates of X1-X20 range from 20% to 94%.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe I should try to impute the missing values using PROC MI, but all the examples I can find are for continuous variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do you think I should just stop at the one-by-one chi-square tests? Or should I check the missing patterns first and try to impute the missing values? If so, does anyone know how to examine the missing patterns and impute categorical variables? Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jun 2024 20:38:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932744#M46503</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-17T20:38:45Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932747#M46504</link>
      <description>I think most statisticians would question the usefulness of a predictor that has 94% missing values.&lt;BR /&gt;That being said, if the goal is to use model selection, then multiple imputation isn’t really an option. Because it creates multiple imputed data sets, you will probably get different models selected for some of the data sets. This means you will not be able to combine the results into a single set of parameter estimates.&lt;BR /&gt;If you want to impute categorical variables, you can use the DISCRIM or LOGISTIC method on either the FCS or MONOTONE statement in PROC MI. This section of the documentation might be helpful:&lt;BR /&gt;&lt;A href="https://documentation.sas.com/doc/en/statug/15.2/statug_mi_details05.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/statug/15.2/statug_mi_details05.htm&lt;/A&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 17 Jun 2024 20:47:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932747#M46504</guid>
      <dc:creator>SAS_Rob</dc:creator>
      <dc:date>2024-06-17T20:47:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932748#M46505</link>
      <description>&lt;P&gt;What exactly do your X1 through X20 represent?&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jun 2024 20:50:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932748#M46505</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-06-17T20:50:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932749#M46506</link>
      <description>I would combine the approaches here. &lt;BR /&gt;Drop rows that are more than 80% missing and drop columns that are more than 80% missing, and see where that leaves you. &lt;BR /&gt;&lt;BR /&gt;I suspect you'll need to do this both ways. &lt;BR /&gt;&lt;BR /&gt;Also, examine why so many values are missing: are they missing at random or systematically? &lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2016/04/18/patterns-of-missing-data-in-sas.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2016/04/18/patterns-of-missing-data-in-sas.html&lt;/A&gt;</description>
      <pubDate>Mon, 17 Jun 2024 20:52:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932749#M46506</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2024-06-17T20:52:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932753#M46507</link>
      <description>&lt;P&gt;Hi Reeza, thank you for the reply. So I don't need to stop at the chi-square step: I can drop the rows and columns that are more than 80% missing and fit the logistic regression on the rest of the data, right?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I had actually found the link you posted, but when I tried the PROC MI code from it, SAS required a CLASS statement and an FCS or MONOTONE statement. In that case, can I still examine the missing-data patterns?&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jun 2024 22:01:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932753#M46507</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-17T22:01:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932754#M46508</link>
      <description>Thank you for the reply. All of those variables are survey items with either yes/no responses or five response categories.</description>
      <pubDate>Mon, 17 Jun 2024 22:02:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932754#M46508</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-17T22:02:28Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932755#M46509</link>
      <description>Thank you for the reply! There are many predictors, though. Do you think I should stop at the chi-square tests, or should I put all the predictors in the model and use the method that you suggest?</description>
      <pubDate>Mon, 17 Jun 2024 22:06:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932755#M46509</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-17T22:06:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932768#M46511</link>
      <description>Yes, you should check the missing-data patterns first and then try to impute the missing values.&lt;BR /&gt;See the blog posts by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;:&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2017/11/29/visualize-patterns-missing-values.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2017/11/29/visualize-patterns-missing-values.html&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2016/04/20/visualize-missing-data-sas.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2016/04/20/visualize-missing-data-sas.html&lt;/A&gt;</description>
      <pubDate>Tue, 18 Jun 2024 02:15:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932768#M46511</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-06-18T02:15:40Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932776#M46512</link>
      <description>&lt;P&gt;You might try to use a decision tree instead of logistic regression. Logistic regression drops the entire observation if any variable is missing. A decision tree doesn't. See the PROC HPSPLIT documentation at&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/statug/15.2/statug_hpsplit_examples01.htm" target="_blank" rel="noopener"&gt;https://documentation.sas.com/doc/en/statug/15.2/statug_hpsplit_examples01.htm&lt;/A&gt;&lt;/P&gt;
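&lt;P&gt;A minimal sketch of the decision-tree suggestion above. The data set name (survey), the variable names (Y, X1-X20), the seed, and the ASSIGNMISSING=SIMILAR choice are assumptions for illustration, not details from the original question:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: survey, y, and x1-x20 are assumed names */
proc hpsplit data=survey seed=12345 assignmissing=similar;
   class y x1-x20;          /* binary target and categorical inputs */
   model y = x1-x20;
   grow entropy;            /* splitting criterion for a class target */
   prune costcomplexity;    /* cost-complexity pruning */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With this setup, rows that have a missing predictor stay in the analysis; only rows with a missing Y would be expected to drop out.&lt;/P&gt;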
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jun 2024 09:12:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932776#M46512</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2024-06-18T09:12:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932778#M46513</link>
      <description>&lt;P&gt;You could also try partial least squares regression (PROC PLS), which can handle (impute) missing values and report variable importance.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc pls data=sashelp.class  &lt;STRONG&gt;missing=em&lt;/STRONG&gt;   nfac=2 plot=(ParmProfiles &lt;STRONG&gt;VIP&lt;/STRONG&gt;) details; * cv=split  cvtest(seed=12345);
 class sex;
 model age=weight height sex;
* output out=x predicted=p;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 18 Jun 2024 09:31:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932778#M46513</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-06-18T09:31:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932798#M46514</link>
      <description>Thank you for the reply. I tried to check the missing patterns with the procedure from the blog, but SAS kept reporting errors. First it required a CLASS statement, and then an FCS or MONOTONE statement. After I added FCS, SAS said that there are no continuous variables in the VAR list to impute the variable with FCS methods. Does PROC MI not work when all the variables are categorical?</description>
      <pubDate>Tue, 18 Jun 2024 14:04:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932798#M46514</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-18T14:04:16Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932800#M46515</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/376504"&gt;@SAS-questioner&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Thank you for the reply. All of those variables are survey items with either yes/no responses or five response categories.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If the "missing" results come from skip patterns in the survey questions (for example, question 2 is skipped when question 1 is answered no, or yes), then you have dependencies among the variables, which may cause problems in the regression. Also, such values are missing for a known reason, so you could assign them a special category to handle that conditionality.&lt;/P&gt;
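&lt;P&gt;A minimal sketch of the special-category idea, assuming a hypothetical skip rule (Q2 is skipped when Q1 is 0) and an otherwise unused numeric code 9 for "not applicable"; the data set and variable names are also assumptions:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: survey, q1, q2, and the code 9 are hypothetical */
data survey_recode;
   set survey;
   /* Missing by design: give it its own level instead of a missing value */
   if q1 = 0 and missing(q2) then q2 = 9;
run;&lt;/CODE&gt;&lt;/PRE&gt;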
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And what kind of sample design was used in the survey? If the sample design is complex, such as a stratified sample, then you should be using the survey procedures for analysis to correctly use any weights assigned.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jun 2024 14:16:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932800#M46515</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-06-18T14:16:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932801#M46516</link>
      <description>Does it handle the binary outcome?</description>
      <pubDate>Tue, 18 Jun 2024 14:16:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932801#M46516</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-18T14:16:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932802#M46517</link>
      <description>There are no skip patterns in the survey questions. Respondents were supposed to answer every question, and I don't know why so many items are missing. The sample design is simple.</description>
      <pubDate>Tue, 18 Jun 2024 14:22:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932802#M46517</guid>
      <dc:creator>SAS-questioner</dc:creator>
      <dc:date>2024-06-18T14:22:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932803#M46518</link>
      <description>I assume that the ERROR message you are receiving is the following:&lt;BR /&gt;ERROR: The CLASS variables cannot be used as covariates in an FCS discriminant method with the default CLASSEFFECT=EXCLUDE option.&lt;BR /&gt;The solution is to use the CLASSEFFECTS=INCLUDE option on the FCS DISCRIM statement.&lt;BR /&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_mi_syntax05.htm#statug.mi.fcsdiscrimopt" target="_blank"&gt;https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_mi_syntax05.htm#statug.mi.fcsdiscrimopt&lt;/A&gt;</description>
      <pubDate>Tue, 18 Jun 2024 14:25:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932803#M46518</guid>
      <dc:creator>SAS_Rob</dc:creator>
      <dc:date>2024-06-18T14:25:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to deal with missing values for categorical variables?</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932921#M46525</link>
      <description>Yes, I think so. It is better to use numeric variables.</description>
      <pubDate>Wed, 19 Jun 2024 03:21:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/How-to-deal-with-missing-values-for-categorical-variables/m-p/932921#M46525</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-06-19T03:21:10Z</dc:date>
    </item>
  </channel>
</rss>

