<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Deal with missing data for the prediction model in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310238#M66891</link>
    <description>&lt;P&gt;Hello, I need to build a predictive model for my data. The variables include: year, gender, race, num_cases, location, ageGroup, disease (1: yes, 0: no), maritual_status, charge. We did not collect race data for the first 4 years (coded 99).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1) How should I handle the missing race data? (Race appears to be a significant predictor.)&lt;/P&gt;
&lt;P&gt;2) How can I build a model that predicts the probability of disease = "yes" from the other variables in the dataset?&lt;/P&gt;
&lt;P&gt;I would greatly appreciate any suggestions or help. Thanks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data have;&lt;BR /&gt;/* gender_num: numeric gender flag (1 = M, 0 = F) */&lt;BR /&gt;input year Gender$ Race Num_Cases Location gender_num agegroup Disease maritual_status charge;&lt;BR /&gt;datalines;&lt;BR /&gt;1 M 99 1 8 1 6 1 5 900&lt;BR /&gt;1 M 99 5 3 1 6 1 1 152&lt;BR /&gt;1 F 99 3 16 0 7 1 6 588&lt;BR /&gt;1 M 99 26 3 1 7 1 2 79&lt;BR /&gt;1 M 99 1 16 1 7 1 6 179&lt;BR /&gt;1 M 99 1 12 1 5 1 2 100&lt;BR /&gt;1 M 99 2 4 1 7 1 1 245&lt;BR /&gt;2 M 99 1 3 1 5 1 5 625&lt;BR /&gt;2 F 99 2 3 0 5 1 1 35&lt;BR /&gt;2 F 99 1 16 0 6 1 2 144&lt;BR /&gt;2 F 99 1 3 0 5 0 5 625&lt;BR /&gt;2 F 99 1 3 0 6 0 4 576&lt;BR /&gt;2 F 99 3 3 0 6 1 4 192&lt;BR /&gt;3 M 99 6 3 1 5 0 1 500&lt;BR /&gt;3 M 99 1 3 1 7 1 2 196&lt;BR /&gt;3 M 99 1 1 1 6 0 1 36&lt;BR /&gt;3 M 99 1 3 1 5 0 1 25&lt;BR /&gt;4 M 99 1 3 1 5 0 2 100&lt;BR /&gt;4 M 99 3 16 1 5 1 2 352&lt;BR /&gt;4 F 99 1 16 0 6 1 6 1296&lt;BR /&gt;4 F 99 6 11 0 7 0 2 254&lt;BR /&gt;5 M 1 1 3 1 5 1 1 25&lt;BR /&gt;5 F 2 3 16 0 4 1 2 213&lt;BR /&gt;6 F 1 1 2 0 7 1 6 184&lt;BR /&gt;6 F 1 1 13 0 7 1 2 196&lt;BR /&gt;6 F 1 1 4 0 7 0 1 49&lt;BR /&gt;6 M 4 2 3 1 5 0 1 125&lt;BR /&gt;6 F 5 33 3 0 6 0 5 80&lt;BR /&gt;7 F 4 1 16 0 7 0 6 1764&lt;BR /&gt;7 F 4 2 3 0 6 0 6 648&lt;BR /&gt;7 M 6 1 16 1 6 1 6 1296&lt;BR /&gt;7 F 2 1 2 0 5 0 5 625&lt;BR /&gt;7 F 1 24 3 0 5 0 2 452&lt;BR /&gt;7 F 1 1 3 0 6 1 1 362&lt;BR /&gt;8 M 5 2 10 1 7 1 2 980&lt;BR /&gt;8 M 1 5 3 1 4 1 1 350&lt;BR /&gt;8 F 1 1 3 0 6 0 99 352&lt;BR /&gt;8 M 5 1 3 1 5 0 1 25&lt;BR /&gt;8 M 1 1 3 1 7 0 1 49&lt;BR /&gt;9 M 1 1 13 1 7 1 5 1225&lt;BR /&gt;9 M 5 4 16 1 7 0 1 122&lt;BR /&gt;9 F 5 2 3 0 7 1 2 98&lt;BR /&gt;9 M 1 1 1 1 7 1 5 126&lt;BR /&gt;10 F 1 66 3 0 6 0 1 54&lt;BR /&gt;10 F 2 1 1 0 5 0 1 25&lt;BR /&gt;10 F 1 2 4 0 6 0 5 450&lt;BR /&gt;10 M 1 3 16 1 7 0 5 408&lt;BR /&gt;11 F 1 1 8 0 7 1 2 196&lt;BR /&gt;11 M 3 1 3 1 7 1 2 196&lt;BR /&gt;11 M 5 5 3 1 6 0 1 72&lt;BR /&gt;11 M 1 2 3 1 7 1 1 245&lt;BR /&gt;12 F 1 9 11 0 7 0 6 196&lt;BR /&gt;12 M 5 2 3 1 5 0 1 125&lt;BR /&gt;12 F 5 13 3 0 7 0 6 150&lt;BR /&gt;12 M 0 2 3 1 7 0 2 98&lt;BR /&gt;13 F 5 3 3 0 5 0 1 215&lt;BR /&gt;13 M 0 4 3 1 5 0 1 625&lt;BR /&gt;13 M 2 25 3 1 7 1 2 784&lt;BR /&gt;13 M 1 1 1 1 7 0 99 480&lt;BR /&gt;13 M 2 27 3 1 7 0 2 725&lt;BR /&gt;;&lt;/P&gt;</description>
    <pubDate>Tue, 08 Nov 2016 21:10:50 GMT</pubDate>
    <dc:creator>Yurie</dc:creator>
    <dc:date>2016-11-08T21:10:50Z</dc:date>
    <item>
      <title>Deal with missing data for the prediction model</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310238#M66891</link>
      <description>&lt;P&gt;Hello, I need to build a predictive model for my data. The variables include: year, gender, race, num_cases, location, ageGroup, disease (1: yes, 0: no), maritual_status, charge. We did not collect race data for the first 4 years (coded 99).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1) How should I handle the missing race data? (Race appears to be a significant predictor.)&lt;/P&gt;
&lt;P&gt;2) How can I build a model that predicts the probability of disease = "yes" from the other variables in the dataset?&lt;/P&gt;
&lt;P&gt;I would greatly appreciate any suggestions or help. Thanks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data have;&lt;BR /&gt;/* gender_num: numeric gender flag (1 = M, 0 = F) */&lt;BR /&gt;input year Gender$ Race Num_Cases Location gender_num agegroup Disease maritual_status charge;&lt;BR /&gt;datalines;&lt;BR /&gt;1 M 99 1 8 1 6 1 5 900&lt;BR /&gt;1 M 99 5 3 1 6 1 1 152&lt;BR /&gt;1 F 99 3 16 0 7 1 6 588&lt;BR /&gt;1 M 99 26 3 1 7 1 2 79&lt;BR /&gt;1 M 99 1 16 1 7 1 6 179&lt;BR /&gt;1 M 99 1 12 1 5 1 2 100&lt;BR /&gt;1 M 99 2 4 1 7 1 1 245&lt;BR /&gt;2 M 99 1 3 1 5 1 5 625&lt;BR /&gt;2 F 99 2 3 0 5 1 1 35&lt;BR /&gt;2 F 99 1 16 0 6 1 2 144&lt;BR /&gt;2 F 99 1 3 0 5 0 5 625&lt;BR /&gt;2 F 99 1 3 0 6 0 4 576&lt;BR /&gt;2 F 99 3 3 0 6 1 4 192&lt;BR /&gt;3 M 99 6 3 1 5 0 1 500&lt;BR /&gt;3 M 99 1 3 1 7 1 2 196&lt;BR /&gt;3 M 99 1 1 1 6 0 1 36&lt;BR /&gt;3 M 99 1 3 1 5 0 1 25&lt;BR /&gt;4 M 99 1 3 1 5 0 2 100&lt;BR /&gt;4 M 99 3 16 1 5 1 2 352&lt;BR /&gt;4 F 99 1 16 0 6 1 6 1296&lt;BR /&gt;4 F 99 6 11 0 7 0 2 254&lt;BR /&gt;5 M 1 1 3 1 5 1 1 25&lt;BR /&gt;5 F 2 3 16 0 4 1 2 213&lt;BR /&gt;6 F 1 1 2 0 7 1 6 184&lt;BR /&gt;6 F 1 1 13 0 7 1 2 196&lt;BR /&gt;6 F 1 1 4 0 7 0 1 49&lt;BR /&gt;6 M 4 2 3 1 5 0 1 125&lt;BR /&gt;6 F 5 33 3 0 6 0 5 80&lt;BR /&gt;7 F 4 1 16 0 7 0 6 1764&lt;BR /&gt;7 F 4 2 3 0 6 0 6 648&lt;BR /&gt;7 M 6 1 16 1 6 1 6 1296&lt;BR /&gt;7 F 2 1 2 0 5 0 5 625&lt;BR /&gt;7 F 1 24 3 0 5 0 2 452&lt;BR /&gt;7 F 1 1 3 0 6 1 1 362&lt;BR /&gt;8 M 5 2 10 1 7 1 2 980&lt;BR /&gt;8 M 1 5 3 1 4 1 1 350&lt;BR /&gt;8 F 1 1 3 0 6 0 99 352&lt;BR /&gt;8 M 5 1 3 1 5 0 1 25&lt;BR /&gt;8 M 1 1 3 1 7 0 1 49&lt;BR /&gt;9 M 1 1 13 1 7 1 5 1225&lt;BR /&gt;9 M 5 4 16 1 7 0 1 122&lt;BR /&gt;9 F 5 2 3 0 7 1 2 98&lt;BR /&gt;9 M 1 1 1 1 7 1 5 126&lt;BR /&gt;10 F 1 66 3 0 6 0 1 54&lt;BR /&gt;10 F 2 1 1 0 5 0 1 25&lt;BR /&gt;10 F 1 2 4 0 6 0 5 450&lt;BR /&gt;10 M 1 3 16 1 7 0 5 408&lt;BR /&gt;11 F 1 1 8 0 7 1 2 196&lt;BR /&gt;11 M 3 1 3 1 7 1 2 196&lt;BR /&gt;11 M 5 5 3 1 6 0 1 72&lt;BR /&gt;11 M 1 2 3 1 7 1 1 245&lt;BR /&gt;12 F 1 9 11 0 7 0 6 196&lt;BR /&gt;12 M 5 2 3 1 5 0 1 125&lt;BR /&gt;12 F 5 13 3 0 7 0 6 150&lt;BR /&gt;12 M 0 2 3 1 7 0 2 98&lt;BR /&gt;13 F 5 3 3 0 5 0 1 215&lt;BR /&gt;13 M 0 4 3 1 5 0 1 625&lt;BR /&gt;13 M 2 25 3 1 7 1 2 784&lt;BR /&gt;13 M 1 1 1 1 7 0 99 480&lt;BR /&gt;13 M 2 27 3 1 7 0 2 725&lt;BR /&gt;;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Nov 2016 21:10:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310238#M66891</guid>
      <dc:creator>Yurie</dc:creator>
      <dc:date>2016-11-08T21:10:50Z</dc:date>
    </item>
    <item>
      <title>Re: Deal with missing data for the prediction model</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310289#M66911</link>
      <description>&lt;PRE&gt;
1) Use PROC MI to impute the missing race values.
2) Use PROC LOGISTIC, PROC HPSPLIT, or PROC GENMOD to model the probability of disease.
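
/* A minimal sketch of the two steps above, not a definitive analysis.
   Assumptions (not from the original post): the dataset names have2, mi_out
   and pred, the seed, nimpute=5, and the choice of predictors are illustrative.
   Race is missing for every record in years 1 to 4, so the imputation relies
   entirely on the missing-at-random assumption, and sparse race levels in a
   sample this small may trigger separation warnings. */

/* Recode the 99 placeholder to a true SAS missing value. */
data have2;
  set have;
  if race = 99 then race = .;
run;

/* Step 1: impute race as a nominal variable with FCS logistic regression. */
proc mi data=have2 nimpute=5 seed=20161108 out=mi_out;
  class race gender;
  fcs logistic(race / link=glogit);
  var gender agegroup num_cases disease race;
run;

/* Step 2: fit the model within each imputed dataset and save the
   predicted probability of disease = 1 for every record. */
proc logistic data=mi_out;
  by _Imputation_;
  class race gender / param=ref;
  model disease(event='1') = race gender agegroup num_cases;
  output out=pred p=p_disease;
run;

/* The per-imputation estimates can then be pooled, for example with
   PROC MIANALYZE. */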

&lt;/PRE&gt;</description>
      <pubDate>Wed, 09 Nov 2016 06:13:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310289#M66911</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-11-09T06:13:01Z</dc:date>
    </item>
    <item>
      <title>Re: Deal with missing data for the prediction model</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310465#M66967</link>
      <description>&lt;P&gt;Thanks so much. I will try the procedures that you mentioned here.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 09 Nov 2016 17:57:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Deal-with-missing-data-for-the-prediction-model/m-p/310465#M66967</guid>
      <dc:creator>Yurie</dc:creator>
      <dc:date>2016-11-09T17:57:21Z</dc:date>
    </item>
  </channel>
</rss>

