<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: predictive modelling / handling missing values in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264950#M52031</link>
    <description>&lt;P&gt;Thank you! &amp;nbsp;Yes, I will use a negative value. (This website had told me that my submission of this question was unsuccessful, so I was surprised to receive this response here!)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Edit&lt;/STRONG&gt;: I should also note that&amp;nbsp;for the regression I made sure to include interaction terms for variables 4 and 5 and for variables 4 and 6.&lt;/P&gt;</description>
    <pubDate>Thu, 28 Apr 2016 12:55:30 GMT</pubDate>
    <dc:creator>mduarte</dc:creator>
    <dc:date>2016-04-28T12:55:30Z</dc:date>
    <item>
      <title>predictive modelling / handling missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264724#M51974</link>
      <description>&lt;P&gt;This question is an overlap of methodology and SAS programming - hopefully it fits here ...&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I wish to build a predictive model&amp;nbsp;with explanatory variables that have different types of missing values.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;e.g. (this is made up)&lt;/P&gt;&lt;P&gt;Response Variable: Primary policy holder of a&amp;nbsp;current&amp;nbsp;insurance policy&amp;nbsp;purchases an additional insurance benefit (add-on)&lt;/P&gt;&lt;P&gt;Explanatory Variable 1: Number of customers on policy (no missing values)&lt;/P&gt;&lt;P&gt;Explanatory Variable 2: How the current policy was purchased (sales channel) (can include missing values. &amp;nbsp;Missing values = unknown)&lt;/P&gt;&lt;P&gt;Explanatory Variable 3: Country of origin (can include missing values. &amp;nbsp;Missing values = unknown)&lt;/P&gt;&lt;P&gt;Explanatory Variable 4: How many claims the customer has made (0+) (no missing values)&lt;/P&gt;&lt;P&gt;Explanatory Variable 5: Maximum settlement time of claims made (if missing, this is because no claims were made)&lt;/P&gt;&lt;P&gt;Explanatory Variable 6: Maximum claim amount (if missing, this is because no claims were made)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way to distinguish the "missing value" in explanatory variables 5 and 6 (missing because it is not applicable) from the missing values in explanatory variables 2 and 3 (missing because it is unknown)?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Effectively, I want to treat the missing values in explanatory variable 5 as a category of their own, but those in variables 2 and 3 as genuinely missing.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My first step was to use hpsplit to gauge which interactions to include in a logistic regression model (as per the paper &lt;A href="http://www.mwsug.org/proceedings/2012/SA/MWSUG-2012-SA01.pdf" target="_self"&gt;Methods for Interaction Detection in Predictive Modeling Using SAS&lt;/A&gt;).
&amp;nbsp;I see that SAS has special missing values (.a - .z); however, it doesn't seem that hpsplit treats them differently. &amp;nbsp;It seems there is a blanket treatment for all missing values via assignmissing=BRANCH|NONE|POPULAR|SIMILAR (from the&amp;nbsp;SAS/STAT® 14.1 User’s Guide, The HPSPLIT Procedure).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any suggestions on how to handle such missing values / interrelated variables would be greatly appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 19 Apr 2016 07:15:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264724#M51974</guid>
      <dc:creator>mduarte</dc:creator>
      <dc:date>2016-04-19T07:15:33Z</dc:date>
    </item>
    <item>
      <title>Re: predictive modelling / handling missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264759#M51985</link>
      <description>&lt;P&gt;How about using a value like 0 or -1 in place of the missing values that you want treated as a valid category?&lt;/P&gt;</description>
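The suggestion above could look roughly like the following DATA step. This is only a sketch under assumptions: the dataset names (work.policies, work.policies_recoded) and the variable names (n_claims, max_settle_days, max_claim_amt) are hypothetical, since the thread gives no real names. It replaces the "not applicable" missing values in variables 5 and 6 with -1 and leaves the "unknown" missing values in variables 2 and 3 untouched.

```sas
/* Sketch only: dataset and variable names are hypothetical. */
data work.policies_recoded;
    set work.policies;

    /* Variables 5 and 6 are missing only when no claims were made, */
    /* so give those rows a sentinel value that cannot occur in     */
    /* real data (settlement times and amounts are never negative). */
    if n_claims = 0 then do;
        if missing(max_settle_days) then max_settle_days = -1;
        if missing(max_claim_amt)   then max_claim_amt   = -1;
    end;

    /* Variables 2 and 3 (sales channel, country) are not recoded:  */
    /* there, a missing value still means "unknown".                */
run;
```

A negative sentinel is easier to tell apart from real data than 0 would be if a settlement time or claim amount of 0 can legitimately occur.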
      <pubDate>Tue, 19 Apr 2016 11:07:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264759#M51985</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2016-04-19T11:07:57Z</dc:date>
    </item>
    <item>
      <title>Re: predictive modelling / handling missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264950#M52031</link>
      <description>&lt;P&gt;Thank you! &amp;nbsp;Yes, I will use a negative value. (This website had told me that my submission of this question was unsuccessful, so I was surprised to receive this response here!)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Edit&lt;/STRONG&gt;: I should also note that&amp;nbsp;for the regression I made sure to include interaction terms for variables 4 and 5 and for variables 4 and 6.&lt;/P&gt;</description>
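For readers following along, the interaction terms mentioned in the edit could be specified along these lines. This is a sketch, not the poster's actual code: the dataset name, the response name (purchased_addon, assumed to be coded 0/1), and all explanatory variable names are hypothetical.

```sas
/* Sketch only: dataset and variable names are hypothetical. */
proc logistic data=work.policies_recoded;
    /* Variables 2 and 3 are categorical. By default, rows where   */
    /* they are missing are excluded from the fit.                 */
    class sales_channel country / param=ref;

    /* Main effects for variables 1 to 6, plus the interactions    */
    /* of variable 4 with variable 5 and of variable 4 with 6.     */
    model purchased_addon(event='1') =
          n_customers sales_channel country
          n_claims max_settle_days max_claim_amt
          n_claims*max_settle_days
          n_claims*max_claim_amt;
run;
```

Because n_claims is 0 whenever the sentinel value is present, both interaction terms are 0 for customers with no claims, so the sentinel enters the model only through the main effects of variables 5 and 6.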
      <pubDate>Thu, 28 Apr 2016 12:55:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/predictive-modelling-handling-missing-values/m-p/264950#M52031</guid>
      <dc:creator>mduarte</dc:creator>
      <dc:date>2016-04-28T12:55:30Z</dc:date>
    </item>
  </channel>
</rss>

