<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I encode the class values of a categorical variable into a continuous variable? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131845#M1143</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Tenno.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You are right to think carefully about those issues. There is a good chance that using a preliminary regression parameter in a subsequent regression will introduce severe bias into your final inferences. I have done what you are thinking of along the lines of principal components quite successfully, but I would strongly recommend using PROC CORRESP to produce each score.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 03 Dec 2012 00:13:24 GMT</pubDate>
    <dc:creator>Damien_Mather</dc:creator>
    <dc:date>2012-12-03T00:13:24Z</dc:date>
    <item>
      <title>How can I encode the class values of a categorical variable into a continuous variable?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131842#M1140</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I would like to transform a categorically-valued predictor &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; into a continuously-valued predictor &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt;, for example by mapping character class values to real-valued representations of those values. I know that I can do this in several ways: simply by substituting the frequency of a level for the level value itself or by computing the entropy of a level.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I want to generalize the interpretation of the Information Value of a &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; from the binary classification "good/bad" application frequently used in credit scoring to a multiclass 1-versus-many representation of the 1-of-N GLM encoding. For example, if there are 3 class values with labels 'A', 'B', 'C', I would compute the information value of each in turn versus the other two, so the three information values would be 'A' vs ('B', 'C'), 'B' vs ('A', 'C') and 'C' vs ('A', 'B'). That way I can numerically represent a multiclass &lt;SPAN class="il"&gt;categorical&lt;/SPAN&gt; &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; as a single real-valued &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt;. 
I know that there will be only N distinct values produced by this technique, but I will be able to use existing code that works well on continuous-valued &lt;SPAN class="il"&gt;variables&lt;/SPAN&gt;, and I do not know how to incorporate a GLM-encoded &lt;SPAN class="il"&gt;categorical&lt;/SPAN&gt; &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; into my work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there a better way than Information Value to transform a &lt;SPAN class="il"&gt;categorical&lt;/SPAN&gt; &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; into a continuous &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt;?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;How does Enterprise Miner process &lt;SPAN class="il"&gt;categorical&lt;/SPAN&gt; &lt;SPAN class="il"&gt;variables&lt;/SPAN&gt;? Does EM convert a &lt;SPAN class="il"&gt;categorical&lt;/SPAN&gt; &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; into a real-valued &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt; and then use the real values in splitting a target &lt;SPAN class="il"&gt;variable&lt;/SPAN&gt;?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 21 Nov 2012 16:25:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131842#M1140</guid>
      <dc:creator>Tenno</dc:creator>
      <dc:date>2012-11-21T16:25:22Z</dc:date>
    </item>
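The two encodings mentioned in the question above (level frequency, and a one-versus-rest Information Value per level) can be sketched roughly as follows. This is only an illustration in Python, not SAS or Enterprise Miner code. The function names, the toy data, and the choice of a binary 0/1 target are all assumptions, and no smoothing is applied for empty cells.

```python
import math
from collections import Counter


def frequency_encode(levels):
    """Map each level to its relative frequency in the sample."""
    counts = Counter(levels)
    n = len(levels)
    return {level: counts[level] / n for level in counts}


def one_vs_rest_iv(levels, target):
    """Information value of each level's indicator (level vs. all others).

    Assumes a binary target coded 1 (good) / 0 (bad) and that both sides
    of every level-vs-rest split contain at least one good and one bad.
    """
    total_good = sum(target)
    total_bad = len(target) - total_good
    result = {}
    for level in sorted(set(levels)):
        iv = 0.0
        # Two bins per level: observations in the level, and the rest.
        for in_level in (True, False):
            cell = [t for lv, t in zip(levels, target) if (lv == level) == in_level]
            good = sum(cell)
            bad = len(cell) - good
            pct_good = good / total_good
            pct_bad = bad / total_bad
            iv += (pct_good - pct_bad) * math.log(pct_good / pct_bad)
        result[level] = iv
    return result


# Hypothetical toy data: three levels and a binary target.
levels = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
target = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

freq = frequency_encode(levels)
iv = one_vs_rest_iv(levels, target)
encoded = [iv[lv] for lv in levels]  # one real value per observation
```

One caveat with this sketch: every term of the Information Value is non-negative, so the encoded value does not record whether a level leans toward "good" or "bad". In the toy data, levels A and B lean in opposite directions yet receive the same value.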
    <item>
      <title>Re: How can I encode the class values of a categorical variable into a continuous variable?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131843#M1141</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;One option is to use the regression node (or something fancier, if you prefer) and predict the target variable using only your single class variable.&amp;nbsp; Then, add a transform variables node that creates a new variable equal to p_target (the result of the model).&amp;nbsp; I also like to drop the various other variables created by the regression.&amp;nbsp; You can also do this outside EM, and paste the result into a transform variables node.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For examples, see my SAS Global Forum paper and/or brainshark presentation:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/resources/papers/proceedings12/126-2012.pdf"&gt;http://support.sas.com/resources/papers/proceedings12/126-2012.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://my.brainshark.com/Developing-a-Predictive-Model-for-Customer-Trip-Purpose-to-Be-Integrated-into-Enterprise-Strategy-an-409872397"&gt;http://my.brainshark.com/Developing-a-Predictive-Model-for-Customer-Trip-Purpose-to-Be-Integrated-into-Enterprise-Strategy-an-409872397&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 02 Dec 2012 01:23:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131843#M1141</guid>
      <dc:creator>jlevine</dc:creator>
      <dc:date>2012-12-02T01:23:27Z</dc:date>
    </item>
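The "do this outside EM" variant of the reply above can be sketched in a few lines. The sketch assumes a binary 0/1 target and relies on the fact that an unregularized logistic regression fitted on a single class variable reproduces the observed event rate within each level, so the per-level rate stands in for p_target here. The function name and toy data are hypothetical, and this is Python rather than an Enterprise Miner flow.

```python
from collections import defaultdict


def target_rate_encode(levels, target):
    """Return the observed event rate of a binary 0/1 target per level."""
    events = defaultdict(int)
    counts = defaultdict(int)
    for lv, t in zip(levels, target):
        events[lv] += t
        counts[lv] += 1
    return {lv: events[lv] / counts[lv] for lv in counts}


# Hypothetical toy data: three levels and a binary target.
levels = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
target = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

rates = target_rate_encode(levels, target)
p_target = [rates[lv] for lv in levels]  # new continuous predictor
```

Because the encoded value is computed from the target itself, it is usually safer to estimate the rates on data that is held out from the final model.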
    <item>
      <title>Re: How can I encode the class values of a categorical variable into a continuous variable?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131844#M1142</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;This is an informed answer. Thank you, Mr. Levine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Upon reflection, I could also expand the categorical variable into each of its levels using GLM encoding and create a binary indicator vector for each observation where the class level indicator would be set to 1 and all other indicator values would be set to 0. Then, I could run a principal components analysis on the variable and take the first principal component value, which would represent the projection of the variable along the axis of maximum variance and hence explanatory power.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regardless of technique, however, I would have to create a framework (did someone say "Write a SAS macro"?) to apply this technique to every categorical variable to be encoded. But this would not be a significant task to perform.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A related question is: If I use the target (dependent) variable information in constructing the encoded representation of the categorical variable, am I introducing bias into the solution? Bias would distort the modeling results, and could come from dependencies in the data introduced by sampling, for example. Perhaps using target information is not a recommended practice. What do we think about this in general?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 02 Dec 2012 20:32:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131844#M1142</guid>
      <dc:creator>Tenno</dc:creator>
      <dc:date>2012-12-02T20:32:53Z</dc:date>
    </item>
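The indicator-plus-principal-components idea in the post above can be sketched as follows. This is a rough pure-Python illustration that uses power iteration on the covariance matrix of the indicator columns. The function name, the fixed start vector, the iteration count, and the toy data are assumptions, and the sketch expects at least two distinct levels.

```python
import math


def first_pc_scores(levels, iterations=200):
    """Score observations on the first principal component of their
    one-hot (GLM-style) indicator columns, using plain power iteration."""
    names = sorted(set(levels))
    n = len(levels)
    k = len(names)
    # Indicator matrix: one column per level, 1 where the level matches.
    rows = [[1.0 if lv == name else 0.0 for name in names] for lv in levels]
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    # Sample covariance matrix of the indicator columns.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # Power iteration from a fixed, non-uniform start vector (an assumption).
    vec = [float(j + 1) for j in range(k)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project each centered indicator row onto the component.
    scores = [sum(c[j] * vec[j] for j in range(k)) for c in centered]
    return dict(zip(names, vec)), scores


# Hypothetical toy data: three levels with frequencies 3, 3 and 4.
levels = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
loadings, scores = first_pc_scores(levels)
```

This component is built from the level frequencies alone and uses no target information. One consequence shows up in the toy data: levels A and B occur equally often and end up with the same score, so the first component by itself does not separate them.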
    <item>
      <title>Re: How can I encode the class values of a categorical variable into a continuous variable?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131845#M1143</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Tenno.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You are right to think carefully about those issues. There is a good chance that using a preliminary regression parameter in a subsequent regression will introduce severe bias into your final inferences. I have done what you are thinking of along the lines of principal components quite successfully, but I would strongly recommend using PROC CORRESP to produce each score.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 03 Dec 2012 00:13:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131845#M1143</guid>
      <dc:creator>Damien_Mather</dc:creator>
      <dc:date>2012-12-03T00:13:24Z</dc:date>
    </item>
    <item>
      <title>Re: How can I encode the class values of a categorical variable into a continuous variable?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131846#M1144</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks, Damien. It is very important not to introduce new errors that may confound the results of the problem one is trying to solve.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 03 Dec 2012 16:58:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-can-I-encode-the-class-values-of-a-categorical-variable-into/m-p/131846#M1144</guid>
      <dc:creator>Tenno</dc:creator>
      <dc:date>2012-12-03T16:58:16Z</dc:date>
    </item>
  </channel>
</rss>

