<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Set dummy variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832770#M329188</link>
    <description>&lt;P&gt;You have 4 race categories, but only 3 degrees of freedom among them.&amp;nbsp; If you know the values of NHW, NHB, and HISP you automatically know the value for OTHERRACE - or more generally, if you know three of the race dummies, you know what the fourth must be.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that proc logistic is implicitly setting OTHERRACE as the reference condition, and all the race-variable betas will be with respect to otherrace.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could drop otherrace from the model specification, which should eliminate the note while giving the same parameter estimates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But if you want NHW as the reference, drop it instead and keep the other three.&lt;/P&gt;</description>
    <pubDate>Sun, 11 Sep 2022 18:16:29 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2022-09-11T18:16:29Z</dc:date>
    <item>
      <title>Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832767#M329185</link>
      <description>&lt;P&gt;Hi, I want to put age, height, BMI, race, and history of hormone therapy into my logistic regression model. How should I write the code?&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;age, height&lt;/STRONG&gt; are numeric&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;BMI&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P class=""&gt;1="0-18.5" 2="18.5-25" 3="25-30" 4="30+" 5=missing&lt;SPAN class=""&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;STRONG&gt;race&lt;/STRONG&gt; I set as:&lt;/P&gt;&lt;P class=""&gt;NHW =(race = &lt;SPAN class=""&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;/P&gt;&lt;P class=""&gt;NHB =(race = &lt;SPAN class=""&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;/P&gt;&lt;P class=""&gt;hisp =(race = &lt;SPAN class=""&gt;&lt;STRONG&gt;3&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;/P&gt;&lt;P class=""&gt;otherrace = (race = &lt;SPAN class=""&gt;&lt;STRONG&gt;4&lt;/STRONG&gt;&lt;/SPAN&gt;) and (race = &lt;SPAN class=""&gt;&lt;STRONG&gt;5&lt;/STRONG&gt;&lt;/SPAN&gt;) and (race= &lt;SPAN class=""&gt;&lt;STRONG&gt;6&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;/P&gt;&lt;P class=""&gt;and I want NHW as reference.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;hormone&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P class=""&gt;.F="No Form" .G="Wrong Gender" .M="Not Answered" 0="No" 1="Yes" 2="Don't Know"&lt;SPAN class=""&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;and I set hormone as&amp;nbsp;&lt;SPAN class=""&gt;dummy variables&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;hormone0 =(horm_f = &lt;SPAN class=""&gt;&lt;STRONG&gt;0&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;SPAN class=""&gt;/*No*/&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;hormone1 =(horm_f = &lt;SPAN class=""&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;SPAN 
class=""&gt;/*Yes*/&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;So my code for the logistic regression is:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc logistic data = e.tmp;
where em_f in (0,1);
model emf (event = '1')= lm_cat age height BMI NHB hisp otherrace hormone0 hormone1/cl;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P class=""&gt;But the results showed:&lt;/P&gt;&lt;TABLE cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P class=""&gt;Note:&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P class=""&gt;The following parameters have been set to 0, since the variables are a linear combination of other variables as shown&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;TABLE cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P class=""&gt;&lt;STRONG&gt;otherrace =&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P class=""&gt;0&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;I don't know why. Can anyone help me?&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 17:21:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832767#M329185</guid>
      <dc:creator>greenie</dc:creator>
      <dc:date>2022-09-11T17:21:17Z</dc:date>
    </item>
    <item>
      <title>Re: Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832769#M329187</link>
      <description>&lt;P&gt;With SAS and PROC LOGISTIC, and indeed many regression procedures, you do not need to "set dummy variables". Categorical variables belong on a CLASS statement. SAS will create any internal dummies needed for calculations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You would indicate which is the reference level you want on the CLASS statement.&lt;/P&gt;
&lt;P&gt;The CLASS statement must come before the model statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your "race" variable I would suggest creating a custom format and use that for creating a group like "otherrace"&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;proc format;
value myrace
1='NHW'
2='NHB'
3='Hisp'
4,5,6 = 'Other race'
;
run;

proc logistic data=e.tmp;
   where em_f in (0,1);
   class race (ref='NHW')
         horm_f
         bmi
   ;
   format race myrace. ;
   model emf (event = '1') = lm_cat age height race horm_f bmi /cl
   ;
run;&lt;/PRE&gt;
&lt;P&gt;For variables on the CLASS statement, if you want to specify the reference level, use the FORMATTED value in the (REF= ) option.&lt;/P&gt;
&lt;P&gt;Since I don't have your formats or data, I can't test for some possible problems in the code you posted.&lt;/P&gt;
&lt;P&gt;For example, the code above assumes there is a variable named EM_F that is separate from the variable on the MODEL statement.&lt;/P&gt;
&lt;P&gt;One might guess that your LM_CAT variable is also some sort of categorical variable. If so it also likely belongs on the CLASS statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would expect to see the particular note you show if you had actually included the NHW variable you describe on the MODEL statement.&lt;/P&gt;
&lt;P&gt;The otherrace variable is indeed dependent on the other "race" variables you created. The way you created them, otherrace would be 1 only when all the others are 0, and 0 only when one of the others is 1. So it is a linear combination of the other race variables.&lt;/P&gt;
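&lt;P&gt;As a minimal sketch of that dependence, assuming race takes only the values 1 to 6 with no missing values and that otherrace was meant to flag codes 4, 5 and 6 (the data set name work.race_demo and the variable dummy_sum are made up for illustration), the four indicators always sum to 1, so any one of them is determined by the other three:&lt;/P&gt;
&lt;PRE&gt;/* Hypothetical demo data: one row per race code 1-6 */
data work.race_demo;
   do race = 1 to 6;
      NHW       = (race = 1);
      NHB       = (race = 2);
      hisp      = (race = 3);
      otherrace = (race in (4, 5, 6));
      /* exactly one indicator is 1 for every valid race code */
      dummy_sum = NHW + NHB + hisp + otherrace;
      output;
   end;
run;

proc print data=work.race_demo noobs;
run;&lt;/PRE&gt;
&lt;P&gt;With these assumptions dummy_sum is 1 on every row, which is the same as saying otherrace = 1 - NHW - NHB - hisp.&lt;/P&gt;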
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 18:12:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832769#M329187</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-09-11T18:12:54Z</dc:date>
    </item>
    <item>
      <title>Re: Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832770#M329188</link>
      <description>&lt;P&gt;You have 4 race categories, but only 3 degrees of freedom among them.&amp;nbsp; If you know the values of NHW, NHB, and HISP you automatically know the value for OTHERRACE - or more generally, if you know three of the race dummies, you know what the fourth must be.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It appears that proc logistic is implicitly setting OTHERRACE as the reference condition, and all the race-variable betas will be with respect to otherrace.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could drop otherrace from the model specification, which should eliminate the note while giving the same parameter estimates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But if you want NHW as the reference, drop it instead and keep the other three.&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 18:16:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832770#M329188</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-09-11T18:16:29Z</dc:date>
    </item>
    <item>
      <title>Re: Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832780#M329190</link>
      <description>&lt;P&gt;Thank you. Yes, I want to set NHW as the reference, so I didn't include NHW in my model but kept the other three (NHB, hisp, otherrace). I thought SAS would take NHW as the reference, so why does it take otherrace instead?&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 18:38:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832780#M329190</guid>
      <dc:creator>greenie</dc:creator>
      <dc:date>2022-09-11T18:38:45Z</dc:date>
    </item>
    <item>
      <title>Re: Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832781#M329191</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/429369"&gt;@greenie&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you. Yes, I want to set NHW as the reference, so I didn't include NHW in my model but kept the other three (NHB, hisp, otherrace). I thought SAS would take NHW as the reference, so why does it take otherrace instead?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I'm just guessing, but I suppose that PROC LOGISTIC takes the rightmost predictor listed as the reference when that predictor is a linear combination of predictors to its left.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That is the nice thing about SAS and many other programming tasks: the computer is always ready for you to experiment to answer questions like this.&lt;/P&gt;
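&lt;P&gt;One way to run such an experiment, as a rough sketch on simulated data (the data set work.sim, the response y, the seed, and the probabilities are all made up for illustration and have nothing to do with your data), is to list all four dummies in two different orders and compare which parameter the note reports as set to 0:&lt;/P&gt;
&lt;PRE&gt;/* Hypothetical simulated data with four race groups */
data work.sim;
   call streaminit(2022);
   do id = 1 to 2000;
      race      = ceil(4 * rand('uniform'));   /* 1, 2, 3 or 4 */
      NHW       = (race = 1);
      NHB       = (race = 2);
      hisp      = (race = 3);
      otherrace = (race = 4);
      y = rand('bernoulli', 0.3 + 0.1 * NHB);  /* made-up response */
      output;
   end;
run;

/* Run 1: otherrace is the rightmost dummy */
proc logistic data=work.sim;
   model y(event='1') = NHW NHB hisp otherrace;
run;

/* Run 2: same dummies, NHW moved to the rightmost position */
proc logistic data=work.sim;
   model y(event='1') = NHB hisp otherrace NHW;
run;&lt;/PRE&gt;
&lt;P&gt;If my guess is right, the zeroed parameter should change between the two runs, but check the output rather than taking my word for it.&lt;/P&gt;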
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But frankly I think you might be better off using&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;'s suggestion, because its syntax allows you to clearly specify the reference category, and also avoid SAS needing to notify you that you have a linear relationship among the 4 final categories.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My comments were more directed at WHY you got that note.&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 19:59:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832781#M329191</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-09-11T19:59:39Z</dc:date>
    </item>
    <item>
      <title>Re: Set dummy variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832786#M329194</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/429369"&gt;@greenie&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P class=""&gt;otherrace = (race = &lt;SPAN class=""&gt;&lt;STRONG&gt;4&lt;/STRONG&gt;&lt;/SPAN&gt;) and (race = &lt;SPAN class=""&gt;&lt;STRONG&gt;5&lt;/STRONG&gt;&lt;/SPAN&gt;) and (race= &lt;SPAN class=""&gt;&lt;STRONG&gt;6&lt;/STRONG&gt;&lt;/SPAN&gt;);&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If this is really how you coded &lt;FONT face="courier new,courier"&gt;otherrace&lt;/FONT&gt;, i.e., with ANDs rather than ORs, then this "predictor" is constantly zero. I think that is what the log means by&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;
&lt;TABLE cellspacing="0" cellpadding="0"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P class=""&gt;Note:&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P class=""&gt;The following parameters have been set to 0, since the variables are a linear combination of other variables as shown&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;TABLE cellspacing="0" cellpadding="0"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P class=""&gt;&lt;STRONG&gt;otherrace =&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P class=""&gt;0&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;(0 as a trivial "linear combination" of zero or more variables).&lt;/P&gt;
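&lt;P&gt;A small sketch of the difference, using a made-up data set work.and_vs_in with one row per race code 1 to 6 (other_and and other_in are illustrative names, not variables from your data):&lt;/P&gt;
&lt;PRE&gt;data work.and_vs_in;
   do race = 1 to 6;
      /* a single value cannot equal 4, 5 and 6 at once, so this is always 0 */
      other_and = (race = 4) and (race = 5) and (race = 6);
      /* 1 when race is 4, 5 or 6, otherwise 0 */
      other_in  = (race in (4, 5, 6));
      output;
   end;
run;

proc freq data=work.and_vs_in;
   tables other_and other_in;
run;&lt;/PRE&gt;
&lt;P&gt;Here other_and is 0 on all six rows, while other_in is 1 for race codes 4, 5 and 6. The IN operator (or ORs between the comparisons) is presumably what was intended.&lt;/P&gt;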
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;wrote, you should really use the CLASS statement for the categorical predictors. It was a great improvement of PROC LOGISTIC when this statement was &lt;A href="https://v8doc.sas.com/sashtml/stat/chap1/sect16.htm" target="_blank" rel="noopener"&gt;introduced in SAS version 8&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2022 21:59:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-dummy-variables/m-p/832786#M329194</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2022-09-11T21:59:15Z</dc:date>
    </item>
  </channel>
</rss>

