Programming the statistical procedures from SAS

regression with 2 dummy types

Super Contributor
Posts: 425

Hi,

 

When regressing on data that contain dummy variables, we omit one of the dummies, and the coefficient that SAS outputs for each remaining dummy is then the difference between that dummy's effect on the dependent variable and the omitted dummy's effect. This is straightforward.

 

But what should be done when there are 2 types of dummies? Suppose that there are dummies A1-A4 and B1-B4. The A and B categories are independent of each other, so I want to omit A4 in order to study the effect of A1-A3 compared to A4, and to omit B4 in order to study the effect of B1-B3 compared to B4. But when I omit A4 and B4, how does SAS know (or how can it be made to know) that A4 is related only to A1-A3 and B4 only to B1-B3?

 

Just a little more illustration: suppose I have dummies New York, Chicago, Los Angeles and dummies Summer, Winter, Fall, Spring. If I omit Spring and Los Angeles, it is nonsensical to compare, say, Chicago with Spring and Fall with Los Angeles.

 

Thanks!


Accepted Solutions
Solution
‎03-26-2017 08:41 AM
SAS Super FREQ
Posts: 3,547

Re: regression with 2 dummy types

You should review how dummy variables represent levels of a categorical variable. Dummy variables merely indicate which of k categories each observation belongs to. Omitting the k_th dummy variable avoids creating linearly dependent variables, because if an observation is not in one of the first (k-1) levels, it must belong to the k_th.
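
To make this concrete for the two sets of dummies in the question (A1-A4 and B1-B4), here is a small worked sketch of the main-effects model with A4 and B4 omitted. The symbols \beta_0, \alpha_i, and \gamma_j are notation introduced here for illustration only, and the model assumes no interaction between A and B.

```latex
% Each observation belongs to exactly one A level and exactly one B level,
% so each full set of dummies sums to the intercept column:
A_1 + A_2 + A_3 + A_4 = 1, \qquad B_1 + B_2 + B_3 + B_4 = 1 .

% Model with an intercept, omitting A_4 and B_4:
y = \beta_0 + \alpha_1 A_1 + \alpha_2 A_2 + \alpha_3 A_3
            + \gamma_1 B_1 + \gamma_2 B_2 + \gamma_3 B_3 + \varepsilon .

% Expected response by cell (i, j = 1, 2, 3):
E[y \mid A_4, B_4] = \beta_0, \qquad
E[y \mid A_i, B_4] = \beta_0 + \alpha_i, \qquad
E[y \mid A_4, B_j] = \beta_0 + \gamma_j, \qquad
E[y \mid A_i, B_j] = \beta_0 + \alpha_i + \gamma_j .

% Hence each coefficient compares a level only with the omitted level
% of its own set, holding the other set fixed:
\alpha_i = E[y \mid A_i, B] - E[y \mid A_4, B], \qquad
\gamma_j = E[y \mid A, B_j] - E[y \mid A, B_4] .
```

Under these assumptions the intercept is the expected response for the (A4, B4) cell, and no coefficient compares an A level with a B level.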

 

I recommend that you use the ideas in the link above to let SAS generate the dummy variables for you. Or better yet, avoid  dummy variables and use the CLASS statement, which is easier to interpret.
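
As a minimal sketch of the CLASS approach for the city and season example in the question: the data set name Have and the variables y, City, and Season are hypothetical, and the REF= option assumes a SAS/STAT release in which the CLASS statement of PROC GLM accepts it.

```sas
/* Sketch only: Have, y, City, and Season are hypothetical names */
proc glm data=Have;
   /* REF= sets the reference level separately for each CLASS variable;
      assumes a SAS/STAT release where PROC GLM supports REF= */
   class City(ref='Los Angeles') Season(ref='Spring');
   model y = City Season / solution;   /* estimates are reported per variable */
   ods select ParameterEstimates;
quit;
```

In the ParameterEstimates table, each City estimate would be read as a difference from Los Angeles and each Season estimate as a difference from Spring, so a city is never compared with a season.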



All Replies
Super Contributor
Posts: 339

Re: regression with 2 dummy types

What you should do depends on what you would like to do. You only have to make sure that the model has no linear dependence: if you keep the intercept, leave out one dummy variable from each set. One approach is to keep the intercept and treat A1 as the base level. Then you would have: Intercept=1; leave out A1=spring(?); A2=1 if summer, 0 otherwise; A3=1 if autumn, 0 otherwise; A4=1 if winter, 0 otherwise; B1=1 if NY, 0 otherwise; B2=1 if Chicago, 0 otherwise; ..

Your model for proc reg (y=dep. var) would be: y = A2--B4 .. ;
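
A rough sketch of this manual-dummy approach, using the seasons and cities from the question: the data set Have and the variables y, Season, and City are hypothetical names, and Spring and Los Angeles are assumed to be the omitted levels.

```sas
/* Sketch only: Have, y, Season, and City are hypothetical names */
data Dummies;
   set Have;
   /* a comparison in SAS evaluates to 1 (true) or 0 (false) */
   A2 = (Season = 'Summer');
   A3 = (Season = 'Autumn');
   A4 = (Season = 'Winter');   /* Spring is the omitted season */
   B1 = (City = 'New York');
   B2 = (City = 'Chicago');    /* Los Angeles is the omitted city */
run;

proc reg data=Dummies;
   model y = A2 A3 A4 B1 B2;   /* intercept kept; one dummy dropped per set */
run;
quit;
```

Character comparisons are case-sensitive, so the quoted values would have to match the data exactly.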

Super Contributor
Posts: 425

Re: regression with 2 dummy types

Hi Rick,

 

Glad to know that you have another blog! (I subscribed to it as well).

 

Just a small question,

 

In the example that you use, suppose that I also want to include the continuous variables height and weight (in addition to the other 2 categorical dummy types). In that case, would I just add these variables to the model in the following way:

 

/* same analysis by using the CLASS statement */
proc glm data=Patients;
   class sex BP_Status;              /* generates dummy variables internally */
   model Cholesterol = Sex BP_Status HEIGHT WEIGHT / solution;
   ods select ParameterEstimates;
quit;

Thanks!

SAS Super FREQ
Posts: 3,547

Re: regression with 2 dummy types

That is correct. Just add the continuous variables to the MODEL statement.

 

I am confused by your statement about "another blog." I only write one blog, and it is located at http://blogs.sas.com/content/iml

 

Super Contributor
Posts: 425

Re: regression with 2 dummy types

Silly me, I didn't realize it was the old DO LOOP blog but with a new appearance!

 

And thanks for the answer!
