iressa1313
Calcite | Level 5

I have a categorical independent variable (phenotype3) with 3 levels and a continuous dependent variable. I created dummy variables so that I can compare level 0 (my reference) to levels 1 and 2 combined, as well as to each of them individually. When I use PROC GLM to compare levels 1 and 2 combined to level 0, I get a significant result. But every time I run a post hoc analysis, the result comes back not significant for each of the 3 pairwise comparisons, which doesn't make sense to me. Also, when I use the original variable instead of the dummy variables to look at all 3 levels at once, it comes back as not significant.

I have included my code for the dummy coding, the models, and my output in case that helps.

 

**Dummy coding**:

 

data Dummy;
   set uniquelacfinal;

   * parsing phenotype into MDD vs HC (levels 1 and 2 combined vs level 0);
   if phenotype in (1, 2) then MDDtotal = 1;
   else if phenotype = 0 then MDDtotal = 0;

   * parsing into MDD only and TRD only, effect coded with level 0 as -1;
   if phenotype = 2 then TRD = 1;
   else if phenotype = 1 then TRD = 0;
   else if phenotype = 0 then TRD = -1;

   if phenotype = 1 then MDD = 1;
   else if phenotype = 2 then MDD = 0;
   else if phenotype = 0 then MDD = -1;
run;

**Comparing levels 1 and 2 combined to level 0**;

 

proc glm data= dummy;
model loglac= mddtotal age sex;
run;

RESULTS: you can see MDDtotal is significant

**Trying to look at level 1 and 2 vs level 0 separately**;

 

Proc glm data = dummy;
Model loglac = MDD TRD age sex ;
means mdd trd /hovtest;
means mdd trd / lsd waller tukey regwq;
run;

RESULTS: not significant, but I suspect my code is wrong, and I don't get an estimate for TRD.

And when I use the original data set and the original phenotype3 variable, the result is also not significant.

ods graphics on;
proc glm data=uniquelacfinal plot=diagnostics;
   class phenotype3;
   model loglac = phenotype3 age sex;
   means phenotype3 / hovtest welch;
run;
ods graphics off;

Is my code wrong or is there more likely a problem with my dataset? Thanks for any help!

5 REPLIES
PaigeMiller
Diamond | Level 26

You didn't show us the output, so it's hard to say what is going on, other than that these statistical tests don't necessarily have to agree.

 

I would not bother creating your own dummy variables. I would simply use the CLASS statement so that SAS creates its own dummy variables behind the scenes, and then use those. We know for certain that SAS will create proper dummy variables. I haven't gone through your code to check whether you did it properly (and I won't, because using SAS's own dummy variables is the right way to go). Then the MEANS or LSMEANS statement will give you comparisons of level 2 to level 0, or level 1 to level 0.
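
For readers following along, here is a minimal sketch of what the CLASS-based approach could look like. It is an illustration rather than code from this thread: it assumes phenotype3 is numeric with levels 0, 1 and 2 (so the levels sort in that order), and it keeps age and sex as continuous covariates exactly as in the original MODEL statement. The labels in quotes are arbitrary.

```sas
proc glm data=uniquelacfinal;
   class phenotype3;   /* assumed levels: 0, 1, 2 (sorted order) */
   model loglac = phenotype3 age sex / solution;

   /* Adjusted (LS) means, each level compared to level 0 */
   lsmeans phenotype3 / pdiff=control('0') adjust=dunnett cl;

   /* Single-df comparisons; coefficients follow the level order 0, 1, 2 */
   contrast 'Levels 1+2 vs 0' phenotype3 -2 1 1;
   estimate 'Levels 1+2 vs 0' phenotype3 -2 1 1 / divisor=2;
   estimate 'Level 1 vs 0'    phenotype3 -1 1 0;
   estimate 'Level 2 vs 0'    phenotype3 -1 0 1;
run;
quit;
```

The 'Levels 1+2 vs 0' contrast compares the unweighted average of the level 1 and level 2 adjusted means with level 0, which is close to, but not identical to, pooling the two groups into one. A single-df contrast like this can be significant while the 2-df overall test for phenotype3 is not, because the overall test spreads the same signal over two degrees of freedom. Note also that MEANS and LSMEANS only accept effects built from CLASS variables, so hand-coded numeric dummies cannot be used there.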

--
Paige Miller
iressa1313
Calcite | Level 5
Hello, and thank you for the prompt reply. My output must have been removed when posting. At first I did let SAS create its own dummy variables and used the MEANS statement to compare; I included that code at the bottom of my post. The results from that analysis show no significance. However, when I manually create a new dataset in which phenotype3 is a binary variable, with levels 1 and 2 combined into one group, and then compare that group to group 0, the results are significant. I hope that comes across more clearly than my original post. Could it have something to do with a decreased sample size?
PaigeMiller
Diamond | Level 26

I'll reserve comment until I can see the analysis output using SAS's dummy variables created by the CLASS statement. (So I need to see the code and the output)

--
Paige Miller
Reeza
Super User

The PROC GLM step uses the variable phenotype3, but the first set of code uses phenotype to create the dummy variables.

 

data Dummy;
   set uniquelacfinal;

   * parsing phenotype into MDD vs HC (levels 1 and 2 combined vs level 0);
   if phenotype in (1, 2) then MDDtotal = 1;
   else if phenotype = 0 then MDDtotal = 0;

   * parsing into MDD only and TRD only, effect coded with level 0 as -1;
   if phenotype = 2 then TRD = 1;
   else if phenotype = 1 then TRD = 0;
   else if phenotype = 0 then TRD = -1;

   if phenotype = 1 then MDD = 1;
   else if phenotype = 2 then MDD = 0;
   else if phenotype = 0 then MDD = -1;
run;
ods graphics on;
proc glm data=uniquelacfinal plot=diagnostics;
   class phenotype3;
   model loglac = phenotype3 age sex;
   means phenotype3 / hovtest welch;
run;
ods graphics off;

When you use PROC GLM it outputs the design matrix. Did you compare this to how you dummy coded the variables? 
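
One way to actually look at the columns SAS builds for a CLASS variable is PROC GLMMOD, which constructs the design matrix for a GLM-style model and can write it to a data set. The sketch below is an illustration, not code from this thread; the output data set names design and parms are placeholders chosen here.

```sas
proc glmmod data=uniquelacfinal outdesign=design outparm=parms;
   class phenotype3;
   model loglac = phenotype3 age sex;
run;

/* parms maps each design column (Col1, Col2, ...) to an effect and level */
proc print data=parms;
run;

/* First few rows of the design matrix itself */
proc print data=design(obs=10);
run;
```

With the CLASS statement, phenotype3 gets one 0/1 indicator column per level, so the columns will not look like hand-made 1/0/-1 codes even when both describe the same model.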

PaigeMiller
Diamond | Level 26

@Reeza wrote:

When you use PROC GLM it outputs the design matrix. Did you compare this to how you dummy coded the variables? 


There are dozens of equivalent ways to code dummy variables, so comparison might not help.
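
As a rough illustration of that equivalence, the sketch below fits the same model under reference (0/1) coding and under effect (1/0/-1) coding. It is an assumption-laden example, not code from this thread: it assumes phenotype3 is numeric with values 0, 1 and 2, and the names coded, ref1, ref2, eff1 and eff2 are made up here.

```sas
data coded;
   set uniquelacfinal;

   /* Reference coding: level 0 is the baseline */
   ref1 = (phenotype3 = 1);
   ref2 = (phenotype3 = 2);

   /* Effect coding: level 0 is coded -1 in both columns */
   eff1 = ref1 - (phenotype3 = 0);
   eff2 = ref2 - (phenotype3 = 0);

   /* Keep missing phenotypes missing instead of coding them as 0 */
   if missing(phenotype3) then call missing(ref1, ref2, eff1, eff2);
run;

proc glm data=coded;
   model loglac = ref1 ref2 age sex / solution;
   contrast 'phenotype (2 df)' ref1 1, ref2 1;   /* joint test of both columns */
run;
quit;

proc glm data=coded;
   model loglac = eff1 eff2 age sex / solution;
   contrast 'phenotype (2 df)' eff1 1, eff2 1;   /* joint test of both columns */
run;
quit;
```

Both fits should give the same joint 2-df F test and the same predicted values, because the two sets of columns span the same space. Only the meaning of the individual coefficients changes: under reference coding each one is a difference from level 0, while under effect coding each one is a deviation from the unweighted average of the three adjusted level means.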

--
Paige Miller
