UGAstudent
Calcite | Level 5
Data UGAStudent_DATA;
Set date_difference; 

if cd4 <= '200' then cd4_levels='1';
if cd4 >='201' <='500' then cd4_levels='2';
if cd4 > '500' then cd4_levels='3'; 
run;

proc genmod data= ugastudent_data ;
class cd4_levels ;
model Factor3 = cd4_levels n_Duration_on_ART_months ; 
run;

I'm trying to run PROC GENMOD, but in the output level 3 has 0s across the board while levels 1 and 2 have values. How do I get my level 3 data to show up? It's as if the data is missing or something.

Screen Shot 2018-02-05 at 9.13.14 PM.png

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

When you include a CLASS statement, it automatically creates the dummy variables, but how they are created varies by procedure. In PROC LOGISTIC you have more control than in PROC GLM, so you may be better off creating them manually. Check the docs on the CLASS statement for the options that set the dummy-coding method.
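
As a rough sketch of the manual route, assuming cd4 is numeric and using the dataset and variable names from the question: the output dataset name ugastudent_dummies and the indicator names cd4_low and cd4_mid are made up for illustration, and the highest group (cd4 > 500) is taken as the reference.

```sas
/* Sketch only: build two 0/1 indicators by hand; cd4 > 500 is the reference group. */
/* ugastudent_dummies, cd4_low and cd4_mid are hypothetical names.                   */
data ugastudent_dummies;
    set date_difference;
    if missing(cd4) then do;
        cd4_low = .;                     /* keep missing counts out of every group */
        cd4_mid = .;
    end;
    else do;
        cd4_low = (cd4 <= 200);          /* 1 if CD4 is 200 or below, else 0 */
        cd4_mid = (200 < cd4 <= 500);    /* 1 if CD4 is above 200 and at most 500 */
    end;
run;

/* No CLASS statement: the indicators enter as ordinary numeric covariates. */
proc genmod data=ugastudent_dummies;
    model Factor3 = cd4_low cd4_mid n_Duration_on_ART_months;
run;
```

Each indicator's estimate is then the difference between that group and the cd4 > 500 group, which has no estimate of its own.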

9 REPLIES 9
Reeza
Super User

Those are the correct results for a categorical variable: only N-1 levels will have estimates. If you have 3 levels, you only need two dummy variables, because an observation that is in neither of the first two levels must be in the third. The remaining level is the reference level, and its estimate is set to 0. Review dummy variables and degrees of freedom in your statistics textbook.

In matrix language, the three dummy columns and the intercept would be linear combinations of each other.

 

You can use different dummy coding, but it depends on what you're trying to test. 
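
To choose which level serves as the baseline, the CLASS statement accepts a REF= option. A minimal sketch, assuming the dataset from the question and that level '1' (CD4 of 200 or below) is the comparison group you want; the TYPE3 option is added here only to get an overall test of the cd4_levels effect.

```sas
/* Sketch: make level '1' the reference instead of the last level. */
proc genmod data=ugastudent_data;
    class cd4_levels (ref='1') / param=ref;
    model Factor3 = cd4_levels n_Duration_on_ART_months / type3;
run;
```

The estimates for levels 2 and 3 are then differences from level 1. Whichever level is the reference still gets no estimate of its own.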

 



Astounding
PROC Star

Another issue to consider ... is CD4 numeric or character?

 

If CD4 is numeric, there should not be quotes around '200' and '201' and '500'.  

 

If CD4 is character, the comparisons you are asking for may be different from what you expect. As character strings:

'51' and '60' are both greater than '500'
'1000' is less than '200'

 

I can't tell what is in your data, but I can tell that these comparisons are suspect.
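
A small DATA step can confirm this behaviour. This is only an illustrative sketch; the variable names a through d are arbitrary.

```sas
/* Sketch: character comparisons go left to right, one character at a time. */
data _null_;
    a = ('51'   > '500');   /* '5' = '5', then '1' > '0' */
    b = ('60'   > '500');   /* '6' > '5' */
    c = ('1000' < '200');   /* '1' < '2' */
    d = (1000   < 200);     /* numeric comparison, for contrast */
    put a= b= c= d=;
run;
```

The log should show a=1 b=1 c=1 d=0: the three character comparisons are true, while the numeric one is false.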

UGAstudent
Calcite | Level 5
Hi,
CD4 is numeric (as in HIV CD4 count). I removed the quotation marks around 200, 201, and 500, and everything seems to work all right. I know how to create a dummy variable for a logistic regression, but I'm not sure if it's the same process.

Thanks again!

Reeza
Super User
If your question is answered please mark it as solved.
UGAstudent
Calcite | Level 5

 

 

Hi, I've looked through the online documentation for the CLASS and PROC statements and tried to look up some things about dummy variables, but I couldn't reach a conclusion. I'm not sure what the code should look like so that information shows up for the 3rd level I created. What am I doing wrong?

 

Data UGAStudent_DATA;
Set date_difference; 

if cd4 <= 200 then cd4_levels='1';
if cd4 >=201 <=500 then cd4_levels='2';
if cd4 > 500 then cd4_levels='3'; 
run;

proc genmod data= ugastudent_data;
class cd4_levels /param=glm ;
model  Factor1 =  cd4_levels n_Duration_on_ART_months ;  
run;
Ksharp
Super User
class cd4_levels /param=glm ;
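
For the grouping step itself, here is a sketch that assumes cd4 is numeric. Two things in the posted DATA step are worth tightening. First, `cd4 >= 201 <= 500` does not test a range on cd4; as far as I know, SAS reads it as `cd4 >= 201 and 201 <= 500`. Second, a missing cd4 satisfies `cd4 <= 200`, because SAS orders missing values below every number. The PROC FREQ check is an addition for illustration, not part of the original code.

```sas
/* Sketch: mutually exclusive groups, with missing CD4 kept out of level 1. */
data ugastudent_data;
    set date_difference;
    length cd4_levels $1;
    if missing(cd4) then cd4_levels = ' ';
    else if cd4 <= 200 then cd4_levels = '1';
    else if cd4 <= 500 then cd4_levels = '2';     /* 200 < cd4 <= 500 */
    else cd4_levels = '3';                        /* cd4 > 500 */
run;

/* Confirm that all three levels actually contain observations. */
proc freq data=ugastudent_data;
    tables cd4_levels / missing;
run;
```

If level 3 has observations in the frequency table, the zeros in the GENMOD output are the reference-level row under param=glm, not missing data.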