Data UGAStudent_DATA;
Set date_difference;
if cd4 <= '200' then cd4_levels='1';
if cd4 >='201' <='500' then cd4_levels='2';
if cd4 > '500' then cd4_levels='3';
run;
proc genmod data= ugastudent_data ;
class cd4_levels ;
model Factor3 = cd4_levels n_Duration_on_ART_months ;
run;
I'm trying to run PROC GENMOD, but when I look at level 3, it has 0s across the board, while levels 1 and 2 have values. How do I get my level 3 data to show up? It's like the data is missing or something.
Those are the correct results for a categorical variable: only N-1 levels will have estimates. If you have 3 levels, you only need two dummy variables to identify the third, so you only get estimates for two of them. The remaining level is the reference level, and it is reported with 0s. Review dummy variables and degrees of freedom in your statistics textbook.
In matrix language, dummy columns for all three levels would be linear combinations of each other (together with the intercept), so one of them has to be dropped.
You can use different dummy coding, but it depends on what you're trying to test.
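As a rough sketch of what a different dummy coding could look like here: the CLASS statement in PROC GENMOD accepts the REF= and PARAM= options. The data set and variable names come from the question; picking level '1' as the reference is only an assumption for illustration.

```sas
/* Sketch: reference-cell coding with level '1' as the baseline.    */
/* The choice of '1' is an assumption; use whichever level you want */
/* the other levels compared against.                               */
proc genmod data=ugastudent_data;
  class cd4_levels (ref='1') / param=ref;
  model Factor3 = cd4_levels n_Duration_on_ART_months;
run;
```

With this coding, the estimates for levels 2 and 3 should be read as differences from level 1.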
Another issue to consider ... is CD4 numeric or character?
If CD4 is numeric, there should not be quotes around '200' and '201' and '500'.
If CD4 is character, the comparisons you are asking for may be different than what you expect. As character strings:
'51' and '60' are both greater than '500'
'1000' is less than '200'
I can't tell what is in your data, but I can tell that these comparisons are suspect.
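A small sketch to see the difference for yourself: the same comparisons evaluated once as character strings and once as numbers. The variable names c1, c2, n1 and n2 are made up for this example.

```sas
/* Sketch: character vs. numeric comparison of the same values. */
data _null_;
  c1 = ('51' > '500');     /* character strings are compared left to right */
  c2 = ('1000' < '200');
  n1 = (51 > 500);         /* numeric comparison */
  n2 = (1000 < 200);
  put c1= c2= n1= n2=;     /* writes the four 0/1 results to the log */
run;
```

The log should show c1=1 c2=1 n1=0 n2=0: both character comparisons are true, both numeric ones are false.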
Hi Reeza,
CD4 is numeric (as in HIV CD4 count). I removed the quotation marks around 200, 201, and 500, and everything seems to work all right. I know how to create a dummy variable for a logistic regression, but I'm not sure if it's the same process for this procedure.
Thanks again!
When you include a CLASS statement, SAS creates the dummy variables automatically, but how they are created varies by procedure. In PROC LOGISTIC you have more control than in PROC GLM, so you may be better off creating them manually. Check the documentation on the CLASS statement for the options that set the method for creating dummy variables.
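If you do want to create them manually, a minimal sketch might look like the following. The names ugastudent_dummies, cd4_lvl1 and cd4_lvl2 are hypothetical, and leaving out level '3' as the reference is an assumption.

```sas
/* Sketch: hand-built indicator variables; level '3' is the omitted reference. */
data ugastudent_dummies;              /* hypothetical output data set name */
  set ugastudent_data;
  /* Leave both indicators missing when the CD4 level itself is missing. */
  if not missing(cd4_levels) then do;
    cd4_lvl1 = (cd4_levels = '1');    /* 1 if level 1, otherwise 0 */
    cd4_lvl2 = (cd4_levels = '2');    /* 1 if level 2, otherwise 0 */
  end;
run;

/* No CLASS statement is needed because the indicators are already numeric. */
proc genmod data=ugastudent_dummies;
  model Factor3 = cd4_lvl1 cd4_lvl2 n_Duration_on_ART_months;
run;
```

The two coefficients are then differences from level 3, which is the same information the CLASS statement gives you with its default coding.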
Hi, I've looked through the online documentation for the CLASS and PROC statements and tried to look up some things about dummy variables, but I wasn't able to reach a conclusion. I'm not sure what the code should look like so that information starts showing up for the 3rd level I created. What am I doing wrong?
Data UGAStudent_DATA;
Set date_difference;
if cd4 <= 200 then cd4_levels='1';
if cd4 >=201 <=500 then cd4_levels='2';
if cd4 > 500 then cd4_levels='3';
run;
proc genmod data= ugastudent_data;
class cd4_levels /param=glm ;
model Factor1 = cd4_levels n_Duration_on_ART_months ;
run;
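One possible cleanup, offered as a sketch rather than the definitive answer: IF/ELSE IF keeps the groups mutually exclusive, and an explicit check keeps missing CD4 counts out of level 1 (SAS treats a missing numeric value as smaller than any number, so cd4 <= 200 is true for missing). This assumes CD4 counts are whole numbers, so "201 to 500" and "above 200, up to 500" mean the same thing. The LSMEANS statement is an addition that is not in the original code.

```sas
/* Sketch: mutually exclusive CD4 groups; missing counts stay missing. */
data ugastudent_data;
  set date_difference;
  length cd4_levels $1;
  if missing(cd4) then cd4_levels = ' ';
  else if cd4 <= 200 then cd4_levels = '1';
  else if cd4 <= 500 then cd4_levels = '2';
  else cd4_levels = '3';
run;

proc genmod data=ugastudent_data;
  class cd4_levels / param=glm;
  model Factor1 = cd4_levels n_Duration_on_ART_months;
  /* Adjusted means for all three levels and their pairwise differences. */
  lsmeans cd4_levels / diff;
run;
```

The parameter table will still show 0s for level 3 because it is the reference level, but the LSMEANS output reports a value for every level.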