Data Missing on output.

Contributor
Posts: 20

Data UGAStudent_DATA;
Set date_difference; 

if cd4 <= '200' then cd4_levels='1';
if cd4 >='201' <='500' then cd4_levels='2';
if cd4 > '500' then cd4_levels='3'; 
run;

proc genmod data= ugastudent_data ;
class cd4_levels ;
model Factor3 = cd4_levels n_Duration_on_ART_months ; 
run;

I'm trying to run PROC GENMOD, but when I look at level 3 it has 0s across the board, while levels 1 and 2 have values. How do I get my level 3 data to show up? It's like the data is missing or something.

Screen Shot 2018-02-05 at 9.13.14 PM.png



All Replies
Super User
Posts: 23,733

Re: Data Missing on output.

Posted in reply to UGAstudent

Those are the correct results for a categorical variable: only N-1 levels will have estimates. If you have 3 levels, you only need two dummy variables, because an observation that is in neither of the first two levels must be in the third. The level showing zeros is the reference level, and the estimates for the other levels are differences from it. Review dummy variables and degrees of freedom in your statistics textbook.

In matrix language, the three dummy columns together with the intercept would be linearly dependent.

You can use different dummy coding, but it depends on what you're trying to test.
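
To make the reference-level idea concrete, here is a minimal sketch that uses the data set and variable names from the question. It assumes the CLASS statement in PROC GENMOD accepts the REF= and PARAM=REF options (worth confirming in the documentation for your SAS/STAT release). Choosing level '1' as the reference is only an illustration, not a recommendation.

```sas
/* Sketch: pick the reference level explicitly instead of taking the default. */
/* With the default GLM coding, the last level ('3') is the reference, which  */
/* is why its row shows zeros. Here level '1' becomes the reference instead.  */
proc genmod data=ugastudent_data;
  class cd4_levels (ref='1') / param=ref;
  model Factor3 = cd4_levels n_Duration_on_ART_months;
run;
```

With reference coding, the estimates for levels 2 and 3 should be read as differences from level 1. No data has gone missing in either case; one level simply serves as the baseline.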

 




 

Super User
Posts: 6,778

Re: Data Missing on output.

Posted in reply to UGAstudent

Another issue to consider: is CD4 numeric or character?

If CD4 is numeric, there should not be quotes around '200', '201', and '500'.

If CD4 is character, the comparisons you are asking for may be different from what you expect. As character strings:

'51' and '60' are both greater than '500'
'1000' is less than '200'

I can't tell what is in your data, but I can tell that these comparisons are suspect.
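
A small DATA step can demonstrate the character comparisons described above. This is only an illustrative sketch: the variable names `a`, `gt500`, and `lt200` are made up for the demonstration and are not part of the original data.

```sas
/* Sketch: compare digit strings as character values, not as numbers. */
data _null_;
  length a $4;
  do a = '51', '60', '1000';
    gt500 = (a > '500');   /* 1 when the string sorts after '500'  */
    lt200 = (a < '200');   /* 1 when the string sorts before '200' */
    put a= gt500= lt200=;  /* write the results to the SAS log     */
  end;
run;
```

Character values are compared left to right, one character at a time, so '51' sorts after '500' (because '1' follows '0' in the second position) and '1000' sorts before '200' (because '1' precedes '2' in the first position).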

Contributor
Posts: 20

Re: Data Missing on output.

Posted in reply to Astounding
Hi,
CD4 is numeric (as in HIV CD4 count). I removed the quotation marks around 200, 201, and 500, and everything seems to work all right. I know how to create a dummy variable for a logistic regression, but I'm not sure if it's the same process.


Thanks again !

Super User
Posts: 23,733

Re: Data Missing on output.

Posted in reply to UGAstudent
If your question is answered, please mark it as solved.
Contributor
Posts: 20

Re: Data Missing on output.

 

Hi Reeza,

 

CD4 is numeric (as in HIV CD4 count). I removed the quotation marks around 200, 201, and 500, and everything seems to work all right. I know how to create a dummy variable for a logistic regression, but I'm not sure if it's the same process for this procedure.


Thanks again !

Solution
‎03-08-2018 01:49 PM
Super User
Posts: 23,733

Re: Data Missing on output.

Posted in reply to UGAstudent

When you include a CLASS statement, SAS automatically creates the dummy variables, but how they are created varies by procedure. In PROC LOGISTIC you have more control than in PROC GLM, so you may be better off creating them manually. Check the documentation on the CLASS statement for the options that set the method for creating the dummy variables.
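
For the "create them manually" route, a minimal sketch might look like the following. The data set name `ugastudent_dummies` and the indicator names `cd4_lvl2` and `cd4_lvl3` are hypothetical, introduced here only for illustration. Level 1 is assumed to be the baseline.

```sas
/* Sketch: build 0/1 indicators by hand, leaving level 1 as the baseline. */
data ugastudent_dummies;
  set ugastudent_data;
  if not missing(cd4_levels) then do;
    cd4_lvl2 = (cd4_levels = '2');   /* 1 if level 2, otherwise 0 */
    cd4_lvl3 = (cd4_levels = '3');   /* 1 if level 3, otherwise 0 */
  end;
run;

/* No CLASS statement is needed because the indicators are already numeric. */
proc genmod data=ugastudent_dummies;
  model Factor3 = cd4_lvl2 cd4_lvl3 n_Duration_on_ART_months;
run;
```

With this coding, the coefficient on `cd4_lvl3` is the difference between level 3 and level 1, so level 3 gets a visible estimate and level 1 becomes the implicit reference.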

Contributor
Posts: 20

Re: Data Missing on output.

 

 

Hi, I've looked through the online documentation for the CLASS and PROC statements and tried to look up some things about dummy variables, but I wasn't able to reach a conclusion. I'm not sure what the code should look like so that information starts showing up for the third level. What am I doing wrong?

 

Data UGAStudent_DATA;
Set date_difference; 

if cd4 <= 200 then cd4_levels='1';
if cd4 >=201 <=500 then cd4_levels='2';
if cd4 > 500 then cd4_levels='3'; 
run;

proc genmod data= ugastudent_data;
class cd4_levels /param=glm ;
model  Factor1 =  cd4_levels n_Duration_on_ART_months ;  
run;
Super User
Posts: 10,784

Re: Data Missing on output.

Posted in reply to UGAstudent
class cd4_levels /param=glm ;
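
Separately from the CLASS options, the DATA step that builds `cd4_levels` may deserve a second look. As far as I know, SAS reads a chained comparison such as `cd4 >= 201 <= 500` as `cd4 >= 201 and 201 <= 500`, so the upper bound is not actually applied to `cd4`; the result only comes out right because the third IF overwrites it. A missing `cd4` also satisfies `cd4 <= 200`, since missing numeric values sort below every number. A hedged sketch of a tighter version, assuming CD4 is numeric as stated in the thread:

```sas
/* Sketch: mutually exclusive bins with explicit handling of missing CD4. */
data ugastudent_data;
  set date_difference;
  length cd4_levels $1;
  if missing(cd4) then cd4_levels = ' ';      /* keep missing counts out of level 1 */
  else if cd4 <= 200 then cd4_levels = '1';
  else if cd4 <= 500 then cd4_levels = '2';   /* reached only when cd4 > 200 */
  else cd4_levels = '3';
run;
```

The IF/ELSE chain means each observation is assigned exactly once, and observations with a missing level are dropped from the model by default rather than being counted in level 1.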
 
☑ This topic is solved.
