- Logistic regression with different independent var...


11-27-2012 03:07 AM

Dear all,

I need to run a logistic regression. My outcome is dichotomous. The independent variables are of different types, as shown in the table below. I am unsure which SAS code I should use.

I hope you can help.

Sincerely

Anders

| ID | outcome | Var1 (categorical) | Var2 (dichotomous) | Var3 (continuous) |
|----|---------|--------------------|--------------------|-------------------|
| 11 | 0 | 0 | a | 45 |
| 12 | 1 | 3 | b | 65 |
| 13 | 1 | 2 | b | 34 |
| 14 | 1 | 0 | a | 56 |
| 15 | 0 | 1 | b | 32 |
| 16 | 0 | 3 | b | 56 |
| 17 | 1 | 2 | b | 67 |
| 18 | 1 | 1 | a | 54 |
| 19 | 0 | 1 | a | 67 |
| 20 | 1 | 0 | b | 54 |


11-27-2012 07:35 AM

This should get you started, but beware: there are a lot of pitfalls in this area, the primary one being whether you have enough outcomes of interest for the number of variables included in the model. I am choosing PROC GENMOD because of the categorical variables Var1 and Var2, and because specifying their effects is easier in PROC GENMOD than in PROC LOGISTIC.

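proc genmod data=yourdata;
  class var1 var2;
  /* event='1' models the probability that outcome=1;          */
  /* GENMOD prints the parameter estimates by default.         */
  model outcome(event='1') = var1 var2 var3 / dist=bin link=logit;
  /* ilink reports the least-squares means as probabilities.   */
  lsmeans var1 var2 / ilink;
run;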

Other things to consider--you have a continuous variable. The analysis I have given here is referred to in the literature as analysis of covariance, and this particular model assumes that the "slope" due to var3 is constant across all levels of var1 and var2. Without knowing how much data is available, I don't know whether you can efficiently investigate whether or not there is evidence for this assumption. Anyway, this should get you started. Stop back in when you have tried it, and see if it is giving you answers that are interpretable. Be sure to read the documentation, not only for PROC GENMOD, but for PROC GLIMMIX and PROC LOGISTIC to get an understanding of exactly what SAS is doing in each of these.
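One way the equal-slopes assumption could be examined is to add interactions between var3 and the two class variables and look at their Type 3 tests. This is only a sketch under the assumption that the data set and variable names match the example above; with as few observations as in the posted table, the interaction terms are unlikely to be estimable in any useful way.

```sas
/* Sketch: test whether the var3 slope differs across levels of var1 and var2. */
/* yourdata, outcome, var1-var3 are the names used in the example above.       */
proc genmod data=yourdata;
  class var1 var2;
  model outcome(event='1') = var1 var2 var3 var1*var3 var2*var3
        / dist=bin link=logit type3;  /* type3 requests likelihood-ratio tests per effect */
run;
```

Small p-values for var1*var3 or var2*var3 would suggest that a single common slope for var3 is not adequate.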

Steve Denham


12-19-2012 03:39 AM

Dear Steve Denham

Thank you for your helpful reply. I am sorry for getting back to you this late. I ended up using PROC LOGISTIC and worked it over with a member of the statistical department. But thank you anyway.

Cheers

Anders Lødrup


12-31-2012 02:23 AM

Dear Steve Denham,

I ran into the same problem. After I posted "Proc Logistic result not include ordinal variables", I found your reply, which was very useful.

So I treated those class variables as continuous, using the code below:

proc logistic data=slide.sb_vm_training outmodel=slide.model;
  model dv = N2 N3 N4 N5 N6 N7 N10 N11 N12 N13
             Prin1 Prin2 Prin3
             factor1 factor2 factor3 factor4 factor5 factor6 factor7 factor8
             / selection=stepwise;
run;

Then the N variables did enter the model.

Can you give a brief explanation of why PROC LOGISTIC has shortcomings when nominal and continuous variables are combined?

Or can you suggest some papers to read?

Many thanks.

Dawn


12-31-2012 07:59 AM

Point one: Nominal and continuous combined often leads to quasi-separation. For papers on this problem, read the documentation for PROC LOGISTIC, and follow the references given there.

Point two: Search this site and the SAS-L listserv for comments regarding stepwise selection of variables. In particular, find the paper by Flom and Cassell at http://www.nesug.org/proceedings/nesug07/sa/sa07.pdf. Stepwise selection has a variety of problems, not the least of which is that the p-values associated with the parameters are wrong, because the distributional assumptions are not met. The p-values are also biased toward zero.
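For point one, here is a minimal sketch of how the model from the first post could be fit in PROC LOGISTIC with the categorical predictors declared on a CLASS statement rather than treated as continuous. The data set and variable names (yourdata, outcome, var1 to var3) come from the first post. The reference levels and the FIRTH option (penalized likelihood, one commonly used remedy when separation is suspected) are assumptions for illustration, not something recommended in this thread.

```sas
/* Sketch: nominal and continuous predictors together in PROC LOGISTIC.    */
/* Reference levels '0' and 'a' are illustrative choices, not requirements. */
proc logistic data=yourdata;
  class var1 (ref='0') var2 (ref='a') / param=ref;
  /* firth: penalized likelihood, which keeps estimates finite under       */
  /* quasi-complete separation; clparm=pl: profile-likelihood limits.      */
  model outcome(event='1') = var1 var2 var3 / firth clparm=pl;
run;
```

If the ordinary fit (without FIRTH) produces a log note about complete or quasi-complete separation, comparing it with the penalized fit shows which estimates were affected.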

Steve Denham


01-04-2013 03:02 AM

Dear Steve Denham,

Thank you for your wonderful explanation.

Can I ask one more question?

Code:

proc princomp data=slide.sb_vm10 cov outstat=temp_prin1;
  var c1-c45;
run;

For example, the variables in group A have a large range, within (-1M, 1M), while the variables in group B have a small range, within (-1, 1).

It seems that the eigenvector coefficients (for Prin1, for example) are zero for the group B variables.

Do you know how to deal with this?

Thanks in advance.

Dawn


01-04-2013 07:12 AM

Rescale! Remember that principal components and the resulting eigenvalues are based on the amount of variability explained. If all of the variability is in group A, then the component will only have a loading on A, as B contributes almost nothing to the total variability.
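Two ways the rescaling could be done are sketched below, assuming the same input data set and variables as in the question. The output names temp_prin_corr, temp_prin_std and work.sb_vm10_std are hypothetical. Dropping the COV option makes PROC PRINCOMP analyze the correlation matrix, which amounts to giving every variable unit variance; alternatively, the variables can be standardized explicitly first.

```sas
/* Option 1: omit COV so the components come from the correlation matrix. */
proc princomp data=slide.sb_vm10 outstat=temp_prin_corr;
  var c1-c45;
run;

/* Option 2: standardize to mean 0 and standard deviation 1, then keep COV. */
proc stdize data=slide.sb_vm10 out=work.sb_vm10_std method=std;
  var c1-c45;
run;

proc princomp data=work.sb_vm10_std cov outstat=temp_prin_std;
  var c1-c45;
run;
```

Either way, the group B variables are no longer swamped by the much larger variances in group A.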

Steve Denham


01-04-2013 08:02 PM

Dear Steve,

Got it. Thank you very much for the very helpful response!

Dawn