Logistic regression or GLM

11-17-2012 11:50 PM

I am new to logistic and GLM procedures, and therefore I have some syntactical and conceptual questions:

I have a dataset (attached to this post) with information about the salary and other important characteristics of all faculty (n=52) at a college. The variables are described as follows:

OBS: observation #

SX: sex (0=Male, 1=Female)

RK: rank (1=Assistant Professor, 2=Associate Professor, 3=Full Professor)

YR: # years in current rank

DG: highest degree (0=Masters, 1=Doctorate)

YD: # years since highest degree earned

SL: academic year salary ($)

I need to determine if gender is associated with rank, highest degree, number of years in current rank, number of years since highest degree earned, and academic year salary.

Since gender is a binary outcome, I have used logistic regression to address the question. However, all my predictors come out highly significant, which does not look correct. Am I approaching this question correctly, or is my syntax wrong? Should I be using GLM?

My code is as follows:

proc logistic data=discrimination;

freq yd;

freq yr;

class rk dg;

model sx(descending) =rk yr dg yd sl;

run;

Another question that I am addressing is:

2. Is there a significant relationship between rank and academic year salary?

I am using a simple regression model, with rank as X (categorical) and salary as Y (continuous). Am I doing this correctly?

Below is the code:

proc reg data=discrimination SIMPLE;

model SL = rk;

run;

Posted in reply to Biobee

11-18-2012 01:44 AM

A few initial remarks:

Your use of the freq statement is incorrect. You would only use it if a single observation in your data represented n identical instances, with a frequency of n.
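To illustrate when a freq statement does apply, here is a minimal sketch. The dataset `faculty_counts`, the variable `count`, and the counts themselves are made up for illustration and are not taken from the attached data. Each row stands for `count` identical faculty members, so `freq count` tells the procedure to weight the row accordingly. By the same logic, `freq yr` or `freq yd` on the original data would count each person as many times as their number of years, which inflates the apparent sample size.

```sas
/* Hypothetical aggregated data: one row per sx/dg combination.      */
/* The counts are invented for illustration only.                    */
data faculty_counts;
   input sx dg count;
   datalines;
0 0 10
0 1 20
1 0 8
1 1 12
;
run;

proc logistic data=faculty_counts;
   freq count;                  /* each row is counted COUNT times   */
   model sx(event='1') = dg;    /* models the probability that sx=1  */
run;
```

With one row per faculty member, as in the posted data, no freq statement is needed.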

Modelling sex as if it were a dependent attribute is a bit perverse. I would expect you to model rank or salary on some set of the other indicators. To use logistic on salary you would have to segment the data, perhaps as low, mid or high.
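As a rough sketch of that alternative, salary can be modelled on sex and the other characteristics with PROC GLM. This assumes the dataset name `discrimination` and the variable names from the original post; the choice of covariates is only one possibility. Unlike PROC REG, which has no CLASS statement and would treat RK as a numeric score, PROC GLM lets rank, sex, and degree enter as categorical effects.

```sas
proc glm data=discrimination;
   class sx rk dg;                             /* categorical predictors           */
   model sl = sx rk dg yr yd / solution ss3;   /* salary as the dependent variable */
   lsmeans sx / pdiff cl;                      /* adjusted mean salary by sex      */
run;
quit;
```

The Type III test for SX then asks whether salary differs by sex after adjusting for rank, degree, and the two years variables.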

I'll leave the finer points to others.

Richard in Oz

Posted in reply to RichardinOz

11-18-2012 12:01 PM

Hi Richard

Thanks for replying to my post and for your suggestion not to use freq. With regard to using gender as the outcome variable, I agree with your point of view, however I cannot change what the question requires. So will have to work with gender as outcome.
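For reference, a minimal sketch of the same logistic model with the two freq statements removed might look like this. It reuses the dataset and variable names from the original post; the reference levels and `param=ref` coding are assumptions chosen for readability, not something specified in the thread. `event='1'` models the probability of SX=1 (female), which is what `descending` achieves in the original code.

```sas
proc logistic data=discrimination;
   /* Reference coding: each level is compared with the stated baseline */
   class rk (ref='1') dg (ref='0') / param=ref;
   model sx(event='1') = rk yr dg yd sl;
run;
```

With only 52 observations and several predictors, the estimates from such a model may be unstable, so the output should be read with caution.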

Posted in reply to Biobee

11-18-2012 02:46 PM

You state your question as:

I need to determine if gender is associated with rank, highest degree, number of years in current rank, number of years since highest degree earned, and academic year salary.

What did your univariate comparisons show for each variable before you fit the multivariate model?
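A sketch of what such univariate comparisons could look like, assuming the dataset and variable names from the original post. The choice of tests is illustrative: cross-tabulations for the categorical variables, and a t-test with a rank-based alternative for the continuous ones.

```sas
/* Categorical characteristics: cross-tabulate sex against rank and degree */
proc freq data=discrimination;
   tables sx*(rk dg) / chisq fisher;
run;

/* Continuous characteristics: compare the two sex groups */
proc ttest data=discrimination;
   class sx;
   var yr yd sl;
run;

/* Rank-based alternative if the distributions look skewed */
proc npar1way data=discrimination wilcoxon;
   class sx;
   var yr yd sl;
run;
```

Each of these addresses association with gender without treating gender as the response of a regression model.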

Second, I'm with RichardinOz: your outcome shouldn't be gender. Gender is an independent variable; the outcome is something else.

Association doesn't have to mean that gender is the dependent variable.

Posted in reply to Biobee

11-18-2012 07:15 PM

You say "I cannot change what the question requires. So will have to work with gender as outcome."

I disagree. Having gender as the outcome implies you have a population which undergoes sex change as it progresses through rank, academic outcome and salary.

As an analyst you have a responsibility to challenge incorrect assumptions. Otherwise you are little better than a 'script kiddie' throwing code at data in the hope that something sticks. Who is asking the question? Go back to them and get them to restate the problem.

Richard in Oz