Biobee
Calcite | Level 5

I am new to logistic and GLM procedures, and therefore I have some syntactical and conceptual questions:


I have a dataset (attached to this post) that contains the salary and various other characteristics of all faculty (n=52) in a college. The variables are described as follows:

OBS: observation #

SX: sex (0=Male, 1=Female)

RK: rank (1=Assistant Professor, 2=Associate Professor, 3=Full Professor)

YR: # years in current rank

DG: highest degree (0=Masters, 1=Doctorate)

YD: # years since highest degree earned

SL: academic year salary ($)

I need to determine if gender is associated with rank, highest degree, number of years in current rank, number of years since highest degree earned, and academic year salary.

Since gender is a binary outcome, I have used logistic regression to address the question. However, I am getting a result where all my predictors appear highly significant, which does not look correct. Am I approaching this question correctly, or is my syntax wrong? Should I be using GLM?


My code is as follows:

proc logistic data=discrimination;

freq yd;

freq yr;

class rk dg;

model sx(descending) =rk yr dg yd sl;

run;

Another question that I am addressing is:

2. Is there a significant relationship between rank and academic year salary?

I am using a simple regression model. Here I have assigned rank as X (categorical) and salary as Y (continuous). Am I doing this correctly?

Below is the code:

proc reg data=discrimination SIMPLE;

model SL = rk;

run;
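One point worth noting about the step above: PROC REG has no CLASS statement, so RK enters the model as a single numeric slope, which assumes the salary gap between ranks 1 and 2 equals the gap between ranks 2 and 3. A minimal sketch of the usual alternative, treating RK as categorical in PROC GLM (a one-way ANOVA), assuming the same dataset and variable names as above:

```sas
/* Sketch only: treats rank as a 3-level categorical factor. */
proc glm data=discrimination;
  class rk;                      /* rank as a classification variable     */
  model sl = rk / solution;      /* overall F test for rank, plus estimates */
  means rk / tukey;              /* pairwise comparisons of mean salary   */
run;
quit;
```

The overall F test for RK addresses whether mean salary differs across ranks; the MEANS statement with the TUKEY option shows which pairs of ranks differ.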

RichardinOz
Quartz | Level 8

A few initial remarks:

Your use of the freq statement is incorrect. You would only use it if you had n identical instances represented by a single observation in your data, with a frequency of n.
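To illustrate when FREQ is appropriate, here is a minimal sketch using a made-up aggregated table in which each row stands for COUNT identical faculty members. The dataset name faculty_counts, the variable count, and the counts themselves are invented for illustration and do not come from the attached data.

```sas
/* Hypothetical aggregated data: one row per sex-by-rank cell. */
/* The counts below are invented purely for illustration.      */
data faculty_counts;
  input sx rk count;
  datalines;
0 1 10
0 2 10
0 3 20
1 1 10
1 2 5
1 3 5
;
run;

proc logistic data=faculty_counts;
  freq count;                    /* each row counts as COUNT observations */
  class rk / param=ref;
  model sx(event='1') = rk;
run;
```

By contrast, in a dataset with one row per person, a statement such as `freq yr;` makes SAS treat each person as if they appeared YR times (and skips rows where YR is less than 1), which inflates the apparent sample size and can make every predictor look highly significant.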

Modelling sex as if it were a dependent attribute is a bit perverse.  I would expect you to model rank or salary on some set of the other indicators.  To use logistic on salary you would have to segment the data, perhaps as low, mid or high.
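As a rough sketch of that alternative framing, assuming the dataset and variable names from the original post, salary could be modelled on sex and the other indicators, and a three-level salary grouping could be built for a logistic model. The output dataset ranked and the variable sl_group are hypothetical names introduced here.

```sas
/* Salary as a continuous outcome, sex as one of the predictors. */
proc glm data=discrimination;
  class sx rk dg;
  model sl = sx rk dg yr yd / solution;
run;
quit;

/* Segment salary into three groups (0=low, 1=mid, 2=high). */
proc rank data=discrimination out=ranked groups=3;
  var sl;
  ranks sl_group;
run;

/* With three ordered response levels, PROC LOGISTIC fits a */
/* cumulative logit model by default.                       */
proc logistic data=ranked;
  class sx rk dg / param=ref;
  model sl_group = sx rk dg yr yd;
run;
```

With only 52 observations, the logistic version may run into sparse cells or separation, so the GLM on the original salary scale is probably the more informative of the two.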

I'll leave the finer points to others.

Richard in Oz

Biobee
Calcite | Level 5

Hi Richard

Thanks for replying to my post and for your suggestion not to use freq. With regard to using gender as the outcome variable, I agree with your point of view; however, I cannot change what the question requires, so I will have to work with gender as the outcome.
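For reference, a minimal sketch of the original model with the FREQ statements removed, assuming the same dataset and variable names. Reference-cell coding is requested here only to make the class-level odds ratios easier to read; that choice is an assumption, not something the question specifies.

```sas
proc logistic data=discrimination;
  class rk dg / param=ref;               /* reference-cell coding           */
  model sx(event='1') = rk yr dg yd sl;  /* models probability of sx=1 (female) */
run;
```

The `event='1'` response option models the same probability as `sx(descending)` in the original code; it simply states the modelled level explicitly.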

Reeza
Super User

You state your question as :

I need to determine if gender is associated with rank, highest degree, number of years in current rank, number of years since highest degree earned, and academic year salary.


What did your univariate comparisons show for each variable before you fitted the multivariate model?
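A minimal sketch of such univariate comparisons, assuming the dataset and variable names from the original post: cross-tabulations with chi-square and Fisher exact tests for the categorical variables, and two-sample t tests for the continuous ones.

```sas
/* Sex versus the categorical variables (rank, highest degree). */
proc freq data=discrimination;
  tables sx*rk sx*dg / chisq fisher;
run;

/* Sex versus the continuous variables (years in rank, years since degree, salary). */
proc ttest data=discrimination;
  class sx;
  var yr yd sl;
run;
```

These tests treat sex as the grouping variable, so they address association without requiring sex to be the response in a regression model.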


Second of all, I'm with RichardinOz: your outcome shouldn't be gender. That is an independent variable; the outcome is something else.


Association doesn't have to mean the variable is the dependent variable.

RichardinOz
Quartz | Level 8

You say "I cannot change what the question requires. So will have to work with gender as outcome."

I disagree.  Having gender as the outcome implies you have a population which undergoes sex change as it progresses through rank, academic outcome and salary.

As an analyst you have a responsibility to challenge incorrect assumptions.  Otherwise you are little better than a 'script kiddie' throwing code at data in the hope that something sticks.  Who is asking the question?  Go back to them and get them to restate the problem.

Richard in Oz
