Statistical Procedures

Programming the statistical procedures from SAS
lindawoo
Calcite | Level 5

Novice user here! I am trying to predict salary from variables such as gender, job function, retention, and performance, while accounting for the fact that people are in different salary grades, which by itself causes salaries to differ from grade to grade. I don't think I am doing this correctly. Can someone please help?

ods graphics on;
title "Selection Method LASSO Using Cross Validation";
proc glmselect data=train testdata=test
      plots(stepAxis=number)=(criterionPanel ASEPlot);
   model salary = grade F1 F2 F3 F4 F5 F6 F7 F8 F9 Gender
                  Reten Minority P2 P1 AgeCat
         / selection=LASSO(choose=CV stop=CV) CVDETAILS;
run;
%put &=_glsind;
proc glm data=test;
   model Salary = &_glsind / solution clParm;
run;
quit;
ods graphics off;

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

You say "I don't think I am doing this correctly." Is there an error? If so, please post the log.

I don't know the form of your data (you should always post some sample data, when possible), but I'm guessing that you should define some of the variables as classification variables by using the CLASS statement. Any discrete or categorical variable should be on the CLASS statement, like this:

proc glmselect data=train testdata=test;
   class Gender Minority AgeCat /* any others? */;
   model salary = grade F1 F2 F3 F4 F5 F6 F7 F8 F9 Gender
                  Reten Minority P2 P1 AgeCat
         / selection=LASSO(choose=CV stop=CV) CVDETAILS;
run;
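
To carry the CLASS suggestion through the rest of the original program, here is a minimal sketch, not a definitive fix. It assumes that grade is a categorical salary band and so belongs on the CLASS statement too; that is a guess, since no sample data was posted. The PROC FREQ step is only one way to count the distinct levels of each candidate variable before deciding. The follow-up PROC GLM repeats the CLASS statement because the effects stored in the _GLSIND macro variable may include classification variables.

```sas
/* Count distinct levels to help decide which variables belong on CLASS */
proc freq data=train nlevels;
   tables grade Gender Minority AgeCat / noprint;
run;

/* Assumption: grade is a categorical salary band, so it is listed on CLASS */
proc glmselect data=train testdata=test;
   class grade Gender Minority AgeCat;
   model salary = grade F1 F2 F3 F4 F5 F6 F7 F8 F9 Gender
                  Reten Minority P2 P1 AgeCat
         / selection=LASSO(choose=CV stop=CV) cvdetails;
run;

/* Show the effects that the selection kept */
%put &=_glsind;

/* Refit the selected effects; PROC GLM needs the same CLASS statement */
proc glm data=test;
   class grade Gender Minority AgeCat;
   model salary = &_glsind / solution clparm;
run;
quit;
```

If grade turns out to be a continuous measure rather than a band, drop it from both CLASS statements.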

