Novice user here! I am trying to predict salary from variables such as gender, job function, retention, and performance, while accounting for the fact that people are in different salary grades, which by itself causes salaries to differ from grade to grade. I don't think I am doing this correctly. Can someone please help?
ods graphics on;
title "Selection Method LASSO Using Cross Validation";
proc glmselect data=train testdata=test
               plots(stepAxis=number)=(criterionPanel ASEPlot);
   model salary = grade F1 F2 F3 F4 F5 F6 F7 F8 F9 Gender
                  Reten Minority P2 P1 AgeCat
         / selection=LASSO(choose=CV stop=CV) CVDETAILS;
run;
%put &=_glsind;   /* list of effects selected by GLMSELECT */
proc glm data=test;
   model Salary = &_glsind / solution clParm;
run;
quit;
ods graphics off;
Accepted Solutions
You say "I don't think I am doing this correctly." Is there an error? If so, please post the log.
I don't know the form of your data (you should always post some sample data when possible), but I'm guessing that you should define some of the variables as classification variables by using the CLASS statement. Any discrete or categorical variable should be listed in the CLASS statement, like this:
proc glmselect data=train testdata=test;
   class Gender Minority AgeCat /* any others? */;
   model salary = grade F1 F2 F3 F4 F5 F6 F7 F8 F9 Gender
                  Reten Minority P2 P1 AgeCat
         / selection=LASSO(choose=CV stop=CV) CVDETAILS;
run;