PeterBr
Obsidian | Level 7

Hi All,

 

I am running a lasso model on a large dataset using the code below. I am trying to interpret the results, but for some of the variables in the CLASS statement I cannot tell which dummy SAS excluded as the reference group, so I cannot interpret the coefficients. For example, hsa has 8 levels, 1-8, so SAS will automatically remove one from the model as the reference group. Is there a way to have SAS tell me which of the 8 it removed (and likewise for the other variables in the CLASS statement)?

 

I chose the option stop=none so that SAS would loop through all the variables until none are left. I could then deduce which dummy from each CLASS variable SAS used as the reference group. However, about halfway through the variables, SAS stops the lasso and gives the message: "Selection stopped because the change of the maximum absolute correction is tiny." From a statistical standpoint I am okay with this message, but it prevents me from figuring out the reference group used for each CLASS variable. Is there any way to override this and have SAS run the lasso to the end?

 

proc glmselect data=final.Claim_1617_m7_filterx plots(stepaxis=normb)=all;
CLASS gender_cd hsa anest_method src_pymt_cd1;
MODEL chrg_tot_amt = age HOPD oper_time_numb_ln CC_wght_total multiple_payor OR_staff
gender_cd hsa anest_method src_pymt_cd1 proc_cdJ0690--proc_cdJ3410 pat_race_cd49--pat_race_cd46
/selection = lasso(stop=none choose=cvex);
run;

Accepted Solution
PaigeMiller
Diamond | Level 26

By default, SAS sets the coefficient of the last alphabetical level of a CLASS variable to zero.

 

You can use the REF= option on the CLASS statement to override this default.
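
A minimal sketch of the syntax, using the data set and variable names from the question. The reference levels chosen here ('1' and '10') are placeholders for illustration, not a recommendation, and the MODEL statement is shortened:

```sas
proc glmselect data=final.Claim_1617_m7_filterx;
   /* REF= goes in parentheses right after the variable it applies to. */
   /* '1' and '10' are hypothetical choices of reference level.        */
   class gender_cd hsa (ref='1') anest_method (ref='10') src_pymt_cd1;
   model chrg_tot_amt = age gender_cd hsa anest_method src_pymt_cd1
         / selection=lasso(stop=none choose=cvex);
run;
```

The Class Level Information table in the output is a quick way to check what was applied: in the output posted later in this thread, the level named in REF= is listed last for each variable.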

--
Paige Miller

PeterBr
Obsidian | Level 7

Hi Paige,

 

When I run the following code, SAS still seems to override the reference levels I set in the CLASS statement. Do you have an idea why that might be? For example, anest_method has 4 levels: 10, 20, 30, 40. I chose 10 as the reference level, but in the results SAS included level 10 in the model with a non-zero coefficient and omitted level 40 instead. (I know that in proc glm the reference level appears in the output with a coefficient of 0.) All of my CLASS variables are character variables.

 

I've also consulted the proc glmselect user's guide and found my syntax to be consistent with what it suggests: https://support.sas.com/documentation/onlinedoc/stat/142/glmselect.pdf

 

Title 'Carpal OP Lasso_New Unknown Payer';
proc glmselect data=final.Claim_1617_m7_filterx plots(stepaxis=normb)=all;
CLASS gender_cd (ref='M') hsa (ref='5') anest_method (ref='10') src_pymt_cd_new (ref='Comm');
MODEL log_chrg_tot_amt = age HOPD log_oper_time_numb CC_wght_total_dum multiple_payor OR_staff
gender_cd hsa anest_method src_pymt_cd_new occur_cd04 proc_cdJ0690--proc_cdJ3410 pat_race_cd04--pat_race_cd05 isolated_rev_cd
/selection = lasso(stop=none choose=cvex);
run;

PaigeMiller
Diamond | Level 26

This doesn't seem right to me. 

 

Can you show us the relevant parts of the output?

--
Paige Miller
PeterBr
Obsidian | Level 7

I attached a truncated version of my output. It uses dummy data to censor information, but the same problem happens with my real data. You can see that even though I specified hsa (ref='3') and anest_method (ref='10'), the output shows that the lasso stepped through hsa = 3 and anest_method = 10 as if they were variables to include in the model. It also kept those levels in the final model with non-zero coefficients. SAS instead appears to have chosen hsa = 5 and anest_method = 40 as the reference groups, given that they do not appear in the final model. If I use the same CLASS code with proc glm it works as expected, so this problem seems to be specific to proc glmselect.

 

Code:

 

Title 'Carpal OP Lasso_New3';
proc glmselect data=final.Claim_1617_m7_filter plots(stepaxis=normb)=all;
CLASS gender_cd (ref='M') hsa (ref='3') anest_method (ref='10') src_pymt_cd_new (ref='Comm');
MODEL log_chrg_tot_amt = age HOPD log_oper_time_numb CC_wght_total_dum multiple_payor
gender_cd hsa anest_method src_pymt_cd_new occur_cd04 proc_cdJ0690--proc_cdJ3410 pat_race_cd04--pat_race_cd05 isolated_rev_cd
/selection = lasso(stop=none choose=cvex);
run;

PaigeMiller
Diamond | Level 26

I don't download or open Microsoft Office files. Please just post the portion of the output into your reply.

--
Paige Miller
PeterBr
Obsidian | Level 7

Apologies - copying what was above:

Below is a truncated version of my output. It uses dummy data to censor information, but the same problem happens with my real data. You can see that even though I specified hsa (ref='3') and anest_method (ref='10'), the output shows that the lasso stepped through hsa = 3 and anest_method = 10 as if they were variables to include in the model. It also kept those levels in the final model with non-zero coefficients. SAS instead appears to have chosen hsa = 5 and anest_method = 40 as the reference groups, given that they do not appear in the final model. If I use the same CLASS code with proc glm it works as expected, so this problem seems to be specific to proc glmselect.

 

Code:

 

Title 'Carpal OP Lasso_New3';
proc glmselect data=final.Claim_1617_m7_filter plots(stepaxis=normb)=all;
CLASS gender_cd (ref='M') hsa (ref='3') anest_method (ref='10') src_pymt_cd_new (ref='Comm');
MODEL log_chrg_tot_amt = age HOPD log_oper_time_numb CC_wght_total_dum multiple_payor
gender_cd hsa anest_method src_pymt_cd_new occur_cd04 proc_cdJ0690--proc_cdJ3410 pat_race_cd04--pat_race_cd05 isolated_rev_cd
/selection = lasso(stop=none choose=cvex);
run;

 

Data Set                           FINAL.CLAIM_1617_M7_FILTER
Dependent Variable                 log_chrg_tot_amt
Selection Method                   LASSO
Stop Criterion                     None
Choose Criterion                   External Cross Validation
External Cross Validation Method   Random
External Cross Validation Fold     5
Effect Hierarchy Enforced          None
Random Number Seed                 616729001

 

Class Level Information

Class             Levels   Values
gender_cd         2        F M
hsa               7        1 2 5 6 7 8 3
anest_method      4        20 30 40 10
src_pymt_cd_new   4        Federal Self-Pay Unknown Comm

 

Dimensions

Number of Effects                404
Number of Effects after Splits   417
Number of Parameters             417

 

LASSO Selection Summary

Step   Effect Entered       Effect Removed   Number Effects In   CVEX PRESS
0      Intercept                             1                   0.4617
1      hsa_1                                 2                   0.4422
2      proc_cdJ7120                          3                   0.4326
3      proc_cdJ3490                          4                   0.4171
4      hsa_8                                 5                   0.4027
5      proc_cdJ2704                          6                   0.3872
6      hsa_7                                 7                   0.3793
7      isolated_rev_cd                       8                   0.3476
8      hsa_2                                 9                   0.3398
9      anest_method_10                       10                  0.3201
10     proc_cd64721                          11                  0.2722
11     HOPD                                  12                  0.2337
12     proc_cdJ3010                          13                  0.2240
13     anest_method_20                       14                  0.1983
14     log_oper_time_numb                    15                  0.1940
15     hsa_3                                 16                  0.1901

 

Parameter Estimates

Parameter            DF    Estimate
Intercept            1     8.204030
HOPD                 1     0.042461
log_oper_time_numb   1     0.061125
CC_wght_total_dum    1     0.009938
hsa_1                1    -0.487687
hsa_2                1    -0.326204
hsa_6                1     0.000374
hsa_7                1     0.000773
hsa_8                1     0.098029
hsa_3                1    -0.899385
anest_method_20      1     0.033017
anest_method_30      1    -0.343045
anest_method_10      1    -0.090350

PaigeMiller
Diamond | Level 26

I guess I don't have any other thoughts on why this is happening.

--
Paige Miller
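
One untested possibility, offered as an assumption rather than a confirmed diagnosis: the posted output suggests that REF= was honored as an ordering but not as an exclusion. The Class Level Information table lists each REF= level last (M, 3, 10, Comm), and the Dimensions table goes from 404 effects to 417 after splits. That difference of 13 equals (2-1) + (7-1) + (4-1) + (4-1), which is what you would expect if every level of every CLASS variable, including the reference level, has its own design column that the lasso can pick independently. If that reading is right, levels such as hsa = 5 and anest_method = 40 are missing from the final model because the lasso left their coefficients at zero, not because they are reference levels. A sketch of something to try is to request reference coding with PARAM=REF on the CLASS statement, so that the REF= level gets no column at all:

```sas
proc glmselect data=final.Claim_1617_m7_filter plots(stepaxis=normb)=all;
   /* Assumption: PARAM=REF after the slash applies reference coding to  */
   /* every CLASS variable, so each REF= level has no design column and  */
   /* cannot be entered by the lasso. Not verified against this data.    */
   class gender_cd (ref='M') hsa (ref='3') anest_method (ref='10')
         src_pymt_cd_new (ref='Comm') / param=ref;
   model log_chrg_tot_amt = age HOPD log_oper_time_numb CC_wght_total_dum multiple_payor
         gender_cd hsa anest_method src_pymt_cd_new occur_cd04
         proc_cdJ0690--proc_cdJ3410 pat_race_cd04--pat_race_cd05 isolated_rev_cd
         / selection=lasso(stop=none choose=cvex);
run;
```

If this behaves as assumed, hsa_3 and anest_method_10 should no longer appear in the LASSO Selection Summary, and the remaining hsa and anest_method coefficients would be read as differences from those reference levels.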
