oggylang

Hi SAS Forum,

 

I am currently running a LASSO regression and have hit a problem that I hope someone else has run into before and can help me sort out. I want to partition my data. I have created a dummy variable, named treated, that indicates whether each observation belongs to the training or the testing part of the data set, but I am struggling to use it in the PARTITION statement. The SAS documentation is somewhat limited on this point and mainly covers partitioning by a share of the data rather than by the values of a variable.

 

 

proc glmselect data = forecastmerge2;
class herkomst;
model PostEmplSumD = Woman Married PriorEmpl c:/
selection = lasso(stop=none choose=cvex);
partition = treated(test=(treated=1) train=(treated=0));
output out=GLMOut p = p_hat;
run;

 

If you have any questions, please let me know. I tried to include all of the code that seemed relevant to the question.

 

Oggylang 

Rick_SAS

The documentation for the PARTITION statement in PROC GLMSELECT specifies the correct syntax. I think for your data the PARTITION statement would look like this (untested):

partition ROLEVAR=treated(test='1' train='0');
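
For context, here is a sketch of how that statement might slot into the original call. Everything except the PARTITION statement is copied from the question, and like the statement above it has not been run against the actual data:

```sas
/* Sketch only: the call from the question with the PARTITION statement replaced. */
proc glmselect data=forecastmerge2;
   class herkomst;
   model PostEmplSumD = Woman Married PriorEmpl c: /
         selection=lasso(stop=none choose=cvex);
   /* Observations with treated=1 are used as test data, treated=0 as training data. */
   /* The quoted strings are matched against the formatted values of treated.        */
   partition rolevar=treated(test='1' train='0');
   output out=GLMOut p=p_hat;
run;
```

If treated has a format attached, the quoted values would presumably need to match the formatted values rather than the raw 0 and 1.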

 You can also include a character variable named _ROLE_ in the input data that has the values "TRAIN" and "TEST". If the input data contains a _ROLE_ variable, then you can omit the PARTITION statement.
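
A minimal sketch of building such a role variable from the existing dummy follows. The data set name forecast_roles is hypothetical, and the explicit ROLEVAR= option is kept here as a cautious assumption so that the example does not depend on the procedure detecting _ROLE_ on its own (untested):

```sas
/* Sketch only: derive a character role variable from the treated dummy. */
data forecast_roles;            /* hypothetical output data set name */
   set forecastmerge2;
   length _ROLE_ $5;
   if treated = 1 then _ROLE_ = 'TEST';
   else if treated = 0 then _ROLE_ = 'TRAIN';
run;

proc glmselect data=forecast_roles;
   class herkomst;
   model PostEmplSumD = Woman Married PriorEmpl c: /
         selection=lasso(stop=none choose=cvex);
   /* Name the role variable explicitly rather than relying on automatic detection. */
   partition rolevar=_ROLE_(test='TEST' train='TRAIN');
   output out=GLMOut p=p_hat;
run;
```

Observations where treated is missing get a blank _ROLE_ in this sketch, so it is worth checking how many rows fall outside both roles before fitting.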
