Hi SAS Forum,
I am running a LASSO regression and have hit a problem that I hope someone else has run into before and can help sort out. I want to partition my data. I have created a dummy variable, named treated, that indicates whether an observation belongs to the training or the testing part of the data set, but I am struggling to use it in the PARTITION statement. The SAS documentation is a bit limited on this point and mainly covers partitioning by a share of the data rather than by the values of a variable.
proc glmselect data = forecastmerge2;
   class herkomst;
   model PostEmplSumD = Woman Married PriorEmpl c: / selection = lasso(stop=none choose=cvex);
   partition = treated(test=(treated=1) train=(treated=0));
   output out=GLMOut p = p_hat;
run;
If you have any questions, please let me know. I have tried to include all of the code that seemed relevant to the question.
Oggylang
Here is the documentation that specifies the correct syntax. I think for your data the PARTITION statement would look like this (untested):
partition ROLEVAR=treated(test='1' train='0');
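Dropped into the step from the question, that statement would give something like the sketch below. It is also untested, and it assumes that treated takes the values 0 and 1 with no format attached, so that its formatted values are '0' and '1'. Everything other than the PARTITION statement is copied from the original post.

```sas
/* Sketch only (untested): the step from the question, with the      */
/* PARTITION statement rewritten to use ROLEVAR=.                    */
/* Assumes treated is 0/1 and its formatted values are '0' and '1'.  */
proc glmselect data=forecastmerge2;
   class herkomst;
   model PostEmplSumD = Woman Married PriorEmpl c:
         / selection=lasso(stop=none choose=cvex);
   /* treated=1 rows are held out for testing, treated=0 rows train the model */
   partition rolevar=treated(test='1' train='0');
   output out=GLMOut p=p_hat;
run;
```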
You can also include a character variable named _ROLE_ in the input data that has the values "TRAIN" and "TEST". If the input data contains a _ROLE_ variable, then you can omit the PARTITION statement.
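A possible way to build such a variable from the existing dummy is sketched below (untested). The output data set name forecast_role is made up for illustration, and the step assumes treated is numeric with values 0 and 1.

```sas
/* Sketch only (untested): derive a character _ROLE_ variable from treated. */
/* forecast_role is a hypothetical data set name.                           */
data forecast_role;
   set forecastmerge2;
   length _ROLE_ $5;                       /* long enough for 'TRAIN' */
   if treated = 1 then _ROLE_ = 'TEST';
   else if treated = 0 then _ROLE_ = 'TRAIN';
run;
```

If you would rather not rely on the variable being picked up automatically, it should also be possible to name it explicitly with partition rolevar=_ROLE_(test='TEST' train='TRAIN');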