Hi there,
I am a SAS newbie working on a simple linear regression in SAS Enterprise Miner. How can I output the ASE for the training, validation, and test partitions separately?
My process flow is:
File Import ----> Data Partition (training, validation, test) ----> SAS Code node.
The code in the SAS Code node is:
ods trace on;
proc glmselect DATA=&EM_IMPORT_DATA;
effect MyPoly = polynomial(A B C/degree=4);
model Y = MyPoly;
run;
ods trace off;
Accepted Solutions
OK, it took some googling, but I got this working.
You have a couple of options.
1) Pass the train, validation, and test sets using the macro variables so that a Model Comparison node can pick up the partitions and calculate the stats.
2) Take advantage of the proc's own partition syntax. I think this is what you were trying to do:
1. Add a data set. In my example I used German Credit from F1->Generate Sample Data Sources.
2. Add a Partition node.
3. Add a SAS Code node with the code below. Replace the target (amount) and the inputs (duration, checking, savings) with your own response and effects.
/* Stack the three partitions and tag each row with its role */
data mydata;
set &EM_IMPORT_DATA(in=a) &EM_IMPORT_VALIDATE(in=b) &EM_IMPORT_TEST(in=c);
if a then _partition="_Train";
else if b then _partition="_Valid";
else if c then _partition="_Test";
run;
/* Fit on the training rows; ROLEVAR= tells the proc which rows play which role */
proc glmselect DATA=mydata;
effect MyPoly = polynomial(duration checking savings/degree=4);
model amount = MyPoly;
partition rolevar=_partition(TEST='_Test' TRAIN='_Train' VALIDATE='_Valid');
run;
4. Run.
The output results will give you the ASE of training/validation/testing.
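If you also want the ASE as a data set rather than only in the printed output, here is a rough sketch of one way to do it. It assumes the stacked data set mydata and the _partition variable from step 3, and it takes ASE to mean SSE divided by the number of observations in the partition. The names scored, pred, resid, and ase_by_part are made up for illustration. I have not run this against your data, so treat it as a starting point.

```sas
/* Sketch only: scored, pred, resid and ase_by_part are hypothetical names. */
/* Assumes mydata and _partition were created as in step 3.                 */
proc glmselect data=mydata;
   effect MyPoly = polynomial(duration checking savings / degree=4);
   model amount = MyPoly;
   partition rolevar=_partition(test='_Test' train='_Train' validate='_Valid');
   /* OUT= keeps the input columns and adds predicted values and residuals */
   output out=scored predicted=pred residual=resid;
run;

/* Average squared residual within each partition */
proc sql;
   create table ase_by_part as
   select _partition,
          count(resid)   as n,    /* rows with a non-missing residual */
          mean(resid**2) as ase   /* SSE / n for that partition       */
   from scored
   group by _partition;
quit;

proc print data=ase_by_part noobs;
run;
```

Because the predicted values come from the fitted model, you do not need to know the equation ahead of time, and the _partition column shows which rows were used for training, validation, and test. The numbers should agree with the fit statistics the proc prints, as long as the same rows are used.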
This model isn't fabulous for this data set but hopefully this approach will give you good results on yours!
Good luck!
-m
Good reference: SAS/STAT(R) User's Guide, PROC GLMSELECT, PARTITION statement
PS If you try the other approach I described, you can easily use a Model Comparison node to compare with HPGLM or any of the other model nodes in Enterprise Miner. I highly recommend giving it a try!
Hi,
You are doing some advanced stuff!
If you have a recent Enterprise Miner version, the easiest route is to build your model with the HPGLM node and then add a Model Comparison node.
To code your own proc in a SAS Code node, you need to use some macro variables so that the Model Comparison node picks up your partitions correctly. You are on the right track! In addition to &em_import_data, we need the corresponding &em_import_validate, &em_export_validate, &em_export_train, etc.
Try the HPGLM node in the meantime, until someone posts a workaround for using proc glmselect in a SAS Code node.
I hope it helps!
-Miguel
Hi,
Thanks for your reply. I don't use the HPGLM node because it limits the polynomial degree to 3, and I need a slightly higher degree.
Indeed, I need these variables: &em_import_validate, &em_export_validate, &em_export_train. However, I don't know what my equation looks like before I run my model. For instance, I fitted a nonlinear model and calculated the ASE as follows:
...
model Yt_1 = Y / (a + b * Y)
...
data EM_IMPORT_VALIDATE_est;
set &EM_IMPORT_VALIDATE. ;
_res2 = (Y1- (Y / (aa + bb *Y) ) )**2;
run;
proc means data=EM_IMPORT_VALIDATE_est noprint;
var _res2;
output out=&EM_EXPORT_VALIDATE(drop=_:) n=validate_n sum=validate_sse;
run;
validate_ase=validate_sse/(validate_n-2);
This way I can calculate the ASE for the validation partition. My problems are:
1. If I don't know my prediction equation ahead of time, how can I write code to calculate the ASE?
2. Without code that calculates the ASE, I can still calculate the overall ASE for the entire data set using the fitted regression equation. But how can I tell which portion of the data set was used for training and which for validation?
Thanks.