How to output ASE for training, validation and testing.

Hi there,

I am a SAS newbie working on some simple linear regression in SAS Enterprise Miner. How can I output the ASE for the training, validation, and test partitions separately?

My process flow looks like this:

File Import --> Data Partition (training, validation, test) --> SAS Code node

The code in the SAS Code node is:

ods trace on;
proc glmselect DATA=&EM_IMPORT_DATA;
  effect MyPoly = polynomial(A B C/degree=4);
  model Y = MyPoly;
run;
ods trace off;


Accepted Solution
06-23-2015 04:21 PM

OK, it took some googling, but I got this working.

You have a couple of options.

1) Pass the training, validation, and test sets through the macro variables so that a Model Comparison node can pick up the partitions and calculate the statistics.

2) Take advantage of the procedure's own syntax. I think this is what you were trying to do:

[Image: code your own glmselect flow.png]

1. Add a data set. In my example I used German Credit from F1 -> Generate Sample Data Sources.

2. Add a Data Partition node.

3. Add a SAS Code node with the code below. Substitute your own target (response) for amount and your own inputs (effects) for duration, checking, and savings.

/* Stack the three imported partitions and tag each row with its role */
data mydata;
  set &EM_IMPORT_DATA(in=a) &EM_IMPORT_VALIDATE(in=b) &EM_IMPORT_TEST(in=c);
  if a then _partition="_Train";
  else if b then _partition="_Valid";
  else if c then _partition="_Test";
run;

/* ROLEVAR= tells the procedure which rows are training, validation, and test */
proc glmselect DATA=mydata;
  effect MyPoly = polynomial(duration checking savings/degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(TEST='_Test' TRAIN='_Train' VALIDATE='_Valid');
run;

4. Run.

The procedure output reports the ASE for the training, validation, and test partitions.

[Image: sas code glmselect output.png]
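If the ASE values are needed as data rather than as printed output, one possibility is to capture the fit statistics table with ODS OUTPUT. This is only a sketch: it assumes the mydata set built in step 3, it assumes the table is named FitStatistics (worth confirming in the log with ods trace on, as in the original question), and work.fitstats is a made-up data set name.

```sas
/* Assumption: FitStatistics is the ODS table holding the ASE rows; */
/* check the ODS TRACE output in the log to confirm the name.       */
ods output FitStatistics=work.fitstats;   /* work.fitstats is a hypothetical name */

proc glmselect data=mydata;               /* mydata comes from step 3 */
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(test='_Test' train='_Train' validate='_Valid');
run;

/* Inspect what was captured, including the per-partition ASE rows */
proc print data=work.fitstats noobs;
run;
```

Once the table is a data set, it can be filtered or merged like any other SAS data set.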

This model isn't fabulous for this data set but hopefully this approach will give you good results on yours!

Good luck!

-m

Good reference: SAS/STAT(R) User's Guide, PROC GLMSELECT, PARTITION statement

PS: If you try the other approach I described, you can easily use a Model Comparison node to compare with HPGLM or any of the other model nodes in Enterprise Miner. I highly recommend giving this a try!
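A rough sketch of what that other approach could look like, using SCORE statements to write a scored copy of each partition to the export macro variables. Treat it as a starting point only: &EM_EXPORT_TEST and the P_amount / R_amount column names are assumptions, it relies on the mydata set from step 3, and whether a Model Comparison node accepts the result may also depend on node settings that are not shown here.

```sas
/* Runs only inside an Enterprise Miner SAS Code node, where the   */
/* &EM_IMPORT_* and &EM_EXPORT_* macro variables are defined.      */
proc glmselect data=mydata;               /* mydata comes from step 3 */
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(test='_Test' train='_Train' validate='_Valid');

  /* Score each imported partition and export it with predictions.   */
  /* Assumption: P_amount and R_amount are the column names expected */
  /* downstream; adjust them if your flow uses different names.      */
  score data=&EM_IMPORT_DATA     out=&EM_EXPORT_TRAIN
        predicted=P_amount residual=R_amount;
  score data=&EM_IMPORT_VALIDATE out=&EM_EXPORT_VALIDATE
        predicted=P_amount residual=R_amount;
  score data=&EM_IMPORT_TEST     out=&EM_EXPORT_TEST
        predicted=P_amount residual=R_amount;
run;
```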

All Replies

Hi,

You are doing some advanced stuff!

If you have a recent Enterprise Miner version, the easiest route is to build your model with the HPGLM node and then add a Model Comparison node.

To code your own procedure in a SAS Code node, you need to use some macro variables so that the Model Comparison node catches your partitions correctly. You are on the right track! In addition to &em_import_data, you need the corresponding &em_import_validate, &em_export_validate, &em_export_train, etc.

Try the HPGLM node while someone posts a workaround for using proc glmselect in a SAS Code node.

I hope it helps!

-Miguel

Posted in reply to M_Maldonado

Hi,

Thanks for your reply. I am not using the HPGLM node because it limits the polynomial degree to 3, and I need a slightly higher degree.

Indeed, I need these variables: &em_import_validate, &em_export_validate, &em_export_train. However, I don't know what my equation looks like before I run my model. For instance, I fitted a nonlinear model as follows and calculated the ASE:

...
model Yt_1 = Y / (a + b * Y)
...
data EM_IMPORT_VALIDATE_est;
  set &EM_IMPORT_VALIDATE.;
  _res2 = (Y1- (Y / (aa + bb *Y) ) )**2;
run;
proc means data=EM_IMPORT_VALIDATE_est noprint;
  var _res2;
  output out=&EM_EXPORT_VALIDATE(drop=_:) n=validate_n sum=validate_sse;
run;
validate_ase=validate_sse/(validate_n-2);

This way, I can calculate the ASE for the validation portion. My problems are:

1. If I don't know my prediction equation ahead of time, how can I code the ASE calculation?

2. Without code that calculates the ASE, I can still compute the overall ASE for the entire data set from the fitted regression equation. But how can I figure out which portion of the data set was used for training and which for validation?

Thanks.
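On both questions, one option that avoids writing the prediction equation by hand is to let the procedure output residuals for every row and then average the squared residuals within each partition. The sketch below makes several assumptions: it uses the German Credit variable names from the accepted solution, it takes ASE to mean the sum of squared errors divided by the number of rows (not n-2), and work.scored, work.sq_err, r_amount and sq_err are made-up names. The _partition column records which rows were used for training, validation, and testing.

```sas
/* Runs inside an Enterprise Miner SAS Code node (uses &EM_IMPORT_*). */
/* Stack the partitions and tag each row with its role.               */
data work.mydata;
  length _partition $6;
  set &EM_IMPORT_DATA(in=a) &EM_IMPORT_VALIDATE(in=b) &EM_IMPORT_TEST(in=c);
  if a then _partition = "_Train";
  else if b then _partition = "_Valid";
  else if c then _partition = "_Test";
run;

/* Fit the model and write residuals for all rows to work.scored. */
proc glmselect data=work.mydata;
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(test='_Test' train='_Train' validate='_Valid');
  output out=work.scored residual=r_amount;   /* hypothetical names */
run;

/* Square the residuals. */
data work.sq_err;
  set work.scored;
  sq_err = r_amount**2;
run;

/* Mean squared residual per partition = SSE / n, taken here as the ASE. */
proc means data=work.sq_err n mean;
  class _partition;
  var sq_err;
run;
```

The means should agree with the ASE rows in the procedure's own output; if they do not, that is a sign one of the assumptions above does not hold for your setup.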
