viva0521
Fluorite | Level 6

Hi there,

I am a SAS newbie working on a simple linear regression in SAS Enterprise Miner. How can I output the ASE (average squared error) separately for the training, validation, and test partitions?

My process flow looks like this:

File Import --> Data Partition (training, validation, test) --> SAS Code node

The SAS Code node contains:

ods trace on;

proc glmselect DATA=&EM_IMPORT_DATA;
  effect MyPoly = polynomial(A B C / degree=4);
  model Y = MyPoly;
run;

ods trace off;


3 REPLIES
M_Maldonado
Barite | Level 11

Hi,

You are doing some advanced stuff!

If you have a recent version of Enterprise Miner, the easiest option is to fit your model with the HPGLM node and then add a Model Comparison node.

To code your own proc in a SAS Code node, you need to use some macro variables so that the Model Comparison node picks up your partitions correctly. You are on the right track! In addition to &em_import_data, we need the corresponding &em_import_validate, &em_export_validate, &em_export_train, etc.
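As a quick check, a minimal sketch like the one below could be run in the SAS Code node to see which data sets the import macro variables point to. It assumes the node sits after a Data Partition node, so that all three variables are defined; the label text is purely illustrative.

```sas
/* Write the resolved partition data set names to the log.          */
/* Assumes a preceding Data Partition node defines all three names. */
%put NOTE: Train    = &EM_IMPORT_DATA;
%put NOTE: Validate = &EM_IMPORT_VALIDATE;
%put NOTE: Test     = &EM_IMPORT_TEST;
```

If one of the names comes back unresolved or blank in the log, that partition is probably not being passed into the node.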

In the meantime, try the HPGLM node until someone posts a way to use proc glmselect in a SAS Code node.

I hope it helps!

-Miguel

viva0521
Fluorite | Level 6

Hi,

Thanks for your reply. I don't use the HPGLM node because it limits the polynomial degree to 3, and I need something a little higher than that.

Indeed, I need these variables: &em_import_validate, &em_export_validate, &em_export_train. However, I don't know what my equation looks like before I run my model. For instance, I fitted a nonlinear model as follows and then calculated the ASE:

...

model Yt_1 = Y / (a + b * Y)

...

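data EM_IMPORT_VALIDATE_est;
  set &EM_IMPORT_VALIDATE. ;
  /* squared residual using the estimated parameters aa and bb */
  _res2 = (Y1 - (Y / (aa + bb * Y)))**2;
run;

proc means data=EM_IMPORT_VALIDATE_est noprint;
  var _res2;
  output out=&EM_EXPORT_VALIDATE(drop=_:) n=validate_n sum=validate_sse;
run;

/* the assignment has to live inside a DATA step */
data &EM_EXPORT_VALIDATE;
  set &EM_EXPORT_VALIDATE;
  validate_ase = validate_sse / (validate_n - 2);
run;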

This way, I can calculate the ASE for the validation portion. My questions are:

1. If I don't know my prediction equation ahead of time, how can I write code to calculate the ASE?

2. Without code that calculates the ASE per partition, I can still calculate the overall ASE for the entire data set using the fitted regression equation. But how can I tell which portion of the data set was used for training and which portion for validation?

Thanks.

M_Maldonado
Barite | Level 11

OK, it took some googling, but I got this working.

You have a couple options.

1) you pass the train, validation, and test sets using the macro variables so that a Model Comparison node can pick up the partition and calculate the stats.

2) take advantage of the specific proc syntax. I think this is what you were trying to do:

[Image: code your own glmselect flow.png]

1. Add a data set. In my example I used German Credit from F1->Generate Sample Data Sources

2. Add a Partition node

3. Add a SAS Code node with the code below. Replace the target (amount) and the inputs (duration, checking, savings) with your own response and effects.

data mydata;
  set &EM_IMPORT_DATA(in=a) &EM_IMPORT_VALIDATE(in=b) &EM_IMPORT_TEST(in=c);
  if a then _partition="_Train";
  else if b then _partition="_Valid";
  else if c then _partition="_Test";
run;

proc glmselect DATA=mydata;
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(TEST='_Test' TRAIN='_Train' VALIDATE='_Valid');
run;

4. Run.

The node's output reports the ASE for the training, validation, and test partitions.

[Image: sas code glmselect output.png]
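If you want those numbers in a data set rather than only in the printed output, a sketch along these lines might work. It assumes the ODS table holding the ASE values is named FitStatistics; the `ods trace on` statement from the original question will confirm the actual name in the log. The data set name `work.fitstats` is just a placeholder.

```sas
/* Capture the fit statistics table as a data set.                    */
/* Assumption: the ODS table is named FitStatistics (check ODS TRACE). */
ods output FitStatistics=work.fitstats;

proc glmselect data=mydata;
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;
  partition rolevar=_partition(test='_Test' train='_Train' validate='_Valid');
run;

/* Inspect the captured statistics, including the ASE rows */
proc print data=work.fitstats noobs;
run;
```

From there, the ASE rows can be filtered or merged with results from other models as needed.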

This model isn't fabulous for this data set but hopefully this approach will give you good results on yours!

Good luck!

-m

Good reference: SAS/STAT(R) User's Guide, PROC GLMSELECT, PARTITION statement

PS If you try the other approach I described, you can easily use a Model Comparison node to compare with HPGLM or any of the model nodes in Enterprise Miner. I highly recommend giving this a try!
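For the macro-variable approach (option 1), here is a rough, untested sketch. It uses the VALDATA= and TESTDATA= options in place of a role variable and one SCORE statement per partition to write predictions to the export data sets. Two things are assumptions, not confirmed in this thread: that the Model Comparison node expects the prediction to be named `P_amount` (P_ plus the target name), and that the SAS Code node's Tool Type property may need to be set to Model.

```sas
/* Fit on the training partition; validation and test partitions are */
/* passed as separate data sets rather than via a role variable.     */
proc glmselect data=&EM_IMPORT_DATA
               valdata=&EM_IMPORT_VALIDATE
               testdata=&EM_IMPORT_TEST;
  effect MyPoly = polynomial(duration checking savings / degree=4);
  model amount = MyPoly;

  /* Score each partition into the matching export data set.    */
  /* Assumption: downstream nodes look for P_<target> (P_amount). */
  score data=&EM_IMPORT_DATA     out=&EM_EXPORT_TRAIN    predicted=P_amount;
  score data=&EM_IMPORT_VALIDATE out=&EM_EXPORT_VALIDATE predicted=P_amount;
  score data=&EM_IMPORT_TEST     out=&EM_EXPORT_TEST     predicted=P_amount;
run;
```

Because the procedure scores each partition itself, you never have to write out the prediction equation by hand, and each export data set tells you which rows belong to which partition.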
