Initial options in variable cluster node in EM 13.2?

Occasional Contributor
Posts: 9

I'm getting different variable clusters from EM and EG, and I think the INITIAL= option is the cause. In EG, I can specify INITIAL=RANDOM together with RANDOM=<seed> so that I get the same clusters on every run. If I don't set INITIAL=, the clusters seem to depend on the order of the variables in the VAR statement.

 

But I couldn't find the INIT option in EM. Which INIT method is EM using? Is it possible to set a random seed in EM?


Accepted Solutions
Solution
2 weeks ago
SAS Employee
Posts: 121

Re: Initial options in variable cluster node in EM 13.2?

Although the Variable Clustering node in SAS Enterprise Miner uses the VARCLUS procedure, its implementation in SAS Enterprise Miner is somewhat different. In general, I would not expect the two to produce exactly the same results. For example, the node lets you include class variables in your analysis.

 

If you add the following lines to your Project Start Code

 

    options MPRINT;

 

and force the node to rerun (e.g., set the Rerun property for the Input Data Source node to Yes and rerun the flow), then you will be able to view the actual code that is being run. Here is an excerpt of the code run against the data set SAMPSIO.HMEQ, which is installed with SAS Enterprise Miner:

 

/*** BEGIN LOG EXCERPT ***/

 

MPRINT(VARCLUS): proc varclus data = EMWS21.Ids2_DATA outstat= EMWS21.VarClus3_OUTSTAT hi short ;
MPRINT(VARCLUS): var
MPRINT(EM_INTERVAL_INPUT): CLAGE CLNO DEBTINC DELINQ DEROG LOAN MORTDUE NINQ VALUE YOJ
MPRINT(VARCLUS): ;
MPRINT(VARCLUS): ;
MPRINT(VARCLUS): run;

 

/*** END LOG EXCERPT ***/
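
For readers unfamiliar with the MPRINT(...) prefixes in the excerpt: with MPRINT on, each statement generated by a macro is echoed to the log, tagged with the name of the macro that generated it. The following is only a minimal sketch of that behavior; the macros demo_varclus and demo_inputs are hypothetical stand-ins and are not the macros that SAS Enterprise Miner actually uses.

```sas
/* Echo macro-generated statements to the log. */
options mprint;

/* Hypothetical helper that emits a variable list,   */
/* playing the role EM_INTERVAL_INPUT plays above.   */
%macro demo_inputs;
   LOAN MORTDUE VALUE
%mend demo_inputs;

/* Hypothetical wrapper that builds the VARCLUS step. */
%macro demo_varclus;
   proc varclus data=sampsio.hmeq short;
      var %demo_inputs;
   run;
%mend demo_varclus;

%demo_varclus
```

After running this, the log should contain MPRINT lines for the PROC VARCLUS, VAR, and RUN statements, which is the same mechanism that produced the excerpt above.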

 

You can see from the log above that the INIT= and RANDOM= options are not specified in this call to VARCLUS, since those options are not available in the SAS Enterprise Miner interface. You will also notice in the log that a great deal more processing occurs in SAS Enterprise Miner. Having said that, there are additional options that you can specify in SAS Enterprise Miner that would likely allow you to get answers close to those produced by VARCLUS with the same options, as long as grouping variables are not included in your analysis.
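
If a fixed seed matters more than staying inside the node, one possibility is to call PROC VARCLUS directly (for example from EG, or from a SAS Code node) with the options from the original question. The sketch below is an illustration under assumptions, not the code the Variable Clustering node runs: the seed value, MAXCLUSTERS=4, and the WORK data set names are arbitrary choices, and it assumes the SAMPSIO library is assigned in your session.

```sas
/* Assumed values for illustration only. */
%let seed = 12345;
%let vars = CLAGE CLNO DEBTINC DELINQ DEROG LOAN MORTDUE NINQ VALUE YOJ;

/* First run: random initial assignment with a fixed seed. */
proc varclus data=sampsio.hmeq maxclusters=4
             initial=random random=&seed
             short noprint outstat=work.vc_run1;
   var &vars;
run;

/* Second run: identical call, same seed. */
proc varclus data=sampsio.hmeq maxclusters=4
             initial=random random=&seed
             short noprint outstat=work.vc_run2;
   var &vars;
run;

/* Check that both runs produced the same OUTSTAT= statistics. */
proc compare base=work.vc_run1 compare=work.vc_run2;
run;
```

With the same seed, the same data, and the same variable order, PROC COMPARE would be expected to report no differences; changing the value of seed lets you see how sensitive the clusters are to the random start.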

 

Hope this helps!

Doug
