Mike90
Quartz | Level 8

I am trying to use the Association Node.

When I click on the Variables property it lists 4 variables, with 2 marked as dropped.  It has selected an ID input and one of the targets.

 

My Input Data source has 22 inputs and I have 3 variables marked as targets.

 

Settings:

Input Data Source node

   Train

       Output: Data (I also tried leaving it at View)

       Role:  Transaction

 

*** How do I get the Association Node to use all of the inputs, and find associations between the inputs, and between the inputs and the targets?

 

 

 

 


Accepted Solutions
DougWielenga
SAS Employee

Mike90,

 

An Association node analysis typically uses only 2 or 3 variable roles (ID and target, plus sequence for sequence discovery).  If you have SAS Enterprise Miner open, click on 

 

    Help --> Contents 

 

and then navigate to 

 

Node Reference 

     Explore

           Association Node

 

in the panel on the left, and then navigate to Association Node Data Set Requirements in the panel on the right, where you will see:

 

/*** BEGIN EXCERPT ***/

 

Input Data Format

 

To perform association discovery, the input data set must have a separate observation for each product purchased by each customer, as illustrated in the following table. You must assign the ID role to one variable and the target model role to another variable when you create the data source.

 

To perform sequence discovery, the input data set must have a separate observation for each product purchased by each customer at each visit. In addition to assignment of ID and target roles, you must apply a sequence model role to a time stamp variable. The sequence variable is used for timing comparisons. It can have any numeric value including date/time values. The time or span from observation to observation in the input data set must be on the same scale.

 

/*** END EXCERPT ***/
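
To make the required layout concrete, here is a minimal sketch of reshaping a wide table (one row per customer, one 0/1 flag per item) into the one-row-per-ID-per-item form the excerpt describes. It is written in Python rather than SAS purely for illustration. The column names and values are hypothetical, and it assumes your inputs can be read as item flags, which may not be true of your data. In SAS itself the same reshaping would normally be done in a DATA step or with PROC TRANSPOSE.

```python
# Hypothetical wide table: one row per customer, one 0/1 flag per product.
wide_rows = [
    {"customer_id": 101, "bread": 1, "milk": 1, "eggs": 0},
    {"customer_id": 102, "bread": 0, "milk": 1, "eggs": 1},
]

def to_transactions(rows, id_col):
    """Return one (id, item) pair for every item flagged for each ID."""
    transactions = []
    for row in rows:
        for column, value in row.items():
            # Skip the ID column itself and any item that is not present.
            if column != id_col and value:
                transactions.append((row[id_col], column))
    return transactions

transactions = to_transactions(wide_rows, "customer_id")
for customer_id, item in transactions:
    print(customer_id, item)
```

In the reshaped data, customer_id would take the ID role and the item column would take the target role.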

 

If you are getting errors, several issues are possible, including the following:

 

Possible Data issues:

    * The Input Data Source does not have the role set to Transaction

    * The data is not in transactional form (many rows per ID, one per product/item)

    * The data does not support Association analysis
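
One rough way to check the transactional-form point is to count rows per ID: if nearly every ID has a single row, the table is probably still in wide form. The sketch below is Python for illustration only, with hypothetical data and a hypothetical helper name; it is not something Enterprise Miner runs.

```python
from collections import Counter

# Hypothetical (id, item) pairs read from the candidate transaction table.
pairs = [
    (101, "bread"),
    (101, "milk"),
    (102, "milk"),
    (103, "eggs"),
    (103, "eggs"),  # repeated ID/item pair
]

def summarize_transactions(pairs):
    """Summarize how many IDs there are and how many have several rows."""
    rows_per_id = Counter(id_ for id_, _ in pairs)
    duplicate_pairs = [pair for pair, n in Counter(pairs).items() if n > 1]
    return {
        "ids": len(rows_per_id),
        "ids_with_multiple_rows": sum(1 for n in rows_per_id.values() if n > 1),
        "duplicate_pairs": duplicate_pairs,
    }

summary = summarize_transactions(pairs)
print(summary)
```

A table where ids_with_multiple_rows is close to zero is unlikely to yield any associations, whatever the node settings.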

 

Possible System Issues:

    * running out of disk space

    * insufficient RAM

    * variables assigned inappropriately

    * compression enabled on the disk 

 

If you look at the error in the log, you will see

 

   ERROR: Insufficient space in file WORK.SORTEDTRAIN.DATA.

 

You also mentioned that many variables are not being displayed.  If those variables are particularly long, they could be making your data enormous.  Association analysis only needs the 2 or 3 roles described above, so all of these additional variables make the data set much larger than necessary.  Given that the log reports insufficient space, drop (don't just 'Reject') all of the unnecessary variables and make sure what remains has the appropriate structure for the Association node.  You might also verify that you have plenty of disk space where the data is stored.  If it is on an external USB drive, plan to move it to an internal disk.  Sorting your data first also reduces the impact on system resources.  Be sure to check the structure of your data against the examples in the help. 
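
As a sketch of the "drop and sort" advice, the snippet below keeps only the ID and item columns and sorts by them before any analysis. It is Python for illustration only; the column names, values, and the helper keep_and_sort are hypothetical. In SAS the usual counterparts would be the KEEP= or DROP= data set options and PROC SORT.

```python
from operator import itemgetter

# Hypothetical transaction rows carrying extra columns the analysis does not need.
rows = [
    {"customer_id": 102, "item": "milk", "notes": "long free text", "region": "W"},
    {"customer_id": 101, "item": "milk", "notes": "long free text", "region": "E"},
    {"customer_id": 101, "item": "bread", "notes": "long free text", "region": "E"},
]

def keep_and_sort(rows, id_col, item_col):
    """Keep only the ID and item columns, then sort by ID and item."""
    slim = [{id_col: r[id_col], item_col: r[item_col]} for r in rows]
    return sorted(slim, key=itemgetter(id_col, item_col))

slim_rows = keep_and_sort(rows, "customer_id", "item")
for r in slim_rows:
    print(r["customer_id"], r["item"])
```

Physically removing the extra columns, rather than only marking them rejected, is what shrinks the data that later steps have to sort and store.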

If none of this helps, post the results of running the CONTENTS procedure against your data along with the first several observations so we can suggest next steps. 

 

Hope this helps!

Doug 


4 REPLIES
WendyCzika
SAS Employee

The EM Reference Help has this information, examples, etc.

Mike90
Quartz | Level 8

There is some information if you select 'Help' from the program itself.

 

I have met all the requirements there.

 

When I click on the '...' by Variables in the Association Node, it only lists 4 variables.   It has selected an ID and a target from the 4.

I have no option to select any other input variables.

 

If I hook a DMDB node to the Data Source node, and then change the role to Train in the data source, pressing '...' in the DMDB node gives a list of ALL the variables.

 

Why are the variables not available in the Association Node?

 

Mike90
Quartz | Level 8

I tried to run the Association Node anyway, realizing the results would be meaningless with just an ID and Target.

I got the following error running the node.

 

NOTE: There were 67216 observations read from the data set EMWS2.IDS_DATA.
NOTE: The data set WORK.SORTEDTRAIN has 67216 observations and 2 variables.
ERROR: Insufficient space in file WORK.SORTEDTRAIN.DATA.

 

 

 

