Mike90,
You would typically specify only 2 or 3 roles when using an Association node. If you have SAS Enterprise Miner open, click on
Help --> Contents
and then navigate to
Node Reference
Explore
Association Node
in the panel on the left and then navigate to Association Node Data Set Requirements in the panel on the right where you will see
/*** BEGIN EXCERPT ***/
Input Data Format
To perform association discovery, the input data set must have a separate observation for each product purchased by each customer, as illustrated in the following table. You must assign the ID role to one variable and the target model role to another variable when you create the data source.
To perform sequence discovery, the input data set must have a separate observation for each product purchased by each customer at each visit. In addition to assignment of ID and target roles, you must apply a sequence model role to a time stamp variable. The sequence variable is used for timing comparisons. It can have any numeric value including date/time values. The time or span from observation to observation in the input data set must be on the same scale.
/*** END EXCERPT ***/
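The table the excerpt refers to is not reproduced here. As a rough sketch of the association-discovery layout it describes (one observation per customer per product), here is a small DATA step. The data set name, variable names, and values are all made up for illustration:

```sas
/* Hypothetical example data: one row per customer per product purchased. */
/* CUSTOMER_ID and PRODUCT are illustrative names, not from your data.    */
data work.trans_example;
    input customer_id product $;
    datalines;
1 bread
1 milk
2 bread
2 eggs
2 milk
;
run;

/* Print the rows to confirm the one-row-per-item structure. */
proc print data=work.trans_example;
run;
```

In a data source built on something like this, customer_id would get the ID role and product the Target role. For sequence discovery you would also need a numeric visit or time stamp variable with the Sequence role.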
If you are getting errors, there are several possible causes, including the following:
Possible Data issues:
* The Input Data Source does not have the role set to Transaction
* The data is not in transactional form (many rows per ID, one per product/item)
* The data does not support Association analysis
Possible System Issues:
* running out of disk space
* insufficient RAM
* variables assigned inappropriately
* compression enabled on the disk
If you look at the error in the log, you will see
ERROR: Insufficient space in file WORK.SORTEDTRAIN.DATA.
You also mentioned that there are many variables that are not being displayed. If those variables are particularly long, they could be making your data set enormous. Association analysis needs only the 2 or 3 roles described above, so all of these additional variables make the data set much larger than necessary. Given that the log reports insufficient space, drop (don't just 'Reject') all of the unnecessary variables and make sure what remains has the appropriate structure for the Association node.
You also might want to verify that you have plenty of disk space where the data is stored. If it is on an external USB drive, plan to move it to an internal disk. Sorting your data first also reduces the impact on system resources. Be sure to check the structure of your data against the examples in the help. If none of this helps, post the results of running the CONTENTS procedure against your data along with the first several observations so we can suggest next steps.
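As a minimal sketch of those steps, assuming a libref MYLIB is already assigned to a location with enough free space, and assuming your transaction table is called TRANS with variables CUSTOMER_ID, PRODUCT, and VISIT_TIME (all of these names are placeholders, so substitute your own):

```sas
/* All data set and variable names below are hypothetical placeholders. */

/* Keep only the variables the Association node needs (ID, target, and */
/* optionally a sequence variable); everything else is dropped.        */
data mylib.trans_small;
    set mylib.trans (keep=customer_id product visit_time);
run;

/* Sort by ID (and by the sequence variable, if you use one). */
proc sort data=mylib.trans_small;
    by customer_id visit_time;
run;

/* Describe the reduced data set: variable names, types, lengths, and */
/* the number of observations.                                        */
proc contents data=mylib.trans_small;
run;

/* Show the first several observations. */
proc print data=mylib.trans_small (obs=10);
run;
```

The output of the last two steps is what would be useful to post here. Using KEEP= on the input data set means the unneeded variables are never read into the new table, which is different from leaving them in place and setting them to 'Reject'.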
Hope this helps!
Doug