Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

SAS Enterprise Miner filter node extremely slow

Occasional Contributor
Posts: 5

Hello,

I'm running a very simple filtering operation with the SAS EM Filter node: the filter only keeps or drops rows based on whether a variable has a value (is non-missing) or not. All other filter options are set to None, so the node should only be doing a very simple operation. For some reason it takes around 20 minutes to perform an operation that in a normal SAS environment would take only a fraction of a second. The data set I'm using is quite large (~3 million rows, 100 columns), but the operation still shouldn't be difficult to perform. Any hints on how to make this faster, or how to do the filtering in a faster way?


Accepted Solutions
Solution
‎01-14-2016 05:28 AM
Occasional Contributor
Posts: 5

Re: SAS Enterprise Miner filter node extremely slow

I ended up implementing the filtering in the SAS Code node, as the Filter node does not seem to work. It is quite straightforward there:

 

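/* SAS Code node: read the imported data, keep matching rows, export as training data */
data &EM_EXPORT_TRAIN;
 set &EM_IMPORT_DATA;
 where conditions; /* "conditions" is a placeholder for the actual filter expression */
run;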
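
As a concrete illustration, here is a minimal sketch of that SAS Code node step with the generic condition replaced by the missing-value check described in this thread. VARIABLE is a placeholder for the actual column name, not a real variable; &EM_IMPORT_DATA and &EM_EXPORT_TRAIN are the macro variables the SAS Code node provides for its incoming data and its exported training data.

```sas
/* Sketch only: VARIABLE is a placeholder column name. */
data &EM_EXPORT_TRAIN;
 set &EM_IMPORT_DATA;
 /* keep only rows where VARIABLE has a value;
    MISSING() accepts both numeric and character variables */
 where not missing(VARIABLE);
run;
```

Because WHERE is applied as the rows are read, rows that fail the condition never enter the data step, so the step makes a single pass over the input.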



All Replies
Super User
Posts: 19,855

Re: SAS Enterprise Miner filter node extremely slow

The filter operation would then actually be copying that big data set over to a new, temporary data set, without the variable. This may also be happening over a network, which would slow things down. 3 million rows shouldn't be that big of a data set, though, so you may want to talk to your IT folks about tweaks to your system.
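
One way to check whether plain reading and copying is the bottleneck is to time the same subset outside Enterprise Miner. The following is a minimal sketch, assuming the data is reachable as INPUT_DATA and the filter column is VARIABLE (both are placeholder names, as is the output name work.filtered). The FULLSTIMER system option makes the log report real time and CPU time for each step.

```sas
options fullstimer;   /* write detailed timing for each step to the log */

/* Read and filter only: DATA _NULL_ creates no output data set */
data _null_;
 set INPUT_DATA;                 /* placeholder input name */
 where not missing(VARIABLE);    /* placeholder column name */
run;

/* Read, filter, and write a copy to the WORK library */
data work.filtered;
 set INPUT_DATA;
 where not missing(VARIABLE);
run;
```

Comparing the real time of the two steps in the log gives a rough split between the cost of reading the data and the cost of writing the filtered copy.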

 

Occasional Contributor
Posts: 5

Re: SAS Enterprise Miner filter node extremely slow

Thanks for the answer. Any idea what is so special about the Filter node that makes it so slow? In the same diagram I'm also loading data, doing data partitioning, building 2 regression models with forward variable selection, and completing a model comparison. All of those steps together take around 2 minutes to compute, but the Filter node, which does something much simpler, takes 10 times as long. What makes the Filter node so special compared to the other calculation nodes that it takes so much time? Loading the data with the Input Data node takes 30 seconds at most, so I don't think copying the data can take 20 minutes.

Occasional Contributor
Posts: 5

Re: SAS Enterprise Miner filter node extremely slow

Is there some other way to do filtering in EM? The filtering I'm trying to do is super simple, and as a data step it would be the following:

 

data OUTPUT_DATA;
 set INPUT_DATA;
 where not missing(VARIABLE);
run;

 

How can I implement this in EM? With the Filter node this takes around 17 minutes to compute, which is really not usable.
