Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

SAS Enterprise Miner very slow

Occasional Contributor
Posts: 11

I am walking through the Introduction to Enterprise Miner tutorial with the Census2000 dataset.  I am using the 32-bit edition and Enterprise Miner is very slow.  For example, exploring the data takes about 4-5 minutes.  I am running it on a laptop with 32 GB of memory and a Core i7 processor, so the hardware should be more than sufficient.  I am using SAS 9.4, Windows 8.1, and the most recent update of JRE 1.8.  The JRE uses at most about 4% of CPU time, so Enterprise Miner is apparently not taking advantage of parallel processing.  Any idea how to speed things up?  This dataset has just 33k tuples, so at this speed I doubt it can handle any "big data".


Accepted Solutions
Solution
‎08-07-2017 09:59 AM
SAS Employee
Posts: 179

Re: SAS Enterprise Miner very slow

Posted in reply to chenzhang

The data set you are describing is not particularly big and your machine is more than capable.  I would suggest you look at a few things:

 

1 - Where is the data being stored?   If it is on another machine connected via a network or on a USB-attached external drive, you could be experiencing issues due to slow I/O.  Data mining is a memory-intensive activity, and time lost to I/O can greatly slow down your ability to browse/explore the data.

 

2 - How full is your hard drive?   If you have limited disk space, you could be running into resource constraints which limit the amount of virtual memory available.  You can also run into issues if you have sufficient RAM but other applications are holding it in reserve.

 

3 - What is the recommended version of Java?   SAS Enterprise Miner is tested with specific versions of Java.  New Java updates are often not backward compatible, so upgrading Java can actually hurt your performance.  This is hard to avoid because in many cases Java is constantly prompting you to update.

 

4 - How old is your project?   Projects which have been in use for some time can start to slow down.  Building a new flow in a new project/diagram might improve performance.

 

5 - How are your variables defined/formatted?   SAS Enterprise Miner normalizes variables to have no more than 32 characters in the name and no more than 32 characters in the field.  It uses the internal normalized version of the variable for analysis.  You can run into problems separating levels if your variable levels do not differ in the first 32 characters.  You will also find that many exported text data sets have associated formats and/or lengths which are far greater than the field requires (e.g. a Yes/No variable with a length of 200 characters).  Space must be allowed for the full formatted length, so unnecessarily long fields can slow processing greatly (even if the field contents are relatively short).  Run the CONTENTS procedure against your data to determine if there are potential issues here.  For any field formatted with a length greater than 32, assess whether that format is necessary (assuming you are not doing text mining).
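
To make the two checks in item 5 concrete, here is a minimal sketch in Python (not SAS code and not part of Enterprise Miner). It assumes you have exported the distinct levels of a variable and the declared field lengths (for example, as reported by the CONTENTS procedure) into plain Python structures. The function names `find_level_collisions` and `find_overlong_fields` are made up for this illustration.

```python
from collections import defaultdict

LIMIT = 32  # the 32-character limit described in item 5


def find_level_collisions(levels, limit=LIMIT):
    """Return groups of distinct levels that share the same first `limit` characters."""
    groups = defaultdict(set)
    for level in levels:
        groups[level[:limit]].add(level)
    # Keep only prefixes that map to more than one distinct level.
    return {prefix: sorted(members)
            for prefix, members in groups.items()
            if len(members) > 1}


def find_overlong_fields(declared_lengths, rows, limit=LIMIT):
    """Return {field: (declared_length, longest_value)} for fields declared
    longer than `limit` whose stored values never need that much space."""
    report = {}
    for field, declared in declared_lengths.items():
        longest = max((len(str(row.get(field, ""))) for row in rows), default=0)
        if declared > limit and longest < declared:
            report[field] = (declared, longest)
    return report


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    levels = [
        "Married, spouse present, both employed full time",
        "Married, spouse present, both employed part time",
        "Single",
    ]
    print(find_level_collisions(levels))

    declared = {"owns_home": 200, "id": 8}
    rows = [{"owns_home": "Yes", "id": "1"}, {"owns_home": "No", "id": "2"}]
    print(find_overlong_fields(declared, rows))
```

Any level group reported by the first function would be indistinguishable after truncation to 32 characters. Any field reported by the second is a candidate for a shorter length or format.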

 

 

Let me know if any of these potential issues persist in your data/environment.

 

Cordially,

Doug



All Replies
SAS Employee
Posts: 38

Re: SAS Enterprise Miner very slow

Posted in reply to chenzhang

I'd suggest contacting SAS Technical Support so that they can help you examine the specifics of your environment and help you debug. Technical Support is included in your license and they'd be glad to help!

 

https://support.sas.com/en/technical-support.html

Super User
Posts: 19,772

Re: SAS Enterprise Miner very slow

Posted in reply to chenzhang

Is EM installed locally?

Occasional Contributor
Posts: 11

Re: SAS Enterprise Miner very slow

Yes, EM is installed locally.

Occasional Contributor
Posts: 11

Re: SAS Enterprise Miner very slow

Posted in reply to DougWielenga

Doug:

 

I freed about 20 GB of space on my SSD, which is the C: system drive, since it was quite full.  EM is now about 50% quicker.  I did not expect it to use virtual memory since I thought I had plenty of physical memory, but unsupervised learning could do that depending on the implementation.  I know Apache Spark does most of its work in memory, but maybe EM is different.  Now that I have your guidelines, I can look into a few more things to see if they improve performance further.
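
For anyone who wants to check drive headroom before and after a cleanup like this, here is a minimal Python sketch (not part of SAS or Enterprise Miner). The function name `free_space_report` and the 10% warning threshold are illustrative assumptions, not SAS recommendations. `shutil.disk_usage` is the standard-library call that reports total, used, and free bytes for a path.

```python
import shutil

WARN_FRACTION = 0.10  # assumed threshold for illustration, not a SAS guideline


def free_space_report(path="."):
    """Summarize total and free space for the drive that holds `path`."""
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "free_fraction": free_fraction,
        "low_space": free_fraction < WARN_FRACTION,
    }


if __name__ == "__main__":
    # Point this at the drive holding the SAS work area and the EM project.
    report = free_space_report(".")
    print(f"{report['free_gb']:.1f} GB free of {report['total_gb']:.1f} GB")
    if report["low_space"]:
        print("Free space is below the assumed 10% threshold.")
```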

 

Thanks,

Chen
