
Is there a way to break up a PROC TMSPELL run into a set of smaller tasks?

I'm trying to run proc tmspell on a file that has almost 2 million entries.  The proc has been running for the past 20 hours and I have no way of knowing which will come first: the proc's completion or a power outage.

Is it possible to break such a run up into sections so that one can take advantage of parallel processing?  I haven't been able to find any documentation for the proc, so I am just assuming it can help with our task.  Specifically, we are trying to create two crosswalks: one for spelling and another that is basically a list of synonyms.

TIA,

Art


Accepted Solution
08-17-2017 01:46 PM, SAS Employee

Re: Is there a way to break up a PROC TMSPELL run into a set of smaller tasks?

There is no way to split the procedure into multiple sections for parallel or multi-threaded execution. The best practice for running PROC TMSPELL on a large data set is to do some pre-processing first. For example, subset the terms table that PROC TGPARSE outputs so that it contains only the terms that have a Keep status of Yes. This should considerably shrink the number of terms run through PROC TMSPELL and should therefore shorten the run time substantially.
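
The subsetting step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: work.terms is a hypothetical name for the terms table written by PROC TGPARSE, work.terms_keep is a hypothetical name for the result, and the Keep status is assumed to be stored in a character variable named KEEP with the value 'Y'. Checking the actual variable names and values with PROC CONTENTS before filtering is a good idea.

```sas
/* Hypothetical names: work.terms stands in for the terms table        */
/* written by PROC TGPARSE. Inspect it first to confirm the name and   */
/* type of the variable that holds the Keep status.                    */
proc contents data=work.terms;
run;

/* Assumption: the Keep status is a character variable named KEEP      */
/* whose value is 'Y' for kept terms. Adjust to match the real table.  */
data work.terms_keep;
    set work.terms;
    where upcase(keep) = 'Y';
run;
```

The smaller table would then be supplied to PROC TMSPELL in place of the full terms table. The procedure's own statements and options are not shown here because they are not given in this thread.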
