05-27-2013 05:39 PM
I recently installed SAS Enterprise Content Categorization Server in our UNIX environment, and I also have the Studio client for it.
Can anyone tell me how to access the directories on the server (the source directory that contains all the documents) and use them in Enterprise Content Categorization Studio to create projects and upload them to the server?
06-20-2013 04:56 PM
Thanks for the information.
Could you please let me know how SAS Enterprise Content Categorization Studio depends on Text Miner?
I understand that we can generate Boolean rules and concepts in SAS ECC Studio and apply them to documents or any files accessible to ECC Studio. Can this be executed in batch mode?
Also, can the rules or concepts generated in ECC Studio be applied to any models?
Looking forward to your reply.
06-20-2013 05:41 PM
In the current configuration, there are no specific dependencies, but there are definite advantages to using the two products together.
For example, entities can be created in SAS Enterprise Content Categorization and used as custom entities in SAS Text Miner. I think of this as automated discovery that also includes things you specifically want to find (i.e., custom entities, such as product codes). For those who don't have the full capabilities of SAS Enterprise Content Categorization, SAS offers an add-on to Text Miner, SAS Concept Creation for SAS Text Miner, specifically for creating custom entities to include in text mining discovery. Here is a recent SGF paper on that topic: http://support.sas.com/resources/papers/proceedings13/100-2013.pdf Of course, SAS Enterprise Content Categorization does a lot more than this add-on.
In SAS Text Miner, you can automatically discover Boolean rules to create an initial content categorization rule set, which you can then refine in SAS Enterprise Content Categorization Studio. Some highlights of this capability, new in SAS Text Miner 12.1, can be seen in the short product demo video on the page "Text Mining, SAS Text Miner | SAS".
Linguistic rule development in Enterprise Content Categorization is not done in batch; it is user-specified (hence the benefit of an initial rule set automatically generated by Text Miner). That being said, SAS Enterprise Content Categorization has some methods for automated taxonomy generation, for example from Wikipedia and DBpedia. Scoring with models developed in any of the SAS Text Analytics products can be done in batch. The models/rules resulting from SAS Enterprise Content Categorization are in XML format, for scoring with the Enterprise Content Categorization Server. Once documents are scored, the results are new metadata that can be used to enhance existing enterprise search and content management systems, and as new variables for further exploration, reporting, and analysis.
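To make the idea of batch scoring with Boolean rules a bit more concrete, here is a minimal, generic Python sketch. It is not the SAS Enterprise Content Categorization API or its rule syntax; the helper names (`any_of`, `all_of`, `score_documents`), the category names, and the sample documents are all hypothetical and only illustrate how scored categories become new variables attached to each document.

```python
import re
from typing import Callable, Dict, List, Set

# Hypothetical rule type: a predicate over the set of lowercase tokens in a document.
# This is an illustration only, not the ECC rule language.
Rule = Callable[[Set[str]], bool]


def any_of(*terms: str) -> Rule:
    """Rule that matches when at least one term is present (Boolean OR)."""
    return lambda tokens: any(t in tokens for t in terms)


def all_of(*terms: str) -> Rule:
    """Rule that matches when every term is present (Boolean AND)."""
    return lambda tokens: all(t in tokens for t in terms)


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_documents(docs: Dict[str, str], rules: Dict[str, Rule]) -> List[Dict[str, object]]:
    """Apply every rule to every document; return one row per document
    with a 0/1 flag per category (the 'new variables')."""
    rows = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        row: Dict[str, object] = {"doc_id": doc_id}
        for category, rule in rules.items():
            row[category] = int(rule(tokens))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up sample documents and categories, for illustration only.
    docs = {
        "a.txt": "The battery drains fast and the charger is broken",
        "b.txt": "Great screen, fast delivery",
    }
    rules = {
        "power_issue": any_of("battery", "charger"),
        "delivery": all_of("fast", "delivery"),
    }
    for row in score_documents(docs, rules):
        print(row)
```

The resulting rows could be written to a table and joined with other data for reporting or analysis, which is the role the scored metadata plays in the description above.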