Text mining and content categorization

SAS Content Categorization - export does not include all results

New Contributor mry


Dear experts,

I am using SAS Content Categorization 12.2 for text analytics. My test set contains roughly 4,000 documents. To assess my results further, I would like to export them from SAS, and I am using the export function to do this (see the attached screenshots).

However, when I export the results, only about 700 documents show up in the resulting CSV file.

I can even see test results in SAS for specific documents that are then missing from the export.

Does anybody have an idea about the root cause?

Or is there a better way to do this export?
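
One way to narrow this down outside SAS is to list exactly which source documents never appear in the exported CSV, so the missing ones can be inspected for a common pattern. The following is only a minimal Python sketch: it assumes the export is a UTF-8 CSV whose first column holds the document name or path, and the function name and the column index are hypothetical, not part of SAS Content Categorization.

```python
import csv
from pathlib import Path, PureWindowsPath


def missing_from_export(source_dir, export_csv, name_column=0):
    """Return the source file names that never appear in the CSV column.

    Assumption: the column at index name_column holds the document
    name or its full path. Adjust it to match the real export layout.
    """
    source_names = {p.name for p in Path(source_dir).iterdir() if p.is_file()}

    exported = set()
    # utf-8-sig also tolerates a byte order mark at the start of the file.
    with open(export_csv, newline="", encoding="utf-8-sig") as fh:
        for row in csv.reader(fh):
            if len(row) > name_column:
                # Keep only the base name in case the export stores full
                # paths; PureWindowsPath accepts both \ and / separators.
                exported.add(PureWindowsPath(row[name_column].strip()).name)

    return sorted(source_names - exported)
```

Calling `missing_from_export` with the test set folder and the exported CSV returns a sorted list of file names. If the list is much longer than expected, the column index or the file encoding assumed here probably does not match the actual export.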

 

Thanks for your help and regards,

Fabian

2017-05-14_09-28-55.png, 2017-05-14_09-31-34.png, 2017-05-14_09-31-49.png
Super User

Re: SAS Content Categorization - export does not include all results

Is there any chance that the source documents originated on a Unix system, where file names are case sensitive? Exports of files with names like EA Management-5NSWQZZ7.txt and EA Management-5nswqzz7.txt could be overwriting one another, because Windows treats file names as case insensitive.

Or could source documents in different folders share the same name?
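
Both possibilities can be checked with a short script. The sketch below is only an illustration in Python (the function name is hypothetical): it walks a folder tree and groups files whose names are identical once case is ignored, which covers names that differ only in case as well as identical names in different folders.

```python
from collections import defaultdict
from pathlib import Path


def colliding_names(root):
    """Group files under root whose names match when case is ignored."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # casefold() makes the comparison case insensitive.
            groups[path.name.casefold()].append(str(path))
    # Keep only names that occur more than once.
    return {name: sorted(paths) for name, paths in groups.items() if len(paths) > 1}
```

An empty result means every file name under the folder is unique regardless of case, which would rule out this explanation.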

New Contributor mry

Re: SAS Content Categorization - export does not include all results

Hi, thanks for the reply. The files are located on a Windows server (the same server that runs SAS). The file names are also unique, and all files are in a single source directory.

One example of a file that does not show up in the export is "Unscored-VKFX8EDI.txt". I have attached it here for reference.

The strange thing, again, is that the test result shows up in SAS itself but is not visible in the export of test results (see the attached screenshot).


2017-05-16_06-59-20.png