I am hesitant to respond because I may not have the best answer to your question. However, since you posted so long ago, I felt like you deserved a response.
I could not find a documented limit on document size. I have converted English and German PDF files of more than 500 pages. The converted text appears to be UTF-8 encoded, and it appears to be held in a VARCHAR (variable-length character) variable. The older SAS Text Miner software does not support the VARCHAR data type, so there the PDF file is converted and written to disk. SAS Text Miner then reads the complete file from disk, even if it has more than 32,767 characters, which is the maximum size of a fixed-length character variable. In both SAS Text Miner and SAS Visual Text Analytics, no document truncation occurs unless you force it. Viewing and scrolling through very large documents can be challenging, though.
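If you want to know ahead of time which of your converted documents are larger than a fixed-length character variable could hold, a quick check outside of SAS might look like the sketch below. This is only an illustration in Python, not anything SAS does internally: `size_report` and `FIXED_CHAR_LIMIT` are names I made up, and I am assuming the 32,767 limit is counted in bytes of UTF-8 text, which matters for German text because characters such as ä or ß take two bytes each.

```python
# Hypothetical helper for illustration only; not part of any SAS product.
# Assumption: the 32,767 limit is measured in bytes of UTF-8 encoded text.
FIXED_CHAR_LIMIT = 32_767


def size_report(text: str) -> dict:
    """Report character count, UTF-8 byte count, and whether the text
    would not fit in a fixed-length character variable."""
    n_chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    return {
        "chars": n_chars,
        "utf8_bytes": n_bytes,
        "exceeds_fixed_limit": n_bytes > FIXED_CHAR_LIMIT,
    }


if __name__ == "__main__":
    # Made-up sample text; replace with the contents of a converted document.
    sample = "Größe und Länge " * 3000
    print(size_report(sample))
```

Documents flagged this way are not a problem for the conversion itself, as described above, but they are the ones most likely to be slow to view or scroll through.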
Give it a try. Nothing succeeds like success.