Text Miner stores the term-by-document frequency matrix in a compressed representation. You will find it as an OUT data set in the project data directory of your Text Miner run; its label includes the string "OUT". Because a 30,000-document collection can have as many as 500,000 to a million distinct terms, be sure to restrict your terms of interest with a start list. The following code gives an example of creating the co-occurrence matrix: it expands the compressed version to an uncompressed version and then computes the co-occurrence counts with proc corr and the sscp option.
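
For readers who want to see the arithmetic outside SAS, here is a minimal Python sketch of the same idea: expand a compressed list of (term, document, count) entries into a full matrix, then form the sums of squares and cross-products. The function names `expand` and `sscp`, the triple layout, and the toy data are all assumptions made for illustration; they are not the layout of the Text Miner OUT data set, and this is not the SAS code itself. The sketch turns the counts into 0/1 indicators first so that each cell counts the documents in which both terms appear; run on the raw frequencies, the same computation gives sums of products of counts instead.

```python
def expand(triples, n_docs, n_terms):
    """Expand sparse (term, doc, count) triples into a dense doc-by-term matrix."""
    dense = [[0] * n_terms for _ in range(n_docs)]
    for term, doc, count in triples:
        dense[doc][term] = count
    return dense


def sscp(matrix):
    """Uncorrected sums of squares and cross-products, i.e. X'X, term by term."""
    n_terms = len(matrix[0])
    result = [[0] * n_terms for _ in range(n_terms)]
    for row in matrix:
        for i in range(n_terms):
            if row[i] == 0:
                continue  # a zero entry contributes nothing to row i of X'X
            for j in range(n_terms):
                result[i][j] += row[i] * row[j]
    return result


# Hypothetical toy data: 3 terms, 3 documents, given as (term, doc, count).
triples = [(0, 0, 2), (1, 0, 1), (0, 1, 1), (2, 1, 3), (1, 2, 1), (2, 2, 1)]

dense = expand(triples, n_docs=3, n_terms=3)

# Assumption: replace counts by 0/1 indicators so cells count shared documents.
binary = [[1 if count else 0 for count in row] for row in dense]

cooc = sscp(binary)
for row in cooc:
    print(row)
```

The diagonal of `cooc` holds the number of documents containing each term, and the off-diagonal cells hold the number of documents containing both terms. The dense matrix has one cell per document-term pair, which is why restricting the terms first matters at the collection sizes mentioned above.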