To use the text search tools in Base SAS, you would first need to convert every file to a SAS data set before searching, and PDF is not a "nice" data format for that, so this is likely not practical.
I am not sure whether the SAS Text Miner, if you have access to that, might have more luck.
A secondary issue with PDF files is that the characters are not necessarily stored as the letters you would expect to make up the keyword you want to search for.
If you have Adobe Acrobat, it provides this capability: https://helpx.adobe.com/nz/acrobat/using/searching-pdfs.html
A Google search will find other tools and techniques.
SAS really isn't a good option for searching documents for key words unless you are doing analytics.
Hi @Emma2021 ,
You may want to invest in a search indexing tool for your Windows documents. These tools are purpose-built, and they probably do the job much better than hand-coded custom SAS.
Here is an example of such a tool: https://docfetcher.sourceforge.io/en/index.html (Disclaimer: I've never used this one, but a long time ago I used a similar indexing tool for searching through my saved docs.)
https://en.wikipedia.org/wiki/Desktop_search
https://sourceforge.net/software/desktop-search/
Hope this helps,
Ahmed
@Emma2021 wrote:
I don’t want to open each pdf to search the word. I thought SAS has a tool
To search for words in a .pdf, you need to open it and scan the text, whether you code that explicitly yourself or some "tool" does it for you in the background.
SAS Text Miner/Text Analytics allows for .pdf sources. If you've got that licensed then look into it.
If not, then using SAS you would need to create a directory listing of all .pdf files (path and name), call a third-party tool such as Tika to convert each .pdf to text, and then use SAS to search through the text for specific terms.
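For illustration only, here is a rough sketch of those three steps (list the PDFs, convert each to text, search the text), written in Python rather than SAS so it stays self-contained. The function names, the folder layout, and the location of `tika-app.jar` are all assumptions made for this example; the Tika call assumes the Tika app jar's `--text` option and a working Java install, and the extractor can be swapped for any other converter.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def list_pdfs(root: str) -> List[Path]:
    """Step 1: directory listing of every .pdf under root, including subfolders."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def tika_extract(pdf: Path, tika_jar: str = "tika-app.jar") -> str:
    """Step 2: convert one PDF to plain text with the Tika app jar.

    Assumption: tika_jar points to a locally downloaded Tika app jar
    and `java` is on the PATH.
    """
    result = subprocess.run(
        ["java", "-jar", tika_jar, "--text", str(pdf)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def search_pdfs(
    root: str,
    terms: Iterable[str],
    extract_text: Callable[[Path], str] = tika_extract,
) -> Dict[str, List[str]]:
    """Step 3: case-insensitive search of the extracted text.

    Returns a mapping of PDF path -> list of the terms found in that file.
    """
    terms = list(terms)
    hits: Dict[str, List[str]] = {}
    for pdf in list_pdfs(root):
        text = extract_text(pdf).lower()
        found = [t for t in terms if t.lower() in text]
        if found:
            hits[str(pdf)] = found
    return hits
```

A call such as `search_pdfs("C:/docs", ["invoice"])` (a hypothetical folder and term) would return only the files that contain the word. The same outline applies if you do the listing and the search in SAS and only shell out for the conversion step.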
You don't have to. That link I posted describes how you can catalog multiple PDFs to search all of them at the same time.
Hi @Emma2021,
@SASKiwi wrote:
If you have Adobe Acrobat, it provides this capability: https://helpx.adobe.com/nz/acrobat/using/searching-pdfs.html
Note that this capability is also available in the free Acrobat Reader, i.e., you don't need the paid Acrobat Pro software. Just open the "Advanced Search" dialog with Ctrl-Shift-F and you'll see the option to select a folder. The search will automatically include all subfolders of the selected folder.