I have 500 PDF files that I need to convert to SAS datasets. The code I have found for converting one PDF file into one SAS dataset is pretty complex, and it handles only one file at a time. Is there a simple proc that can do this and quickly convert all 500 PDF files?
Thanks
No.
PDF files are not easily readable by any system 😞
EDIT:
To clarify, there's no simple proc. Your best bet, as indicated, is to save the data to a text or other machine-readable file.
Personally, I would purchase a one-month Adobe subscription and use Adobe Acrobat Pro to convert the files. If you already have Acrobat Pro, as most big corporations do, you can batch process all 500 in a script. Adobe has an Automator feature that works well IMO.
You may want to look into converting the PDFs into a file format that SAS can read, such as Excel.
Here is a link describing one such option: https://wagda.lib.washington.edu/gishelp/tutorial/excel.html
Hope this helps,
Ahmed
To add to what @AhmedAl_Attar posted:
PDF files as such are not "tabular", so there is not really a direct conversion path. Apache Tika would allow you to convert your PDFs into text-based documents (I have done that myself; it works really well and is simple to use), which you could then read into SAS.
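For 500 files, the Tika step could be driven from a small script. What follows is only a rough sketch in Python, not something from a tested setup: it assumes Java is installed and that the Tika "app" jar has been downloaded, and the jar path and the function names (plan_conversions, tika_command, convert_all) are made up for illustration. It calls the Tika command line with its --text option once per PDF and saves the output as a .txt file of the same name.

```python
import subprocess
from pathlib import Path

# Assumption: location of a downloaded Tika "app" jar; adjust to your copy.
TIKA_JAR = Path("tika-app.jar")


def plan_conversions(pdf_dir, out_dir):
    """Pair every *.pdf in pdf_dir with a .txt target in out_dir."""
    pdf_dir, out_dir = Path(pdf_dir), Path(out_dir)
    # Note: the pattern is case-sensitive on Linux, so "X.PDF" is skipped.
    return [(pdf, out_dir / (pdf.stem + ".txt"))
            for pdf in sorted(pdf_dir.glob("*.pdf"))]


def tika_command(pdf, jar=TIKA_JAR):
    """Command line asking Tika to print the PDF's plain text to stdout."""
    return ["java", "-jar", str(jar), "--text", str(pdf)]


def convert_all(pdf_dir, out_dir):
    """Run Tika once per PDF and return the list of PDFs that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf, txt in plan_conversions(pdf_dir, out_dir):
        result = subprocess.run(tika_command(pdf), capture_output=True)
        if result.returncode == 0:
            # Write raw bytes so no output encoding has to be guessed here.
            txt.write_bytes(result.stdout)
        else:
            failed.append(pdf)
    return failed
```

Calling convert_all("pdfs", "text") (both folder names are placeholders) would leave one text file per PDF in the second folder. Those files are still free text rather than tables, so each layout needs its own parsing logic on the SAS side, for example a DATA step with INFILE.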
There is also Apache PDFBox, which apparently can do PDF-to-CSV conversions, although I have never used it.
Such a conversion will only make sense if the PDFs in question contain usable data. Since a PDF could also be one big graphical image (like a scan), it is one of the least suited formats for business intelligence data transfer.
I'd rather ask the originator to provide the data in a format that makes sense, along with metadata (column descriptions).
If the issue has to do with fillable PDF forms and the data contained therein, then use a proper PDF tool like Adobe Acrobat Pro to export the data. That will usually result in some form of data set that can be imported into SAS.