PDF to SAS

Contributor
Posts: 21

PDF to SAS

I have 500 PDF files that I need to convert to SAS datasets. I found some code that converts one PDF file to one SAS dataset, but it is pretty complex and handles only one file at a time. Is there a simple proc that can convert all 500 PDF files quickly?

Thanks

Grand Advisor
Posts: 17,396

Re: PDF to SAS

[ Edited ]

No.

PDF files are not easily readable by any system :(

 

EDIT:

To clarify, there's no simple proc. Your best bet, as indicated, is to save the data to a text or other machine-readable file.

Personally, I would purchase a one-month subscription to Adobe and use Adobe Pro to convert them. If you have Adobe Professional (most big corps do), you can batch process all 500 in a script. Adobe has an Automator feature that works well IMO.

Regular Contributor
Posts: 211

Re: how to convert many PDF to SAS dataset

You may want to look into converting the PDFs into a file format that SAS can access, such as Excel.

Here is a link describing one such option: https://wagda.lib.washington.edu/gishelp/tutorial/excel.html

 

Hope this helps,

Ahmed

Respected Advisor
Posts: 3,837

Re: how to convert many PDF to SAS dataset

To add to what @AhmedAl_Attar posted:

PDF files as such are not "tabular", so there is not really a direct conversion path. Apache Tika would allow you to convert your PDFs into text-based documents (I've done that myself; it works really well and is simple to use), which you could then read into SAS.

There is also Apache PDFBox, which apparently can do PDF to CSV conversions. I have never used it, though.

 

https://tika.apache.org/ 

http://pdfbox.apache.org/ 
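
To make the Tika route a bit more concrete, here is a minimal Python sketch of batch-extracting text from a folder of PDFs with the Tika command-line app. It is only an illustration under assumptions: the jar location (`tika-app.jar`), the folder names, and the helper functions are made up for this example, and it assumes Java is on the PATH. The resulting .txt files would still need an ordinary SAS data step to parse them.

```python
import subprocess
from pathlib import Path

# Assumption: location of a downloaded Tika "app" jar; adjust to your setup.
TIKA_JAR = "tika-app.jar"


def tika_command(pdf_path, jar=TIKA_JAR):
    """Build the command that asks the Tika app to print a PDF's plain text."""
    return ["java", "-jar", str(jar), "--text", str(pdf_path)]


def plan_batch(pdf_dir, out_dir):
    """Pair each PDF in pdf_dir with its Tika command and target .txt path."""
    plan = []
    # Note: the pattern is case-sensitive on most non-Windows systems.
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = Path(out_dir) / (pdf.stem + ".txt")
        plan.append((tika_command(pdf), target))
    return plan


def run_batch(pdf_dir, out_dir):
    """Run Tika once per PDF and save whatever text it writes to stdout."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command, target in plan_batch(pdf_dir, out_dir):
        result = subprocess.run(command, capture_output=True, check=True)
        target.write_bytes(result.stdout)


if __name__ == "__main__":
    # Hypothetical folder names, for illustration only.
    run_batch("pdfs", "text_out")
```

Starting one Java process per file is slow but simple, and for 500 files it is probably acceptable. Scanned PDFs that contain only images will produce little or no text this way.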

Esteemed Advisor
Posts: 6,692

Re: PDF to SAS

Such a conversion will only make sense if the PDFs in question contain usable data. Since a PDF could also be one big graphical image (like a scan), it is one of the least suited formats for business intelligence data transfer.

I'd rather ask the originator to provide the data in a format that makes sense, along with metadata (column descriptions).

Grand Advisor
Posts: 10,223

Re: PDF to SAS

If the issue involves fillable PDF forms and the data they contain, use a proper PDF tool such as Adobe Acrobat Pro to export the data. That will usually produce some kind of data file that can be imported into SAS.

Contributor
Posts: 21

Re: PDF to SAS

This is how I decided to handle my data.

1. Convert the PDFs to Excel using Adobe Acrobat Professional, which lets me convert hundreds of PDFs to Excel with just one "click".

2. Read the Excel files into SAS using a macro.
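
For step 2, one possible alternative to a hand-written macro is to generate the import code. The sketch below is a hedged Python illustration, not the macro actually used here: it writes one PROC IMPORT step per workbook in a folder. The folder name, the `work` libref, and the helper names are assumptions, and DBMS=XLSX requires SAS/ACCESS Interface to PC Files on the SAS side.

```python
import re
from pathlib import Path


def sas_name(stem):
    """Turn a file stem into a valid SAS dataset name (32 characters max)."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", stem)
    # SAS names must start with a letter or an underscore.
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name[:32]


def proc_import(xlsx_path, libref="work"):
    """Return the text of one PROC IMPORT step for a single workbook."""
    dataset = sas_name(Path(xlsx_path).stem)
    return (
        f'proc import datafile="{xlsx_path}"\n'
        f"    out={libref}.{dataset} dbms=xlsx replace;\n"
        "run;\n"
    )


def build_program(xlsx_dir):
    """Concatenate import steps for every .xlsx file found in xlsx_dir."""
    files = sorted(Path(xlsx_dir).glob("*.xlsx"))
    return "\n".join(proc_import(f) for f in files)


if __name__ == "__main__":
    # Hypothetical folder of converted workbooks, for illustration only.
    Path("import_all.sas").write_text(build_program("excel_out"))
```

The generated import_all.sas can then be run in SAS as a normal program. File names that differ only in punctuation, or only after the 32nd character, would map to the same dataset name and overwrite each other, so it is worth checking the names first.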