Rucstat_huadli
Fluorite | Level 6
How can I read *.pdf documents using SAS?
Cynthia_sas
SAS Super FREQ
Hi!
This paper describes a process in which you first convert a PDF file to an ASCII text file and then read that text file with SAS. Since PDF is a proprietary format, the process the author describes makes sense. SAS creates PDF files; it does not read them in their native binary format:
http://www8.sas.com/scholars/05/SESUG_05/Proceedings/2005/Serendipity/SER10_05.PDF

One other possibility is that you want to read the data that was collected in a PDF form (an FDF file or an XFDF file), as described in this paper:
http://www2.sas.com/proceedings/sugi27/p032-27.pdf
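
For the form-data route, XFDF is an XML format, so one way to prepare it for SAS is to flatten the field names and values into a delimited text file first. The following is a minimal sketch in Python, not the method from the paper: the sample form and its field names (`age`, `name`) are made up for illustration, and only flat, top-level fields are handled (nested or hierarchical fields would need extra work).

```python
import csv
import io
import xml.etree.ElementTree as ET

# XFDF elements live in this XML namespace.
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

# Hypothetical sample form data, invented for this sketch.
SAMPLE = """<xfdf xmlns="http://ns.adobe.com/xfdf/">
  <fields>
    <field name="age"><value>42</value></field>
    <field name="name"><value>Pat</value></field>
  </fields>
</xfdf>"""


def xfdf_fields(xfdf_text):
    """Return {field name: value} for the top-level fields of an XFDF document."""
    root = ET.fromstring(xfdf_text)
    fields = {}
    for field in root.findall("x:fields/x:field", XFDF_NS):
        value = field.find("x:value", XFDF_NS)
        has_text = value is not None and value.text is not None
        fields[field.get("name")] = value.text if has_text else ""
    return fields


def fields_to_csv(fields):
    """Render one header row and one data row as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields.keys())
    writer.writerow(fields.values())
    return buf.getvalue()


if __name__ == "__main__":
    print(fields_to_csv(xfdf_fields(SAMPLE)))
```

The resulting CSV text can be saved to a file and read in SAS like any other delimited file.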

A third possibility involves printing the PDF document, scanning the printout with OCR software, saving the text file from the OCR scan, and then reading -that- file with SAS (this is a variation of the first possibility).

Good luck!
cynthia
gfjump
Calcite | Level 5

There are a variety of VB.NET PDF viewers available on the web that you can use to read PDFs in full. They also offer processing features such as zoom, crop, and scale. Most importantly, you can convert PDFs to various image formats, so reading a PDF shouldn't be a problem now.

cathyhill
Calcite | Level 5

So the process of reading a PDF document is, in essence, the process of decoding the PDF document to a bitmap? By the way, without using the Adobe Acrobat PDF document reader, is there any free source code we can use to view documents in a web application?

arronlee
Calcite | Level 5

Hi, Cathyhill.

I am using another PDF reader instead of the Adobe Acrobat PDF document reader. Also, using code to handle PDF reading is too complicated for me. So you could choose a toolkit that lets users customize its features to their own preferences. Remember to check its free trial package first if possible. I hope you succeed. Good luck.

Best regards,

Arron

Rucstat_huadli
Fluorite | Level 6
Thank you very much!
mannimanoj
Calcite | Level 5

The easiest and fastest way by far is to use the full version of Adobe Acrobat. Yes, it's expensive, around $800 for the license, but most companies will find at some point that they need to edit PDFs. You can also try online PDF-to-Excel converters (google it), but most only handle a small number of pages. There might be other, cheaper PDF editors around.

So basically, open the PDF in the full version of Adobe Acrobat and then choose File, Save As, and select Excel. From there it is plain sailing. All the other methods I've looked into are mega complicated and require lots of messing around.

jthy
Calcite | Level 5

I don't know which environment you are working in, but if it is Windows you might find the PDF-text-extractor useful. There is a free client-based version and a command-line version for USD 35. I went all in, invested the 35 dollars, and built a routine that creates txt copies of all PDFs in a directory structure, thereby enabling users to perform text searches and link back to the original PDF. Maybe that can serve as a starting point for you? Take a look at http://www.a-pdf.com/text/index.htm.
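
A routine like the one described above could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the poster's actual routine: it swaps in Poppler's `pdftotext` command-line tool in place of the A-PDF extractor and assumes it is installed and on the PATH. The function names are made up for the example. The txt copies mirror the source directory layout, so each text file maps back to its original PDF by path.

```python
import subprocess
from pathlib import Path


def txt_target(pdf_path, src_root, dst_root):
    """Map src_root/a/b.pdf to dst_root/a/b.txt so the original is easy to find."""
    relative = Path(pdf_path).relative_to(src_root)
    return Path(dst_root) / relative.with_suffix(".txt")


def run_pdftotext(pdf_path, txt_path):
    # Assumption: Poppler's pdftotext is installed and on PATH.
    # -layout asks it to keep the physical page layout as far as possible.
    subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), str(txt_path)],
        check=True,
    )


def mirror_pdfs_as_text(src_root, dst_root, convert=run_pdftotext):
    """Create a .txt copy of every *.pdf under src_root, mirrored under dst_root.

    Returns a list of (pdf_path, txt_path) pairs. Note that the *.pdf match
    is case-sensitive on case-sensitive file systems.
    """
    src_root = Path(src_root)
    converted = []
    for pdf in sorted(src_root.rglob("*.pdf")):
        target = txt_target(pdf, src_root, dst_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        convert(pdf, target)
        converted.append((pdf, target))
    return converted
```

The `convert` parameter is there so a different extractor can be plugged in without changing the directory walk. The resulting text files can then be searched, or read into SAS as plain text.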
