Getting the page numbers from PDF

Occasional Contributor
Posts: 8

Hi all,

Please help me with this.

I have a dataset with a variable NAME.









The observations above appear in a PDF document:

1. Alice appears on pages 4 and 5.
2. James appears on page 3.
3. Jeffrey appears on pages 1, 2 and 3.
4. Joyce appears on pages 2 and 3.
5. Barbara appears on page 1.

So I want an output dataset like this:

  NAME      NUMBER
  Alice     4, 5
  James     3
  Jeffrey   1, 2, 3
  Joyce     2, 3
  Barbara   1


Note: if there is more than one page number, they should be separated with ", " in the output.

Thanks in advance.




Super User
Posts: 9,599

Re: Getting the page numbers from PDF

There is no way to know up front how the PDF destination will lay out the document.  I have seen some very long and complicated code floating around the internet that attempts to guess, but even then it is only right about half the time.  I would suggest you work out the page assignment from your data before outputting.  There are many ways to do this, for example:

- Assume each data row is one row of the output, then divide _n_ by the number of rows you want on each page, e.g.:

data want;
  set have;
  pge = ceil(_n_ / 20);  /* 20 rows per page is only an assumed example value */
run;

This gives you a variable with page numbers; in your PROC REPORT you then break on the PGE variable.  You can make that as simple or as complicated as you wish.
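As a rough sketch of the approach described above, the step below assigns PGE from _n_ and then forces a page break on it in PROC REPORT. The input rows, the rows-per-page value of 2, the macro variable name rows_per_page and the file name report.pdf are all assumptions made for illustration, not details from the original post.

```sas
/* Hypothetical input: one row per line of the report */
data have;
  input name :$20.;
  datalines;
Alice
James
Jeffrey
Joyce
Barbara
;
run;

%let rows_per_page = 2;  /* assumed value, pick what fits your layout */

data want;
  set have;
  pge = ceil(_n_ / &rows_per_page);  /* rows 1-2 get page 1, rows 3-4 page 2, ... */
run;

ods pdf file="report.pdf";  /* hypothetical output path */
proc report data=want;
  column pge name;
  define pge  / order noprint;  /* drives the page breaks, not printed */
  define name / display;
  break after pge / page;       /* start a new page after each PGE value */
run;
ods pdf close;
```

Because the page breaks are forced from PGE, the page on which each row lands is known in the data before anything is rendered, provided each group actually fits on one physical page.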

However, why have a list of pages?  That doesn't sound usual to me.  I almost always see the page number, and sometimes the total number of pages, which is why the inline codes ^{thispage} and ^{lastpage} exist: they let the software that renders the output (PDF, RTF, etc.) decide on the page numbering.  At the end of the day, page numbering is only there to check that you have a complete output; any logical test should be done on the data, not on where it appears in the output.
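For reference, a minimal sketch of the renderer-side page numbering mentioned above. It assumes '^' is set as the ODS escape character; the file name report.pdf and the use of SASHELP.CLASS as demo data are illustrative choices only.

```sas
ods escapechar = '^';  /* makes ^{...} an inline ODS function */
footnote j=r "Page ^{thispage} of ^{lastpage}";

ods pdf file="report.pdf";  /* hypothetical output path */
proc print data=sashelp.class noobs;
run;
ods pdf close;

footnote;  /* clear the footnote afterwards */
```

The page numbers here are resolved by the destination when the file is written, so they are not available as values inside a DATA step.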

Super User
Posts: 10,761

Re: Getting the page numbers from PDF

If your report is simple, you can count by hand how many rows there are on each page.

Once you have that number, you can easily work out who belongs to which page.
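Once each row has a page number, collapsing it into the requested comma-separated list could look roughly like the sketch below. The dataset names pages_long and page_list and the variable pge are hypothetical; the name/page pairs are typed in from the page placement given in the question.

```sas
/* Hypothetical input: one row per (name, page) occurrence */
data pages_long;
  input name :$20. pge;
  datalines;
Jeffrey 1
Barbara 1
Jeffrey 2
Joyce 2
James 3
Jeffrey 3
Joyce 3
Alice 4
Alice 5
;
run;

/* One row per distinct name/page pair, in page order within name */
proc sort data=pages_long out=pages nodupkey;
  by name pge;
run;

data page_list;
  length number $ 200;  /* assumed to be long enough for the list */
  retain number;
  set pages;
  by name;
  if first.name then number = ' ';
  number = catx(', ', number, pge);  /* append the page, separated by ", " */
  if last.name then output;
  keep name number;
run;
```

With this input, page_list should hold one row per name (sorted alphabetically by the PROC SORT), with NUMBER set to "4, 5" for Alice and "1, 2, 3" for Jeffrey.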
