ybz12003
Rhodochrosite | Level 12

Hi, all:

 

I need some help. Please see the attached PDF. The last column of Table 1 gives the 95% CI for persons living with diagnosed HIV infection. I would like to split the lower and upper confidence limits into separate columns (such as column A and column B). The end goal is to get the data into a SAS data set. It could be converted to a Word or Excel file first (see the attached Excel file), or maybe read in directly with some SAS code. Any ideas? Thanks.

 

Warm regards,

Yingtao

 

 

ballardw
Super User

Go to the CDC MMWR archive; 2015 is at http://www.cdc.gov/mmwr/index2015.html

Open the article of interest. The pages are HTML friendly, so you can highlight the table and copy and paste it into Excel. Each cell comes across as it appears on the website, so it would likely be best to save to CSV and read that with a data step, which gives more control than importing from Excel, especially with the footnote indicators in some of the cells.

Or use File > Save As to save the whole article to a local HTML file.
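
As a rough sketch of the "save to CSV and read with more control" route: the data step below reads comma-separated lines and strips everything except digits, periods, commas, and hyphens from the CI cell before splitting it. The column layout (a label followed by the CI), the variable names, and the values are all made up for illustration, and it assumes the range separator is a plain hyphen. If the pasted text uses an en dash instead, it would need to be translated first (for example with TRANWRD), and footnote indicators that are themselves digits would need separate handling.

```sas
/* Illustrative only: made-up rows standing in for a CSV saved from Excel. */
/* With a real file, replace "datalines" with the path and add firstobs=2  */
/* if there is a header row.                                               */
data hiv_ci;
  infile datalines dsd truncover;
  length group $60 ci_raw $60 ci_clean $60;
  input group $ ci_raw $;
  /* "k" keeps the listed characters, "d" adds digits to the list, so      */
  /* parentheses, spaces, and footnote symbols such as * are dropped.      */
  ci_clean = compress(ci_raw, "-.,", "kd");
  /* COMMA32. reads values with or without thousands separators. */
  lower = input(scan(ci_clean, 1, "-"), comma32.);
  upper = input(scan(ci_clean, 2, "-"), comma32.);
  drop ci_clean;
datalines;
Group A,"(123.2 - 345.32)*"
Group B,"(1,234 - 5,678)"
;
run;
```

The DSD option is what lets a quoted cell contain commas without being split into extra fields.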

Reeza
Super User

Get Adobe PDF Professional. It has a tool for copying and pasting or extracting tables. Or use a third-party tool. This isn't something SAS would be very good at extracting.

 

 

ybz12003
Rhodochrosite | Level 12

Not approved by my manager.

RW9
Diamond | Level 26

Your biggest problem is going to be getting the data from an output format (PDF in this case) into a usable format. If you can select the column and copy and paste it, then it's fine, and splitting it is a simple process:

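data want;
  length ci $20;
  /* "¬" never occurs in the data, so each whole line is read into ci */
  infile datalines dlm="¬";
  input ci $;
  /* Strip parentheses and spaces, then split on the hyphen. */
  /* lower and upper are character; wrap in input(...,best32.) for numeric */
  lower=scan(compress(ci,"( )"),1,"-");
  upper=scan(compress(ci,"( )"),2,"-");
datalines;
(123.2 - 345.32)
(34.1 - 451.12)
;
run;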

The problem is selecting that data in the PDF, which is a document format, not a data transfer format. Can you not get the source data? If not, then as @Reeza suggested, get the Adobe tool. Or, if you're very lucky and the table can be copied to Excel (in a validated way, since the data might change formatting, etc.), then you could select and copy from there.
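
One possible way to do the "validated" part after a copy through Excel is to flag any row whose bounds fail to convert to numbers or come out in the wrong order. This is only a sketch: the data set name, the suspect flag, and the values are made up, and the third row imitates a cell that Excel has reformatted as a date.

```sas
/* Illustrative check: flag CI cells that did not survive the copy intact. */
data check;
  length ci $40 ci_clean $40;
  infile datalines truncover;
  input ci $char40.;
  ci_clean = compress(ci, "( )");
  /* ?? suppresses log notes and returns missing when conversion fails. */
  lower = input(scan(ci_clean, 1, "-"), ?? best32.);
  upper = input(scan(ci_clean, 2, "-"), ?? best32.);
  /* Suspect if either bound is missing or the bounds are out of order. */
  suspect = (missing(lower) or missing(upper) or lower > upper);
  drop ci_clean;
datalines;
(123.2 - 345.32)
(34.1 - 451.12)
(Mar-12)
;
run;

/* List only the rows that need a manual look against the PDF. */
proc print data=check;
  where suspect;
run;
```

A check like this does not prove the remaining values are right, so comparing a few rows against the PDF by eye is still worthwhile.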
