SAS Office Analytics, SAS Add-In for Microsoft Office, and other integrations

Can SAS read an HTML table?

This is an Excel file that has been saved with an .html extension.

Re: Can SAS read an HTML table?

Posted in reply to steve_citi
Yes, you can parse an HTML-formatted file using DATA step logic: INFILE and INPUT statements combined with IF/THEN logic. Check the forum archives for prior posts on this topic.

Scott Barry
SBBWorks, Inc.

Re: Can SAS read an HTML table?

Posted in reply to steve_citi
I need to clarify that when you use ODS HTML to create a file that Excel can open, you are NOT creating a "true, binary" Excel file. ODS HTML creates an ASCII text file with HTML tags to delimit the table and cells within the table.

Starting with Excel 97, Microsoft has built into Office the smarts for how an HTML page can be rendered in Word and Excel. Either Office product will open an HTML file. However, you have to go to FILE --> OPEN and explicitly select a file type of HTML in order to get Word or Excel to "see" the HTML file. If you look at the HTML file with Notepad, you will see HTML tags like <TABLE> and <TD>.

One way around this explicit FILE/OPEN technique is to give the file an ".XLS" extension, which "fools" the Windows Registry into opening the HTML file with Excel. This doesn't change what's INSIDE the file, only how the Windows Registry treats the file. If the file were left with a ".HTML" extension, then a browser would most likely open the file if you just double-clicked it.
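To make the "extension vs. contents" point concrete, here is a minimal sketch (in Python rather than SAS, purely as an illustration) that looks at the first bytes of a file instead of trusting its name. The 8-byte signature is the one used by true binary .xls files (OLE2 compound files); the file name `report.xls` and the helper `sniff_format` are hypothetical names made up for this example.

```python
import os
import tempfile

# Signature at the start of a true binary .xls file (OLE2 compound file).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def sniff_format(path):
    """Guess the real format from the leading bytes, ignoring the extension."""
    with open(path, "rb") as f:
        head = f.read(512)
    if head.startswith(OLE2_MAGIC):
        return "binary-xls"
    lowered = head.lower()
    if b"<html" in lowered or b"<table" in lowered:
        return "html"
    return "unknown"

# Demo: an HTML table written under a hypothetical name with an .xls extension.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "report.xls")
with open(demo_path, "w", encoding="ascii") as f:
    f.write("<html><body><table><tr><td>1</td></tr></table></body></html>")

# The extension says Excel, but the bytes are still HTML text.
print(sniff_format(demo_path))
```

Renaming the file changes which program Windows launches, but a check like this still reports HTML because the contents are untouched.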

You cannot use PROC IMPORT to IMPORT an HTML file into SAS. If, however, you RESAVE the HTML file from Excel as either CSV or true .XLS, then you could use PROC IMPORT to read the file.

One of the downsides of trying to read the HTML file back into SAS is that the HTML file contains a lot of "extra" title, footnote and style information that you may have to "parse" over. (Since a SAS dataset does not contain titles, footnotes, colors or fonts, none of the style information in the HTML file would carry over to the SAS dataset.)
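As a rough illustration of what "parsing over" the extra material involves, here is a small sketch in Python (not SAS) using only the standard library. It rests on several assumptions that may not hold for a real file: the sample HTML is invented for the example, titles and footnotes are assumed to sit in their own small tables, tables are assumed not to be nested, and the data table is assumed to be the one with the most rows. `TableGrabber` is a hypothetical helper name.

```python
from html.parser import HTMLParser

class TableGrabber(HTMLParser):
    """Collect cell text row by row for each <table>; style attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows; each row is a list of cells
        self._row = None   # row currently being built, or None
        self._cell = None  # text pieces of the cell currently open, or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TABLE> and <table> both match.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example <style> content) is dropped.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

# Invented sample: a title table, a data table, and a footnote table.
SAMPLE = """
<html><head><style>.header {font-weight: bold}</style></head>
<body>
<table><tr><td class="systitle">Sales Report</td></tr></table>
<TABLE>
<TR><TH class="header">Region</TH><TH class="header">Sales</TH></TR>
<TR><TD>East</TD><TD>100</TD></TR>
<TR><TD>West</TD><TD>250</TD></TR>
</TABLE>
<table><tr><td class="sysfooter">Internal use only</td></tr></table>
</body></html>
"""

parser = TableGrabber()
parser.feed(SAMPLE)

# Assumption: the data table is the table with the most rows.
data_table = max(parser.tables, key=len)
header, *rows = data_table
print(header)
print(rows)
```

Only the cell text survives; the title, footnote, class names and stylesheet are left behind, which mirrors the fact that a SAS dataset has no place to store them.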
