SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Importing HTML file.

Frequent Contributor
Posts: 76

Dear all,

I have an HTML report that contains many tables with different data structures. For some analysis I need to import the entire HTML file and create a separate SAS dataset for each table.

Using XPath I can get the path to each table, e.g. html/body/table[1], html/body/table[2].

My question is: how do I tell SAS which location in the file to import from?

I would appreciate it if someone could help with this.

Thanks.

Database Summary

Database Id | Database Name | RAC | Block Size | Begin Snap Id | End Snap Id | Instances In Report | Instances Total | Hosts In Report | Hosts Total | DB time (min) | Elapsed time (min)
2836634542 | INST1 | YES | 8192 | 4743 | 4744 | 2 | 2 | 2 | 2 | 0.43 | 60.27

Database Instances Included In Report

  • Listed in order of instance number, I#
I# | Instance | Host | Startup | Begin Snap Time | End Snap Time | Release | Elapsed Time (min) | DB time (min) | Up Time (hrs) | Avg Active Sessions | Platform
1 | INST1 | hostname1 | 02-Dec-14 14:09 | 22-Jan-15 16:00 | 22-Jan-15 16:59 | 11.2.0.2.0 | 59.60 | 0.28 | 1,226.83 | 0.00 | Linux x86 64-bit
2 | INST2 | hostname2 | 23-Nov-14 07:57 | 22-Jan-15 16:00 | 22-Jan-15 17:00 | 11.2.0.2.0 | 59.60 | 0.16 | 1,449.04 | 0.00 | Linux x86 64-bit

Report Summary

Cache Sizes

  • All values are in Megabytes
  • Listed in order of instance number, I#
  • End values displayed only if different from Begin values
Column groups, each with Begin and End: Memory Target | Sga Target | DB Cache | Shared Pool | Large Pool | Java Pool | Streams Pool | PGA Target; final column: Log Buffer
I# | Values
1 | 2,000 1,312 480 768 16 16 688 20.20
2 | 2,000 1,312 480 768 16 16 688 20.20
Avg | 2,000 1,312 480 768 16 16 688 20.20
Min | 2,000 1,312 480 768 16 16 688 20.20
Max | 2,000 1,312 480 768 16 16 688 20.20
Super User
Posts: 19,878

Re: Importing HTML file.

Is this a one time occurrence or do you need to update it frequently?

Actually, I would still recommend import.io (the downloaded application) instead. Its API generates a reasonably well-formed table in the majority of cases.

Frequent Contributor
Posts: 76

Re: Importing HTML file.

There are heaps of reports generated. Unfortunately the logic used to generate these reports is not known, but I need to use the output data for further analysis.

This will be a recurring task, so I need to script it for automation. For now I'm copying and pasting manually, which takes a lot of time.

Super User
Posts: 5,441

Re: Importing HTML file.

It sounds dangerous to base analysis on reports created by unknown logic...

Frequent Contributor
Posts: 76

Re: Importing HTML file.

In fact it is, but I have no other option...

I found a nice article ( http://support.sas.com/resources/papers/proceedings09/052-2009.pdf ), but a few components are missing from its code and I cannot work out exactly what is missing.

Super User
Posts: 19,878

Re: Importing HTML file.

Scraping data from websites and documents is a common task these days.

Given what you've provided, it's hard to comment further. As much as I love SAS, it's not what I use to scrape data. Depending on the size of the task, Import.IO works, or you can save the report as PDF and use Adobe Pro to convert it to Excel. Or use Nitro, which is free.
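If a scripted route outside SAS is acceptable, here is a minimal sketch in Python (standard library only) of how each table in an HTML report might be pulled out in document order, so that `tables[0]` would correspond to html/body/table[1], `tables[1]` to html/body/table[2], and so on. It assumes the tables are not nested and that every row and cell has an explicit closing tag; the names `TableCollector`, `extract_tables` and `table_to_csv` and the sample markup are made up for illustration and are not taken from the actual report.

```python
import csv
import io
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> in document order as a list of rows of cell text.

    Assumption: tables are not nested and <tr>, <td>, <th> are explicitly closed.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. newlines between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html_text):
    """Return all tables found in html_text, in document order."""
    parser = TableCollector()
    parser.feed(html_text)
    parser.close()
    return parser.tables


def table_to_csv(rows):
    """Render one table (a list of rows) as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical two-table sample standing in for the real report.
sample = """<html><body>
<table><tr><th>Id</th><th>Name</th></tr><tr><td>2836634542</td><td>INST1</td></tr></table>
<table><tr><th>I#</th><th>Host</th></tr><tr><td>1</td><td>hostname1</td></tr></table>
</body></html>"""

tables = extract_tables(sample)
```

Each table could then be written to its own file with `table_to_csv` and read into SAS with something like PROC IMPORT and DBMS=CSV. Multi-level headers (cells with colspan) would come through as ordinary cells, so the header rows would likely still need manual handling.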

Anyways, post a sample of your file or preferably the link if you need more help.

Super User
Posts: 7,868

Re: Importing HTML file.

These all look like stats from an RDBMS. The statistics modules of RDBMSs can also write plain-text data files, which are much easier to read from SAS than HTML. Get those text files from the DBA people and work from them.
