VX_Xc
Calcite | Level 5

I need to extract messy data from a website. Could anyone recommend a good textbook that covers how to extract data from the web efficiently, please?

Thank you.

4 REPLIES
art297
Opal | Level 21

Seunghoon,

Do you mean automatically or as in copy/paste?  If it is the latter, I'll be doing an SGF presentation on the topic in April, titled 'Copy and Paste Almost Anything'.  I already presented a draft of the paper at one of my local user group meetings and you can find it at:

http://torsas.ca/page18.php

HTH,

Art

VX_Xc
Calcite | Level 5

I meant automatically. For example, I would like to learn a PROC (with many optional statements) that extracts data from an HTML file if I give it the address of a website or the path to an .html file.

Maybe there isn't one? Then I would have to use DATA steps with a lot of @<tag> arguments in the INPUT statement, which would not be very practical.

But thanks for the link. I will have a look; it looks promising.

art297
Opal | Level 21

Then you want to look into proc http and proc soap.  Do a search on the discussion forums for either.  If you include my id or friedeggs id in the search, I'm sure that will help to eliminate much of the noise.
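
As a rough illustration of the PROC HTTP route (a minimal sketch, not something taken from the searches mentioned above): the fileref name resp, the dataset name page, and the target URL are assumptions chosen for the example. It downloads the page to a temporary file and then reads it line by line, leaving the actual tag parsing to a later step.

```sas
/* Assumed names: resp (fileref) and page (dataset) are illustrative only */
filename resp temp;              /* temporary file for the response body */

proc http
    url="http://www.sas.com"     /* example URL, replace with the real page */
    method="GET"
    out=resp;                    /* write the response body to the fileref */
run;

data page;
    infile resp lrecl=32767 truncover;
    input line $char32767.;      /* one raw HTML line per observation */
run;
```

Lines longer than 32767 characters would be truncated by this read, so heavily minified pages may need a different approach (for example, reading with RECFM=N).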

Ksharp
Super User

Yes. You can do it.

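/* Read the page directly through the URL access method */
filename x url 'http://www.sas.com';
/* Keep only the non-empty text fragments found between tags */
data want(where=(line is not missing));
  infile x dsd dlm='<>' lrecl=32767;
  /* Jump past the next '>' and read text up to the next delimiter */
  input @'>' line : $400. @@;
run;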


Ksharp

