SophieSaas
Fluorite | Level 6

Hi everyone, 

 

I am trying to extract data from my own LinkedIn page, using SAS 9.4.

 

I've seen various posts in this community and elsewhere on the internet, for instance:

https://blogs.sas.com/content/sasdummy/2017/12/04/scrape-web-page-data/

 

 

I've used the PWENCODE procedure to encode my password into a text file, and I read it back into the macro variable &PASS.

So far, I've written this code:

filename recupFIC "C:\PASS to output file\test.xml";
proc http
	method="GET"
	url="https://www.linkedin.com/company/MYSITENUMBER/admin/analytics"
	out=recupFIC
	WEBAUTHDOMAIN="www.linkedin.com"
	webusername="my username here"
	webpassword="&PASS."
	;
run;

When I run this code, a window opens asking me to fill in the fields for a metadata server: Server Name, User ID and Password. I don't understand where I can find this information.

 

When I try to scrape a simple page (no authentication required), it works perfectly.

 

Do you have any idea? 

 

Thank you very much in advance,

 

Accepted Solution
ChrisHemedinger
Community Manager

WEBAUTHDOMAIN is for an administered SAS mid-tier: it tells SAS to retrieve credentials from a metadata server, which is why you are prompted for one. It's not an option you need.

 

WEBUSERNAME and WEBPASSWORD are for "Basic Auth" -- but LinkedIn does not use that mechanism.  Third-party applications must use the LinkedIn APIs and connect with OAuth2 -- a much more complex negotiation.  And I'm not sure that the LinkedIn APIs provide the data you want to get. Check their Developer site to see what's possible.
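
For illustration only, here is a rough sketch of what an OAuth2-style request can look like in PROC HTTP once you already hold an access token. Obtaining the token (app registration and the authorization flow) is not shown. The URL and the token value are placeholders, not real LinkedIn endpoints, and the HEADERS statement assumes a reasonably recent SAS 9.4 maintenance release.

```sas
/* Hypothetical placeholder: paste an access token obtained through   */
/* the provider's OAuth2 flow (not shown here).                       */
%let access_token = PASTE-TOKEN-HERE;

/* Temporary file that receives the response body */
filename resp temp;

proc http
    method="GET"
    url="https://api.example.com/v2/resource"  /* placeholder URL, not a LinkedIn endpoint */
    out=resp;
    /* Send the token as a standard OAuth2 bearer credential */
    headers "Authorization" = "Bearer &access_token.";
run;

/* Report the HTTP status code (this macro variable assumes SAS 9.4M5 or later) */
%put NOTE: HTTP status = &SYS_PROCHTTP_STATUS_CODE.;
```

Note that neither WEBAUTHDOMAIN nor WEBUSERNAME/WEBPASSWORD appears here: the credential travels in the Authorization header instead.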

 

Web scraping is most likely against LinkedIn's data use policy.  While you might be just trying to experiment with your own profile, taking it further is probably against their rules.  If you just want to "practice" parsing your page, use your web browser to Save As HTML and then use SAS to process that as an INFILE.
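
A minimal sketch of the "Save As HTML, then INFILE" approach might look like the following. The file path and the search keyword are assumptions for illustration. Saved pages often contain very long lines, so anything beyond 32767 characters on a single line would be truncated in this version.

```sas
/* Hypothetical path: wherever the browser saved the page */
filename saved "C:\temp\linkedin_page.html" encoding="utf-8";

data work.page_lines;
    /* Read each line of the saved HTML file as one long character value */
    infile saved lrecl=32767 truncover;
    length line $32767;
    input line $char32767.;

    /* Record the line number within the source file */
    line_no = _n_;

    /* Keep only lines containing an illustrative keyword (case-insensitive) */
    if find(line, "followers", "i") > 0;
run;
```

From there, functions such as SCAN, FIND, or the PRX family can pull individual values out of the retained lines.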

 

Chris

SophieSaas
Fluorite | Level 6
Thank you, Chris, for your answer. As I was "just" trying to scrape my own page, I hadn't realized it could be an issue for LinkedIn.
Thank you!
