SAS 9.4, Win11 environment.
I'm using a DATA step to read in a web page's contents and then store them in a local .txt file for subsequent SAS text mining. Essentially, I'm using SAS to simulate the browser function "View page source" and then saving the result to a .txt file.
To do this I use the FILENAME URL access method. That part is working fine.
As I visually examine the HTML I've scraped, I see the site uses Google Analytics for tracking. When I hit the URL through a SAS DATA step, are the "do not track" cookies set in my browser visible to the site's invocation of Google Analytics? Since I'm not using a browser, what does the website know and/or collect when the HTML source is accessed this way?
Google Analytics and other trackers usually rely on JavaScript to glean information about the visitor. That script runs when the page renders in your browser. When you use a client like FILENAME URL, PROC HTTP, or cURL, the script never runs. The website's server still gets information about the page visit, including your IP address and details of the web request, but it can't "follow your journey" on the site the way fancier trackers do.
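If you want to see what the site receives from SAS, one option is to make the same request with PROC HTTP and print the request headers to the log. The following is only a minimal sketch under some assumptions: the URL is a placeholder, the DEBUG statement needs a recent maintenance release (SAS 9.4M5 or later, as far as I know), and the "DNT" header is added purely for illustration. Your browser's cookies and settings are not shared with SAS, so a preference like Do Not Track reaches the site only if you send it yourself, and whether the site honors it is up to the site.

```sas
/* Temporary fileref to hold the returned page source */
filename resp temp;

proc http
   url="https://example.com/"   /* placeholder URL (assumption), not the real site */
   method="GET"
   out=resp;
   /* Optional: send a Do Not Track request header explicitly */
   headers "DNT"="1";
   /* Write the request and response headers to the SAS log */
   debug level=1;
run;

/* Release the temporary fileref when finished */
filename resp clear;
```

The request headers shown in the log, together with your IP address, are essentially what the web server sees for this visit.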