BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
craigwe85
Fluorite | Level 6

Hello,

 

I am trying to import some data from a website into SAS. The website has an "Download Data" button which when you select exports a CSV file to excel. I would like this file to come into SAS. Also, when exported to CSV, the data does table does not start until LIne 26 and has some info about the data above that is not needed.

 

Is it possible to import this? Ive seen example getting data tables from websites but I have yet to find an example of someone pulling data in from a "Download Data" button within a website. The reason I wish to use this data instead of the data table that appears on the website is that it contains the entire years worth of data vs. the table on the website which only shows you the month you selected.

 

 

"http://climate.weather.gc.ca/climate_data/daily_data_e.html?hlyRange=2012-09-10%7C2018-10-24&dlyRang..."

1 ACCEPTED SOLUTION

Accepted Solutions
DaveHorne
SAS Employee

Hi @craigwe85,

Here's a short example using the filename URL method:

 

filename in url "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=csv&stationID=50620&Year=2018&Month=10&Day=1&timeframe=2&submit=Download+Data";
filename out "%sysfunc(pathname(work))\data.csv";                                                                                                                 
                                                                                                                                                                  
data _null_;                                                                                                                                                      
  infile in;                                                                                                                                                      
  input;                                                                                                                                                          
  if _n_> 25;                                                                                                                                                     
  file out;                                                                                                                                                       
  put _infile_;                                                                                                                                                   
run;                                                                                                                                                              
                                                                                                                                                                  
PROC IMPORT OUT= WORK.data                                                                                                                                        
     DATAFILE= "%sysfunc(pathname(work))\data.csv"                                                                                                                
     DBMS=CSV REPLACE;                                                                                                                                            
RUN;

Since the first 25 lines of the file contain a report header and legend, the data step filters that out.  From there you could continue parsing the file or see if the proc import gets what you want.  (I got the URL to the CSV by looking at the form code in the html source of that page - you may need to tweak the parameters as needed).

Hope this helps.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

First and most important, that is not a csv file.  Please do not perpetuate the myth that anything with commas is a CSV.  The website is generating some sort of report based on their own specification, which is not in compliance with a CSV.

As for importing it, that is pretty straight forward, you can strip out those lines, start import at a specific row and such like.

 

As far as I am aware, there is no operation that can trigger a website process.  You can extract data directly from online sources using proc http.  However that requires the data being available, and you having the appropriate permissions.    

DaveHorne
SAS Employee

Hi @craigwe85,

Here's a short example using the filename URL method:

 

filename in url "http://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=csv&stationID=50620&Year=2018&Month=10&Day=1&timeframe=2&submit=Download+Data";
filename out "%sysfunc(pathname(work))\data.csv";                                                                                                                 
                                                                                                                                                                  
data _null_;                                                                                                                                                      
  infile in;                                                                                                                                                      
  input;                                                                                                                                                          
  if _n_> 25;                                                                                                                                                     
  file out;                                                                                                                                                       
  put _infile_;                                                                                                                                                   
run;                                                                                                                                                              
                                                                                                                                                                  
PROC IMPORT OUT= WORK.data                                                                                                                                        
     DATAFILE= "%sysfunc(pathname(work))\data.csv"                                                                                                                
     DBMS=CSV REPLACE;                                                                                                                                            
RUN;

Since the first 25 lines of the file contain a report header and legend, the data step filters that out.  From there you could continue parsing the file or see if the proc import gets what you want.  (I got the URL to the CSV by looking at the form code in the html source of that page - you may need to tweak the parameters as needed).

Hope this helps.

craigwe85
Fluorite | Level 6

thank you kindly. This is going to save me lots of time every Monday morning.

SuryaKiran
Meteorite | Level 14

If I'm not wrong you can't download the CSV file if it doesn't have a direct url (something like this: "https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv"). The download option on the web page is not pointing for a direct CSV file (might be pulling from a database or something)

In Chrome right click on web page and Inspect and see the url for the DOWNLOAD option, its not pointing to direct csv file.

image.png

 

There is another way to get the data your looking for. What I noticed from my observation is, all of the data is present in the downloaded file is present in the web page itself and you need to select the year and month options. Take the distinct url's for all the years and months you require the data for and compare the whats changing in the url.

 

For example when I searched the data for 2018 and jan, feb and mar the url's are almost similar except the last values that pointing to each month. You can use those url's with PROC HTTP to get the data from web to a file and then later you need to scrap that file to get meaningful data out of it. This task is so complex and also you need to have access to read from external web sites (Most companies doesn't allow). I guess it might work in SAS University Edition. 

 

image.png

 

@ChrisHemedinger has posted several blogs on PROC HTTP. Check them here

 

This is just my idea and never tried it. I might give it a try later.

Thanks,
Suryakiran

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 28810 views
  • 7 likes
  • 4 in conversation