10-11-2017 03:54 PM
** This is all public data. Once this code is working, the script can become a resource for anyone who needs hourly weather details! **
My goal: create a process to automatically pull all NCEI Land-based station weather data files for 2017, for NC/SC stations.
Environment: SAS 9.4M3 with SAS Enterprise Guide, on a local Windows machine.
List of stations: ftp://ftp.ncdc.noaa.gov/pub/data/noaa/isd-history.txt
Hourly weather data folder: https://www.ncei.noaa.gov/data/global-hourly/access/2017/
This pulls all stations in the US. It works fine.
filename testurl ftp "isd-history.txt" cd="/pub/data/noaa/"
    host="ftp.ncdc.noaa.gov" user="anonymous";

data AUTORES.WEATHER_STATIONS;
    infile testurl FIRSTOBS=24;
    input
        USAF $ 1-6
        WBAN $ 8-12
        STATION_NAME $ 14-43
        CTRY $ 44-45
        ST $ 49-50
        CALL $ 52-55
        LAT 58-64
        LON 66-73
        ELEV 75-81
        BEGIN 83-90
        END 92-99
    ;
    IF CTRY EQ "US";
run;
This creates a series of macro variables, "URL1" to "URL###", where ### is the number of stations to pull data for.
Variable values are the direct URL to the file to download.
This works as expected.
data _NULL_;
    set AUTORES.WEATHER_STATIONS end=eof;
    retain i;
    if _n_ = 1 then i = 1;
    if (ST EQ "NC" OR ST EQ "SC") AND END GT 20170000;
    thisurl = cats("""https://www.ncei.noaa.gov/data/global-hourly/access/2017/", USAF, WBAN, ".csv""");
    urlnum = cats("URL", i);
    call symput(urlnum, thisurl);
    strI = i;
    call symput("urlcount", left(put(strI, 8.)));
    i = i + 1;
run;
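A possible simplification of the step above, shown with made-up names (DEMO1 to DEMO3 and democount are hypothetical and not part of the original script): CALL SYMPUTX strips leading and trailing blanks and accepts a numeric value directly, so the separate put/left conversion would not be needed. This is only a sketch that writes to the log.

```sas
/* Hypothetical names: DEMO1..DEMO3 and democount are for illustration only. */
data _null_;
    do i = 1 to 3;
        /* Build the macro variable name DEMO<i> and assign a trimmed value. */
        call symputx(cats("DEMO", i), cats("value", i));
        /* A numeric second argument is converted and trimmed automatically. */
        call symputx("democount", i);
    end;
run;

/* Show the resulting macro variables in the log. */
%put &=democount &=DEMO1 &=DEMO3;
```

The same two calls could replace the symput lines in the URL-building step, with the URL string as the value.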
Here's where it goes south: the macro loops as expected, but always only pulls the data for the LAST URL created (in this specific case, URL172).
%MACRO GET_WEATHER_STATION_READINGS;
    %DO i = 1 %TO &urlcount;
        data WORK.NEW_WEATHER;
            filename WFile url &&URL&urlcount debug;
            infile WFile DLM=',' DSD MISSOVER firstobs=2;
            input
                STATION : $20. DATE : E8601DT20. SOURCE : $8.
                LATITUDE : 10.6 LONGITUDE : 10.6 ELEVATION : 10.6
                NAME : $30. REPORT_TYPE : $10. CALL_SIGN : $10. QUALITY_CONTROL : $10.
                WND : $20. CIG : $20. VIS : $20. TMP : $20. DEW : $20. SLP : $20.
                AA1 : $20. AA2 : $20. AY1 : $20. AY2 : $20.
                GA1 : $20. GA2 : $20. GA3 : $20. GE1 : $20. GF1 : $20.
                MW1 : $20. REM : $20. EQD : $20.
            ;
            format DATE DATETIME20.;
        run;

        proc append base=AUTORES.WEATHER_HOURLY data=WORK.NEW_WEATHER;
        run;

        sleep(5, 1);
    %end;
%MEND GET_WEATHER_STATION_READINGS;

%GET_WEATHER_STATION_READINGS;
As you may guess, I need it to pull URL1 through URL172, each one exactly once.
It's a macro value issue, I suspect. I've got some basic understanding of macros, but I'm not sure how to change this into what I need.
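The symptom is consistent with the FILENAME statement referencing &&URL&urlcount instead of the loop index: &urlcount is 172 on every pass, so the reference always resolves to &URL172, whereas &&URL&i would resolve to &URL1, &URL2, and so on. A minimal sketch of the difference follows; it uses made-up URL values (the example.com addresses and the macro name SHOW_URLS are hypothetical) and only writes to the log, so nothing is downloaded.

```sas
/* Hypothetical stand-ins for the URL1..URL172 macro variables. */
%let URL1 = "https://example.com/a.csv";
%let URL2 = "https://example.com/b.csv";
%let urlcount = 2;

%macro show_urls;
    %local i;
    %do i = 1 %to &urlcount;
        /* &&URL&i       -> &URL1, then &URL2 (changes with the loop index) */
        /* &&URL&urlcount -> &URL2 on every pass (always the last one)      */
        %put NOTE: i=&i indexed=&&URL&i last=&&URL&urlcount;
    %end;
%mend show_urls;

%show_urls;
```

In the log, the "indexed" value should change on each iteration while the "last" value stays the same, which matches the behavior seen with URL172.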
(If the FILENAME command would accept a variable for a URL path, I doubt I'd need macros at all!!)
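On that last point: the INFILE statement has a FILEVAR= option that takes the file location from a data step variable, and it can be combined with the URL device type. The following is an untested sketch, not a verified solution. It assumes AUTORES.WEATHER_STATIONS exists as built earlier, that every selected station actually has a 2017 file on the server (a missing file would likely stop the step), and it reads only the first seven columns to stay short. The output name WORK.WEATHER_SAMPLE, the variable thisurl, the flag done, and the placeholder fileref DUMMY are all hypothetical.

```sas
/* Untested sketch: one data step, no macro loop. */
data WORK.WEATHER_SAMPLE;
    set AUTORES.WEATHER_STATIONS;
    if (ST EQ "NC" OR ST EQ "SC") AND END GT 20170000;

    /* Build the URL for this station as an ordinary character variable. */
    length thisurl $200;
    thisurl = cats("https://www.ncei.noaa.gov/data/global-hourly/access/2017/",
                   USAF, WBAN, ".csv");

    /* DUMMY is only a placeholder; FILEVAR= supplies the real location. */
    infile dummy url filevar=thisurl dlm=',' dsd truncover firstobs=2 end=done;

    /* Read every record of the current station's file. */
    do while (not done);
        input STATION : $20. DATE : E8601DT20. SOURCE : $8.
              LATITUDE : 10.6 LONGITUDE : 10.6 ELEVATION : 10.6 NAME : $30.;
        output;
    end;

    format DATE DATETIME20.;
    keep STATION DATE SOURCE LATITUDE LONGITUDE ELEVATION NAME;
run;
```

If this approach works in a given environment, the remaining columns from the full INPUT statement could be added back the same way.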
I've been fighting with this for three days and have solved a lot of other errors along the way. I humbly ask that you please not propose solutions without testing your code. I expect the final result file, AUTORES.WEATHER_HOURLY, to be a couple of GB in size.
Thank you for your time.
10-16-2017 11:51 AM