- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi everyone,
I am using Viya and would need to download a zip file and extract the csv file for data.
I found some sample codes online and tried it out.
FILENAME REFFILE temp; PROC HTTP url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" method="GET" out=REFFILE; RUN; data _null_; length myout $ 200; file REFFILE filename=myout; stop; run;
And the output is
1 %studio_hide_wrapper; 82 FILENAME REFFILE temp; 83 84 PROC HTTP 85 url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" 86 method="GET" 87 out=REFFILE; 88 RUN; NOTE: PROCEDURE HTTP used (Total process time): real time 4.49 seconds cpu time 0.54 seconds 89 90 data _null_; 91 length myout $ 200; 92 file REFFILE filename=myout; 93 stop; 94 run; NOTE: The file REFFILE is: Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223, Owner Name=cng1,Group Name=sassrusr, Access Permission=-rw-r--r--, Last Modified=14Jan2022:20:27:39 NOTE: 0 records were written to the file REFFILE. NOTE: DATA statement used (Total process time): real time 0.07 seconds cpu time 0.00 seconds 95 96 %studio_hide_wrapper; 107 108
I have a few questions:
1. how do i know if it was a successful download?
2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?
Thanks in advance.
Cokey
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
%macro importStatCan(tableNum = , dset= );
filename out "%sysfunc(getoption(work))/&tableNum.-eng.zip";
proc http
url="https://www150.statcan.gc.ca/n1/tbl/csv/&tableNum.-eng.zip"
method="get"
out=out;
run;
filename out;
filename ext "%sysfunc(getoption(work))/&tableNum..csv";
filename inzip ZIP "%sysfunc(getoption(work))/&tableNum.-eng.zip";
data _null_;
/* using member syntax here */
infile inzip(&tableNum..csv)
lrecl=256 recfm=F length=length eof=eof unbuf;
file ext lrecl=256 recfm=N;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
proc import out=&dset. datafile=ext dbms=csv replace;
guessingrows=max;
run;
filename ext clear;
filename inzip clear;
%mend;
Here's a macro I wrote a few years back to download StatCan data.
https://gist.github.com/statgeek/5c8f4e769e898353d7e8115610f5ad5e
EDIT: Successfully tested for your request:
%importStatCan(tableNum=18100004, dset=want);
Data set has 1,039,302 rows and 15 columns (141MB text file) and worked successfully.
It takes a while with the GUESSINGROWS=MAX set in the macro. If you want, change that to 1000 to make it run faster.
@cokeyng wrote:
Hi everyone,
I am using Viya and would need to download a zip file and extract the csv file for data.
I found some sample codes online and tried it out.
FILENAME REFFILE temp; PROC HTTP url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" method="GET" out=REFFILE; RUN; data _null_; length myout $ 200; file REFFILE filename=myout; stop; run;And the output is
1 %studio_hide_wrapper; 82 FILENAME REFFILE temp; 83 84 PROC HTTP 85 url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" 86 method="GET" 87 out=REFFILE; 88 RUN; NOTE: PROCEDURE HTTP used (Total process time): real time 4.49 seconds cpu time 0.54 seconds 89 90 data _null_; 91 length myout $ 200; 92 file REFFILE filename=myout; 93 stop; 94 run; NOTE: The file REFFILE is: Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223, Owner Name=cng1,Group Name=sassrusr, Access Permission=-rw-r--r--, Last Modified=14Jan2022:20:27:39 NOTE: 0 records were written to the file REFFILE. NOTE: DATA statement used (Total process time): real time 0.07 seconds cpu time 0.00 seconds 95 96 %studio_hide_wrapper; 107 108I have a few questions:
1. how do i know if it was a successful download?
2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?
Thanks in advance.
Cokey
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are downloaded the file and then immediately overwrote it by using the FILE statement instead of the INFILE statement.
If the file is a ZIP file then use the ZIP engine when reading.
Since you asked SAS to make up a name for the file use the PATHNAME() function to find what name it used.
Here is code to dump the first few lines of the first file in the ZIP file to the LOG so you can see if it looks right.
data _null_;
infile "%sysfunc(pathname(reffile))" zip member='*' ;
input;
list;
if _n_ >3 then stop;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks, Tom. I will try it out.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
%macro importStatCan(tableNum = , dset= );
filename out "%sysfunc(getoption(work))/&tableNum.-eng.zip";
proc http
url="https://www150.statcan.gc.ca/n1/tbl/csv/&tableNum.-eng.zip"
method="get"
out=out;
run;
filename out;
filename ext "%sysfunc(getoption(work))/&tableNum..csv";
filename inzip ZIP "%sysfunc(getoption(work))/&tableNum.-eng.zip";
data _null_;
/* using member syntax here */
infile inzip(&tableNum..csv)
lrecl=256 recfm=F length=length eof=eof unbuf;
file ext lrecl=256 recfm=N;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
proc import out=&dset. datafile=ext dbms=csv replace;
guessingrows=max;
run;
filename ext clear;
filename inzip clear;
%mend;
Here's a macro I wrote a few years back to download StatCan data.
https://gist.github.com/statgeek/5c8f4e769e898353d7e8115610f5ad5e
EDIT: Successfully tested for your request:
%importStatCan(tableNum=18100004, dset=want);
Data set has 1,039,302 rows and 15 columns (141MB text file) and worked successfully.
It takes a while with the GUESSINGROWS=MAX set in the macro. If you want, change that to 1000 to make it run faster.
@cokeyng wrote:
Hi everyone,
I am using Viya and would need to download a zip file and extract the csv file for data.
I found some sample codes online and tried it out.
FILENAME REFFILE temp; PROC HTTP url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" method="GET" out=REFFILE; RUN; data _null_; length myout $ 200; file REFFILE filename=myout; stop; run;And the output is
1 %studio_hide_wrapper; 82 FILENAME REFFILE temp; 83 84 PROC HTTP 85 url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip" 86 method="GET" 87 out=REFFILE; 88 RUN; NOTE: PROCEDURE HTTP used (Total process time): real time 4.49 seconds cpu time 0.54 seconds 89 90 data _null_; 91 length myout $ 200; 92 file REFFILE filename=myout; 93 stop; 94 run; NOTE: The file REFFILE is: Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223, Owner Name=cng1,Group Name=sassrusr, Access Permission=-rw-r--r--, Last Modified=14Jan2022:20:27:39 NOTE: 0 records were written to the file REFFILE. NOTE: DATA statement used (Total process time): real time 0.07 seconds cpu time 0.00 seconds 95 96 %studio_hide_wrapper; 107 108I have a few questions:
1. how do i know if it was a successful download?
2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?
Thanks in advance.
Cokey
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks, Reeza. I will try it out.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content