BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cokeyng
Obsidian | Level 7

Hi everyone,

 

I am using Viya and would need to download a zip file and extract the csv file for data.

 

I found some sample codes online and tried it out.

FILENAME REFFILE temp;

PROC HTTP
	url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
	method="GET"
    out=REFFILE;
RUN;

data _null_;
    length myout $ 200;
    file REFFILE filename=myout;
    stop;
run;

And the output is 

1    %studio_hide_wrapper;
82   FILENAME REFFILE temp;
83   
84   PROC HTTP
85   url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
86   method="GET"
87       out=REFFILE;
88   RUN;
NOTE: PROCEDURE HTTP used (Total process time):
      real time           4.49 seconds
      cpu time            0.54 seconds
      
89   
90   data _null_;
91       length myout $ 200;
92       file REFFILE filename=myout;
93       stop;
94   run;
NOTE: The file REFFILE is:
      Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223,
      Owner Name=cng1,Group Name=sassrusr,
      Access Permission=-rw-r--r--,
      Last Modified=14Jan2022:20:27:39
NOTE: 0 records were written to the file REFFILE.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.00 seconds
      
95   
96   %studio_hide_wrapper;
107  
108  

 I have a few questions:

1. how do i know if it was a successful download?

2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?

 

Thanks in advance.

 

Cokey

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
%macro importStatCan(tableNum = , dset= );

filename out "%sysfunc(getoption(work))/&tableNum.-eng.zip";

proc http
 url="https://www150.statcan.gc.ca/n1/tbl/csv/&tableNum.-eng.zip"
 method="get"
 out=out;
run;

filename out;

filename ext "%sysfunc(getoption(work))/&tableNum..csv";

filename inzip ZIP "%sysfunc(getoption(work))/&tableNum.-eng.zip";
 
data _null_;
   /* using member syntax here */
   infile inzip(&tableNum..csv) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file ext lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;


proc import out=&dset. datafile=ext dbms=csv replace; 
guessingrows=max;
run;

filename ext clear;
filename inzip clear;

%mend;

Here's a macro I wrote a few years back to download StatCan data. 

 

https://gist.github.com/statgeek/5c8f4e769e898353d7e8115610f5ad5e

 

EDIT: Successfully tested for your request:


%importStatCan(tableNum=18100004, dset=want);

Data set has 1,039,302 rows and 15 columns (141MB text file) and worked successfully. 

It takes a while with the GUESSINGROWS=MAX set in the macro. If you want, change that to 1000 to make it run faster.

 


@cokeyng wrote:

Hi everyone,

 

I am using Viya and would need to download a zip file and extract the csv file for data.

 

I found some sample codes online and tried it out.

FILENAME REFFILE temp;

PROC HTTP
	url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
	method="GET"
    out=REFFILE;
RUN;

data _null_;
    length myout $ 200;
    file REFFILE filename=myout;
    stop;
run;

And the output is 

1    %studio_hide_wrapper;
82   FILENAME REFFILE temp;
83   
84   PROC HTTP
85   url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
86   method="GET"
87       out=REFFILE;
88   RUN;
NOTE: PROCEDURE HTTP used (Total process time):
      real time           4.49 seconds
      cpu time            0.54 seconds
      
89   
90   data _null_;
91       length myout $ 200;
92       file REFFILE filename=myout;
93       stop;
94   run;
NOTE: The file REFFILE is:
      Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223,
      Owner Name=cng1,Group Name=sassrusr,
      Access Permission=-rw-r--r--,
      Last Modified=14Jan2022:20:27:39
NOTE: 0 records were written to the file REFFILE.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.00 seconds
      
95   
96   %studio_hide_wrapper;
107  
108  

 I have a few questions:

1. how do i know if it was a successful download?

2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?

 

Thanks in advance.

 

Cokey

 


 

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

You are downloaded the file and then immediately overwrote it by using the FILE statement instead of the INFILE statement.

 

If the file is a ZIP file then use the ZIP engine when reading.

Since you asked SAS to make up a name for the file use the PATHNAME() function to find what name it used.

Here is code to dump the first few lines of the first file in the ZIP file to the LOG so you can see if it looks right.

data _null_;
  infile "%sysfunc(pathname(reffile))" zip member='*' ;
  input;
  list;
  if _n_ >3 then stop;
run;
cokeyng
Obsidian | Level 7

Thanks, Tom. I will try it out.

Reeza
Super User
%macro importStatCan(tableNum = , dset= );

filename out "%sysfunc(getoption(work))/&tableNum.-eng.zip";

proc http
 url="https://www150.statcan.gc.ca/n1/tbl/csv/&tableNum.-eng.zip"
 method="get"
 out=out;
run;

filename out;

filename ext "%sysfunc(getoption(work))/&tableNum..csv";

filename inzip ZIP "%sysfunc(getoption(work))/&tableNum.-eng.zip";
 
data _null_;
   /* using member syntax here */
   infile inzip(&tableNum..csv) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file ext lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;


proc import out=&dset. datafile=ext dbms=csv replace; 
guessingrows=max;
run;

filename ext clear;
filename inzip clear;

%mend;

Here's a macro I wrote a few years back to download StatCan data. 

 

https://gist.github.com/statgeek/5c8f4e769e898353d7e8115610f5ad5e

 

EDIT: Successfully tested for your request:


%importStatCan(tableNum=18100004, dset=want);

Data set has 1,039,302 rows and 15 columns (141MB text file) and worked successfully. 

It takes a while with the GUESSINGROWS=MAX set in the macro. If you want, change that to 1000 to make it run faster.

 


@cokeyng wrote:

Hi everyone,

 

I am using Viya and would need to download a zip file and extract the csv file for data.

 

I found some sample codes online and tried it out.

FILENAME REFFILE temp;

PROC HTTP
	url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
	method="GET"
    out=REFFILE;
RUN;

data _null_;
    length myout $ 200;
    file REFFILE filename=myout;
    stop;
run;

And the output is 

1    %studio_hide_wrapper;
82   FILENAME REFFILE temp;
83   
84   PROC HTTP
85   url="https://www150.statcan.gc.ca/n1/tbl/csv/18100004-eng.zip"
86   method="GET"
87       out=REFFILE;
88   RUN;
NOTE: PROCEDURE HTTP used (Total process time):
      real time           4.49 seconds
      cpu time            0.54 seconds
      
89   
90   data _null_;
91       length myout $ 200;
92       file REFFILE filename=myout;
93       stop;
94   run;
NOTE: The file REFFILE is:
      Filename=/var/sastmp/SAS_work939E00018EF3_rhlsvp01/#LN00223,
      Owner Name=cng1,Group Name=sassrusr,
      Access Permission=-rw-r--r--,
      Last Modified=14Jan2022:20:27:39
NOTE: 0 records were written to the file REFFILE.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.00 seconds
      
95   
96   %studio_hide_wrapper;
107  
108  

 I have a few questions:

1. how do i know if it was a successful download?

2. the file reference is stored at REFFILE. To extract the csv file from there, I will have to define another file ref ZIP, right? but how do I transfer the reference from REFFILE temp to a ZIP?

 

Thanks in advance.

 

Cokey

 


 

cokeyng
Obsidian | Level 7

Thanks, Reeza. I will try it out.

cokeyng
Obsidian | Level 7
It works. Thanks.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1718 views
  • 2 likes
  • 3 in conversation