athomson
Fluorite | Level 6

All,

 

I get multiple zipped folders that contain XML files. I want to "point" SAS into the zipped folder and parse out the XML files into records in a SAS dataset. 

 

Here's a diagram of what I'm dealing with:

data.zip
|__ sub_folder
    |__ file1.xml
    |__ file2.xml
    ...
    |__ file500.xml

 

I know I can use a FILENAME with the ZIP option then read in the contents of a zipped folder:

filename inzip ZIP "C:\input_data\data.zip";

data folder_contents;
 length memname $200 isFolder 8;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  /*Scans and gives only the name of the xml file*/
  memname=scan(dread(fid,i), 2, '\');
  /*check for trailing / in folder name */
  output;
 end;
 rc=dclose(fid);
run;

title "Files in the ZIP file";
proc print data=folder_contents noobs N;
run;
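
The "check for trailing /" comment in the step above has no code behind it, so `isFolder` is never set. A minimal sketch of one way to fill that in, assuming folder entries in the archive end with a slash (forward or back, depending on the tool that built the ZIP). The `fullname` variable is introduced here for illustration and is not part of the original code:

```sas
filename inzip ZIP "C:\input_data\data.zip";

data folder_contents;
 length fullname $200 memname $200 isFolder 8;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  /* full member path as stored in the archive (hypothetical helper variable) */
  fullname=dread(fid,i);
  /* assumption: folder entries end with a slash or a backslash */
  isFolder=(char(fullname, length(fullname)) in ('/', '\'));
  /* last path segment, whichever separator the archive uses */
  memname=scan(fullname, -1, '/\');
  output;
 end;
 rc=dclose(fid);
 keep fullname memname isFolder;
run;
```

Keeping the full path next to the short name makes it easier to build the `inzip(member)` reference later, since the member syntax needs the path as it appears inside the archive.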

And I know I can use a LIBNAME statement with the XML engine (in this case XMLV2) and an XML map (made with the XML Mapper) to read XML files into a dataset:

filename SXLEMAP "C:\xml_map\my_map2.map";
libname my_xml_file XMLV2 "C:\XML_files\test.xml" xmlmap = SXLEMAP;
libname out "C:\output";

data out.xml_contents;
 set my_xml_file.test;
run;

How do I get a LIBNAME pointing "inside" a zipped folder, if I have to use the FILENAME engine to look inside said folder?

 

As far as I can tell, the ZIP option is only available on the FILENAME statement, not on LIBNAME.

 

I want to avoid unzipping since it takes a while to unzip all of the XML files (and IT at my agency has strong restrictions on using X commands. Otherwise, I'd use X commands to move the XML files from the zipped folder to a staging folder, then use a macro to import the XML files). 

1 ACCEPTED SOLUTION

athomson
Fluorite | Level 6

Got it! 

 

Here is what I got: 

 

filename inzip ZIP "C:\input_data\data.zip";

data folder_contents;
 length memname $200 isFolder 8;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
	/*Scans and gives only the name of the xml file*/
  	memname=scan(dread(fid,i), 2, '\');
  	output;
 end;
 rc=dclose(fid);
run;

/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=folder_contents noobs N;
run;

/* identify a temp file in the WORK directory */
filename xl "%sysfunc(getoption(work))/file1.xml" ;
 
/* hat tip: "data _null_" on SAS-L */
data _null_;
   /* using member syntax here */
   infile inzip(sub_folder\file1.xml) 
       	lrecl=256 
		recfm=F 
		length=length 
		eof=eof unbuf;
   file   xl lrecl=256 recfm=N;
   input;
   put _infile_ $varying256. length;
   return;
 eof:
   stop;
run;

/* closing quote fixed: the path must open and close with the same quote character */
filename SXLEMAP "C:\xml_map\my_map2.map";
libname my_xml XMLV2 "%sysfunc(getoption(work))/file1.xml" xmlmap=sxlemap access=readonly;
libname out "C:\Output";

data out.xml_contents;
 set my_xml.test;
run;

It's the intermediate data _null_ step I'm a little shaky on. So that DATA step basically tells SAS to pluck the XML file from inside the zipped folder and stick it in the temporary WORK directory? That then lets me point a LIBNAME at the copy in the WORK directory and use the XML engine to parse out the XML?
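
One way to convince yourself that the copy step did what is described above is to check for the file in the WORK directory before assigning the LIBNAME. This is only a small sketch. It assumes the same `file1.xml` target name used in the `filename xl` statement, and the `path` variable is made up for the example:

```sas
data _null_;
 length path $260;
 /* same location the "filename xl" statement points at */
 path=cats(getoption('work'), '/file1.xml');
 if fileexist(path) then
  put 'NOTE: extracted XML found at ' path;
 else
  put 'WARNING: no extracted XML at ' path;
run;
```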

4 REPLIES
ChrisHemedinger
Community Manager

You're almost there. You just need to add a middle step in which you use a DATA step to read the XML file out of the ZIP archive and write it to a temp space in your session. Then you can use LIBNAME XMLV2 to read the XML as data.

 

I have a similar example with an Excel file in this blog post.

ChrisHemedinger
Community Manager

@athomson - Exactly!  Good job!

athomson
Fluorite | Level 6

Hi Chris (and anyone else reading this post)!

 

We've made some pretty good progress so far in trying to "macro-tize" the above so that it will read into multiple zipped folders and then parse out multiple XML files into records in a dataset. 

 

See the post below to offer any thoughts how we can overcome a couple of problems, and even make the macro run more efficiently!

 

https://communities.sas.com/t5/General-SAS-Programming/Improving-a-macro-that-parses-out-multiple-XM...

 

Thanks! 
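
For readers who want a starting point for the "macro-tize" idea, here is a rough sketch of how the single-file steps from the accepted solution might be wrapped in a macro and driven from `folder_contents`. It is not the macro from the linked thread. The macro name `read_zip_xml`, its parameters, the `tmpxml` libref, the `_tmp_member.xml` and `work._one` scratch names, and the `work.all_xml` output are all hypothetical. It also assumes every XML file lives in `sub_folder` and maps to a table named `test`, as in the posts above.

```sas
/* Assumed setup from the earlier posts */
filename inzip ZIP "C:\input_data\data.zip";
filename SXLEMAP "C:\xml_map\my_map2.map";

%macro read_zip_xml(member=, out=work.all_xml);
 /* scratch copy of one archive member in the WORK directory */
 filename xl "%sysfunc(getoption(work))/_tmp_member.xml";

 /* byte-for-byte copy out of the ZIP, same pattern as the accepted solution */
 data _null_;
  infile inzip(&member) lrecl=256 recfm=F length=length eof=eof unbuf;
  file xl lrecl=256 recfm=N;
  input;
  put _infile_ $varying256. length;
  return;
 eof:
  stop;
 run;

 libname tmpxml XMLV2 "%sysfunc(getoption(work))/_tmp_member.xml"
         xmlmap=SXLEMAP access=readonly;

 /* read the mapped table, then append it to the running result */
 data work._one;
  set tmpxml.test;
 run;

 proc append base=&out data=work._one force;
 run;

 libname tmpxml clear;
 filename xl clear;
%mend read_zip_xml;

/* one macro call per XML member listed in folder_contents */
data _null_;
 set folder_contents;
 where upcase(scan(memname, -1, '.'))='XML';
 call execute(cats('%nrstr(%read_zip_xml(member=sub_folder\', memname, '))'));
run;
```

The `%nrstr` wrapper delays macro execution until after the driving DATA step finishes, so each generated call runs as its own sequence of steps. Copying one member at a time keeps only a single extracted file in WORK at any moment, at the cost of one LIBNAME assignment per file.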
