Hey everyone,
I am facing an issue. I have a folder named new files that contains about 10,000 XML files of clinical trial data. I downloaded them as a zipped file and later unzipped it. At first I tried using a macro to import the XML files directly from the zipped folder, but I was unable to do so. I used code available on the SAS Communities site; it ran without errors but did not produce any results. So I unzipped the folder and tried importing the contents as multiple files instead, but had no success. Can anyone please suggest how to deal with this?
Below is the code that I used:
filename inzip ZIP "/folders/myfolders/xml ct.zip";
data folder_contents;
length memname $200 isFolder 8;
fid=dopen("inzip");
if fid=0 then
stop;
memcount=dnum(fid);
do i=1 to memcount;
/*Scans and gives only the name of the xml file*/
memname=scan(dread(fid, i), 2, '\');
/*check for trailing / in folder name */
output;
end;
rc=dclose(fid);
run;
title "Files in the ZIP XML file";
proc print data=folder_contents noobs N;
run;
filename SXLEMAP "/folders/myfolders/search1.map";
libname myxml XMLV2 "%sysfunc(getoption(work))/searchResultsct.xm" xmlmap = SXLEMAP;
libname out "/folders/myfolders";
data out.xml_contents;
set myxml.contents ;
run;
Are the XML files all the same structure or different? How many do you have? Can you read one of the files? Can you attach an example file for us to test?
hey,
Yes, all the XML files have the same structure, and I am able to read a single XML file by importing it with XML Mapper. I have around 10,000 files in a folder. Below is the code I used:
filename datafile "/folders/myfolders/abc.xml";
filename mapfile "/folders/myfolders/abc.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;
proc copy in=datafile out=work;
run;
To simplify the code, I would start by unpacking the zip file so that the XML files are ready to process.
You need some kind of loop in which each file is read separately and its output is appended to a result data set. The loop takes a file list as input. A loop over 10,000 files will take a long time and produce a very large log, so I recommend experimenting with a small loop count first. Once everything works, redirect the log from the loop to an external file, which can be checked for errors afterwards with grep.
The following code could be a starting point. It is not tested, but it uses a well-proven macro technique. It can be done in many ways, and other contributors might come up with solutions that avoid macros.
* get xml file list and keep number of files in macro var;
filename inxml "/folders/myfolders/xmlfiles";
data folder_contents;
length memname $200;
fid=dopen("inxml");
if fid=0 then stop;
memcount=dnum(fid);
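do i=1 to memcount;
memname=dread(fid, i);
/* keep only .xml members, ignoring case */
if lowcase(scan(memname,-1,'.')) = 'xml' then do;
xml_count+1;
output;
end;
end;
rc=dclose(fid);
/* store the number of XML files kept, not the total member count */
call symputx('file_count',xml_count);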
run;
* initiate result file - structure as in XML map;
libname reslib "/folders/myfolders/data/....";
data reslib.result;
length a 8 b 8 ....;
stop;
run;
* Macro to read files and append result;
%macro readfile(file);
filename datafile "/folders/myfolders/xmlfiles/&file";
filename mapfile "/folders/myfolders/abc.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;
proc copy in=datafile out=work;
run;
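/* clear the libref before the fileref it depends on */
libname datafile clear;
filename datafile clear;
/* work.dataset is a placeholder for the table name defined in the XML map */
proc append base=reslib.result data=work.dataset;
run;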
%mend;
* redirect log - to be used when everything works ;
proc printto log="/folders/myfolders/readlog.log";
run;
* limit loop count for testing;
%let file_count = 5;
* loop macro;
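%macro readloop;
%local i thisfile;
%do i = 1 %to &file_count;
/* pick row &i of the file list and store its name without trailing blanks */
data _null_; set folder_contents(firstobs=&i obs=&i);
call symputx('thisfile',memname);
run;
/* pass the name unquoted; readfile puts the quotes around the full path */
%readfile(&thisfile);
%end;
%mend;
%readloop;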
* direct log back - to be used when everything works ;
proc printto;
run;
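If grep is not at hand, the redirected log can also be checked from within SAS. The step below is only an untested sketch: it assumes the log was written to /folders/myfolders/readlog.log as above and that PROC PRINTTO has already been reset so the file is closed. The names log_problems, logline and lineno are made up for the illustration.

```sas
/* Sketch: list ERROR and WARNING lines from the redirected log. */
data log_problems;
  infile "/folders/myfolders/readlog.log" truncover lrecl=32767;
  input logline $char256.;
  lineno = _n_;                       /* line number within the log file */
  /* =: compares only the leading characters, so this keeps lines
     that start with ERROR or WARNING */
  if logline =: 'ERROR' or logline =: 'WARNING';
run;

proc print data=log_problems noobs;
run;
```

An empty log_problems data set suggests the loop ran cleanly, although NOTE lines (for example about uninitialized variables) are not caught by this filter.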
Thanks a lot for this. I am going to try this code and see how it turns out.
@Appy18 wrote:
hey,
Yes, all the XML files have the same structure, and I am able to read a single XML file by importing it with XML Mapper. I have around 10,000 files in a folder. Below is the code I used:
filename datafile "/folders/myfolders/abc.xml";
filename mapfile "/folders/myfolders/abc.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;
proc copy in=datafile out=work;
run;
@ErikLund_Jensen has posted a solution, and that will work.
However, once you have working code, it's relatively easy to convert it to a macro to repeat a process. I've written a short tutorial here that you can try with your code if you'd like. The only difference between that and @ErikLund_Jensen's solution is the last step, where I recommend using CALL EXECUTE to call your macro rather than another macro loop.
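As a rough, untested sketch of the CALL EXECUTE approach: it assumes the folder_contents data set (with the variable memname) and a working %readfile macro from the earlier reply already exist, and that the file names contain no commas or parentheses. The obs=5 limit is only there for testing.

```sas
/* Sketch: generate one %readfile call per row of the file list. */
data _null_;
  set folder_contents(obs=5);   /* remove obs=5 once the test run works */
  /* %nrstr delays macro execution until after this data step finishes,
     so each generated call runs as ordinary code, one after another */
  call execute(cats('%nrstr(%readfile)(', memname, ')'));
run;
```

The file list itself drives the calls here, so no macro variable holding the file count and no %DO loop are needed.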
Thanks a lot. I am going to try both methods and see what works for me.