Appy18
Calcite | Level 5

Hey Everyone,

I am facing an issue. I have a folder named "new files" that contains multiple XML files of clinical trial data (about 10,000 files). I originally downloaded it as a zipped file and later unzipped it. At first I tried using a macro to import the zipped XML files, with code available on the SAS Community site, but it did not work: the code produced no result, even though the log showed no errors. So I unzipped the folder and tried importing the contents as multiple files instead, but had no success with that either. Can anyone please suggest how to deal with this?

Below is the code I used:

 

filename inzip ZIP "/folders/myfolders/xml ct.zip";

data folder_contents;
length memname $200 isFolder 8;
fid=dopen("inzip");

if fid=0 then
stop;
memcount=dnum(fid);

do i=1 to memcount;

/*Scans and gives only the name of the xml file*/
memname=scan(dread(fid, i), 2, '\');

/*check for trailing / in folder name */
output;
end;
rc=dclose(fid);
run;

title "Files in the ZIP XML file";

proc print data=folder_contents noobs N;
run;

filename SXLEMAP "/folders/myfolders/search1.map";
libname myxml XMLV2 "%sysfunc(getoption(work))/searchResultsct.xm" xmlmap = SXLEMAP;
libname out "/folders/myfolders";

data out.xml_contents;
set myxml.contents ;
run;

7 REPLIES
Reeza
Super User
Forget macros.

Are the XML files all the same structure or different? How many do you have? Can you read one of the files? Can you attach an example file for us to test?
Appy18
Calcite | Level 5

hey,

Yes, all the XML files have the same structure, and I am able to read a single XML file by importing it with XML Mapper. I have around 10,000 files in a folder. Below is the code I used:

filename datafile "/folders/myfolders/abc.xml";
filename mapfile "/folders/myfolders/abc.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;

proc copy in=datafile out=work;
run;

ErikLund_Jensen
Rhodochrosite | Level 12

To simplify the code, I would unpack the zip file as a start, so the xml files are ready to process.

 

You need some kind of loop, where each file is read separately and the output is appended to a result data set. The loop takes a file list as input. A loop over 10,000 files will take a long time and produce a very big log file, so I would recommend experimenting with a small loop count first. Then, when everything works, redirect the log from the loop to an external file that can be checked for errors with grep afterwards.

 

The following code could be a starting point. It is not tested, but uses a well-proven macro technique. It can be done in many ways, and other contributors might come up with solutions to avoid macros.

 

* get xml file list and keep number of xml files in macro var;
filename inxml "/folders/myfolders/xmlfiles";
data folder_contents;
	length memname $200;
	fid=dopen("inxml");
	if fid=0 then stop;
	memcount=dnum(fid);
	xmlcount=0;
	do i=1 to memcount;
		memname=dread(fid, i);
		/* keep only .xml members, ignoring case */
		if lowcase(scan(memname,-1,'.')) = 'xml' then do;
			xmlcount+1;
			output;
		end;
	end;
	rc=dclose(fid);
	/* store the number of rows written, not the total member count */
	call symputx('file_count',xmlcount);
run;

* initiate result file - structure as in XML map;
libname reslib "/folders/myfolders/data/....";
data reslib.result;
	length a 8 b 8 ....;
	stop;
run;

* Macro to read one file and append result;
%macro readfile(file);
	filename datafile "/folders/myfolders/xmlfiles/&file";
	filename mapfile "/folders/myfolders/abc.map";
	libname datafile xmlv2 xmlmap=mapfile automap=replace;
	proc copy in=datafile out=work;
	run;
	/* release the library before the fileref it depends on */
	libname datafile clear;
	filename datafile clear;

	/* work.dataset stands for the table name created by the XML map */
	proc append base=reslib.result data=work.dataset;
	run;
%mend;

* redirect log - to be used when everything works ;
proc printto log="/folders/myfolders/readlog.log";
run;

* limit loop count for testing;
%let file_count = 5;

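* loop macro;
%macro readloop;
	%do i = 1 %to &file_count;

		/* pick the i-th file name; symputx strips trailing blanks */
		data _null_; set folder_contents(firstobs=&i obs=&i);
			call symputx('thisfile',memname);
		run;

		/* pass the name unquoted: readfile adds the quotes itself */
		%readfile(&thisfile)
	%end;
%mend;
%readloop;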

* direct log back - to be used when everything works ;
proc printto;
run;
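
If grep is not at hand, the redirected log can also be checked from within SAS. The following is only a sketch, not tested against a real run: it assumes the log was written to /folders/myfolders/readlog.log by the PROC PRINTTO step above and that the log has already been directed back, so the file is closed. The data set name log_problems and the variables logline and line_no are made up for illustration.

```sas
/* Sketch only: scan the redirected log for ERROR and WARNING lines.   */
/* Assumes the log path used in the PROC PRINTTO step; adjust if yours */
/* differs. log_problems, logline and line_no are illustrative names.  */
data log_problems;
	infile "/folders/myfolders/readlog.log" truncover;
	input logline $char256.;
	line_no = _n_;   /* position in the log, to find the context later */
	/* =: compares only the leading characters of logline */
	if logline =: 'ERROR' or logline =: 'WARNING' then output;
run;

proc print data=log_problems noobs;
run;
```

An empty log_problems data set suggests the loop ran cleanly; otherwise line_no points to the place in the log file to inspect.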

Appy18
Calcite | Level 5

Thanks a lot for this. I am going to try this code and see how it turns out.

Reeza
Super User

@Appy18 wrote:

hey,

Yes, all the XML files have the same structure, and I am able to read a single XML file by importing it with XML Mapper. I have around 10,000 files in a folder. Below is the code I used:

filename datafile "/folders/myfolders/abc.xml";
filename mapfile "/folders/myfolders/abc.map";
libname datafile xmlv2 xmlmap=mapfile automap=replace;

proc copy in=datafile out=work;
run;

 


@ErikLund_Jensen has posted a solution, and that will work.

 

However, once you have working code, it's relatively easy to convert it into a macro that repeats the process. I've written a short tutorial here that you can try with your code if you'd like. The only difference between that and @ErikLund_Jensen's solution is the last step, where I recommend using CALL EXECUTE to call your macro rather than another macro loop.
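
As a rough illustration of that last step, here is a minimal, untested sketch. It assumes the folder_contents data set and a working %readfile macro from the reply above already exist in the session, and that the file names contain no commas or parentheses (those would need extra macro quoting).

```sas
/* Sketch only: drive %readfile from the file list with CALL EXECUTE.  */
/* Assumes folder_contents (one row per XML file, name in memname)     */
/* and the %readfile macro are already defined in this session.        */
data _null_;
	set folder_contents;   /* add (obs=5) here to test on a few files */
	/* %nrstr delays macro execution until this DATA step has ended,   */
	/* so each generated call runs as ordinary code, one after another */
	call execute(cats('%nrstr(%readfile(', memname, '))'));
run;
```

Each row of the file list queues one %readfile call, so no separate macro loop or file counter is needed.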

Appy18
Calcite | Level 5

Thanks a lot. I am going to try both methods and see what works for me.
