BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jaelee22
Fluorite | Level 6

Hello everyone. I need the help of experts here. 

 

I have created a XML to map to read the data and organize it in the way I need it but I have thousands of .xml files to read. I saw some codes in setting a macro to read the xml files but I can't figure out how to combine it with mine. So far this is what I have. Please help/ 

 

/*Environment*/
filename SXLELIB 'C:\Users\leejae\Desktop\xml testing\Burns^Gregory^^725907^19721109^Male.xml';
filename SXLEMAP 'C:\Users\leejae\Desktop\xml testing\map2.map';
libname SXLELIB xmlv2 xmlmap=SXLEMAP access=READONLY;
/*Catalog*/
proc datasets lib=SXLELIB; run;
/*Contents*/
proc contents data=SXLELIB.PATIENT varnum; run;
proc contents data=SXLELIB.SCAN varnum; run;
proc contents data=SXLELIB.TRACKINGDETAILS varnum; run;
/*Printing*/
title 'Table PATIENT';
proc print data=SXLELIB.PATIENT; run;
title;
title 'Table SCAN';
proc print data=SXLELIB.SCAN; run;
title;
title 'Table TRACKINGDETAILS';
proc print data=SXLELIB.TRACKINGDETAILS; run;
title;
/*Local Extraction*/
DATA PATIENT; SET SXLELIB.PATIENT; run;
DATA SCAN; SET SXLELIB.SCAN; run;
DATA TRACKINGDETAILS; SET SXLELIB.TRACKINGDETAILS; run;

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User
740 +%read_one(filename="&pathWong^Denise^^1039423^19600330^Female.xml" );

You told SAS to look for a macro variable named PATHWONG and it is not finding it.  Probably because you never defined the macro variable PATH.

 

It might be easier for you to eliminate reference to the macro variable and just copy the actual path twice.  Note two copies of the path name is needed because of the way the DIR command on Windows returns only the terminal file names when you use the /B option even when you included the path in the call.  Maybe it is time to switch to Unix. 🙂

data filelist;
  infile 'dir "C:\Users\leejae\Desktop\xml testing\*.xml" /b' pipe truncover ;
  input filename $256.;
  filename=cats("C:\Users\leejae\Desktop\xml testing\",filename);
run;

 

 

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

Not sure what part of the process you are having trouble with.

Looks like you have code to read data from ONE xml file. 

I would suggest converting that into a macro.  If the overall goal is to combine the three datasets (PATIENT, SCAN, etc) across all of the studies then build PROC APPEND step into the macro.

%macro read_one(filename);
filename SXLELIB &filename ;
filename SXLEMAP 'C:\Users\leejae\Desktop\xml testing\map2.map';
libname SXLELIB xmlv2 xmlmap=SXLEMAP access=READONLY;

proc append base=all_patient data=sxlelib.patient force;
run;
proc append base=all_site data=sxlelib.site force;
run;
proc append base=all_TRACKINGDETAILS data=sxlelib.TRACKINGDETAILS force;
run;
%mend read_one;

Then use a list of the files to call it multiple times.  So assume you have the list in a dataset named FILELIST with a variable named FILENAME you could use a step like this to generate one call to the macro for each file.

filename code temp;
data _null_;
  set filelist;
  file code;
  put '%read_one(' filename= :$quote.  ');' ;
run;

Then you can use %INCLUDE to run the generated code. 

%include code / source2;

While you are debugging/developing this you might want to remove the target datasets before running it again for all of them to avoid getting duplicate records.

proc delete data=all_partient all_site all_trackingdetails;
run;
%include code / source2;

If the data does not have any fields that can be used to tell which XML file the records came from you might want to make your macro more complex so that it includes a way to add such a variable to the dataset before appending it to the aggregate datasets.

jaelee22
Fluorite | Level 6
My problem is combining the data set in to one, but I would like to read all of the xml files instead of one. I understand that you suggested to create a macro. I don't have a list of my data set. Let me try to find out how to create one and run it. Thank you
Tom
Super User Tom
Super User

So if you filenames look like:

C:\some directory\Burns^Gregory^^725907^19721109^Male.xml

Then it looks like you are using Windows.  If SAS is also running on windows the easiest way to get the list is to use the DIR command.

%let path=C:\some directory\;
data filelist;
  infile "dir &path.*.xml /b" pipe truncover ;
  input filename $256.;
  filename=cats("&path",filename);
run;

If you cannot use the PIPE engine then search on this site for the 20 gazillion times the question of how to get a list of files has been asked and answered.

jaelee22
Fluorite | Level 6

I been doing some reading and trying different things. So far with what you have shared with me, I have gotten this far.

 

data filelist;
  infile 'dir "C:\Users\leejae\Desktop\xml testing\*.xml" /b' pipe truncover ;
  input filename $256.;
  filename=cats("&path",filename);
run;


%macro read_one(filename);
filename SXLELIB &filename ;
filename SXLEMAP 'C:\Users\leejae\Desktop\xml testing\map2.map';
libname SXLELIB xmlv2 xmlmap=SXLEMAP access=READONLY;

proc datasets lib=SXLELIB; run;

proc contents data=SXLELIB.SCAN varnum; run;

title 'Table SCAN';
proc print data=SXLELIB.SCAN; run;
title;

proc append base=all_SCAN data=sxlelib.SCAN force;
run;
%mend read_one;

filename code temp;
data _null_;
  set filelist;
  file code;
  put '%read_one(' filename= :$quote.  ');' ;
run;

So far I am able to read all of my files, but I doesn't seem to store any of the data. I appreciate all help. Thank you

Tom
Super User Tom
Super User
That code will generate a list of macro calls to the file CODE. You should see a note in the log from that data step that says it wrote XX lines to the file.
To run the generated code use the %INCLUDE. You might want to set the MPRINT option on first to see what lines of code the macro calls generate.
jaelee22
Fluorite | Level 6
data filelist;
  infile 'dir "C:\Users\leejae\Desktop\xml testing\*.xml" /b' pipe truncover ;
  input filename $256.;
  filename=cats("&path",filename);
run;


%macro read_one(filename);
filename SXLELIB &filename ;
filename SXLEMAP 'C:\Users\leejae\Desktop\xml testing\map2.map';
libname SXLELIB xmlv2 xmlmap=SXLEMAP access=READONLY;

proc datasets lib=SXLELIB; run;

proc contents data=SXLELIB.SCAN varnum; run;

title 'Table SCAN';
proc print data=SXLELIB.SCAN; run;
title;

proc append base=all_SCAN data=sxlelib.SCAN force;
run;
%mend read_one;

filename code temp;
data _null_;
  set filelist;
  file code;
  put '%read_one(' filename= :$quote.  ');' ;
run;



%include code / source2;

I am not understanding this. I never learned SAS formally, so I have a steep learning curve. I apologize. If I run the code with one file, I am able to out put the table and save the data. But when I run the macro, the shows that there are 0 observations, but it is reading all the files. 

 

 

740 +%read_one(filename="&pathWong^Denise^^1039423^19600330^Female.xml" );
WARNING: Apparent symbolic reference PATHWONG not resolved.
WARNING: Apparent symbolic reference PATHWONG not resolved.
NOTE: Processing XMLMap version 2.1.
NOTE: Libref SXLELIB was successfully assigned as follows:
Engine: XMLV2
Physical Name: C:\Users\leejae\Desktop\xml testing\&pathWong^Denise^^1039423^19600330^Female.xml
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds

 

NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds

 

NOTE: Access by observation number not available. Observation numbers will be counted by PROC PRINT.
NOTE: No observations in data set SXLELIB.SCAN.
NOTE: There were 0 observations read from the data set SXLELIB.SCAN.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

NOTE: Appending SXLELIB.SCAN to WORK.ALL_SCAN.
NOTE: There were 0 observations read from the data set SXLELIB.SCAN.
NOTE: 0 observations added.
NOTE: The data set WORK.ALL_SCAN has 0 observations and 25 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


741 +%read_one(filename="&pathXMLExport.xml" );
WARNING: Apparent symbolic reference PATHXMLEXPORT not resolved.
WARNING: Apparent symbolic reference PATHXMLEXPORT not resolved.
NOTE: Processing XMLMap version 2.1.
NOTE: Libref SXLELIB was successfully assigned as follows:
Engine: XMLV2
Physical Name: C:\Users\leejae\Desktop\xml testing\&pathXMLExport.xml
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.00 sec

 

I saw a different user post a code, but this one doesn't work for me either. I am confused and lost. 

*create macro to read all file names in folder*/
filename prueba  pipe 'dir "C:\Users\leejae\Desktop\xml testing\*.xml" /b';

data file_list; 
length fname $200; 
infile prueba truncover; /* infile statement for file names */
input fname $200.; /* read the file names from the directory */
call symput ('num_files',_n_); /* store the record number in a macro variable */
run; 


%macro fileread; 
%do j=1 %to &num_files; 

data _null_; 
set file_list; 
if _n_=&j; 
call symput ('filein',fname); 
run; 


/*code to retrieve xml map created*/
filename  SXLELIB 'C:\Users\leejae\Desktop\xml testing\&filein..xml';
filename  SXLEMAP 'C:\Users\leejae\Desktop\xml testing\map2.map';
libname   SXLELIB xmlv2 xmlmap=SXLEMAP access=READONLY ;

 
proc datasets lib=SXLELIB; run;

proc contents data=sxlelib._all_ varnum; run;

/*copy sxlelib to your library*/

proc copy in=sxlelib out=work;
run;

/*save each file read named 'alerta' as in xml mapper in a single sas dataset call data_all*/
%if &j=1 %then %do;
 data data_all;
 set SCAN;
 run;
%end;
%else %do;
 data data_all;
 set data_all
 SCAN;
 run;
%end;


%end; 
%mend fileread; 
%fileread; 
Tom
Super User Tom
Super User
740 +%read_one(filename="&pathWong^Denise^^1039423^19600330^Female.xml" );

You told SAS to look for a macro variable named PATHWONG and it is not finding it.  Probably because you never defined the macro variable PATH.

 

It might be easier for you to eliminate reference to the macro variable and just copy the actual path twice.  Note two copies of the path name is needed because of the way the DIR command on Windows returns only the terminal file names when you use the /B option even when you included the path in the call.  Maybe it is time to switch to Unix. 🙂

data filelist;
  infile 'dir "C:\Users\leejae\Desktop\xml testing\*.xml" /b' pipe truncover ;
  input filename $256.;
  filename=cats("C:\Users\leejae\Desktop\xml testing\",filename);
run;

 

 

jaelee22
Fluorite | Level 6
Thank you so much Tom. I worked.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1475 views
  • 4 likes
  • 2 in conversation