Hello all,
May I ask how to import a series of XML files into SAS? (A sample XML file is attached.)
I have tried two methods and got the following results.
First method:
filename xx temp;
libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
proc copy in=xx out=xx;
run;
Result:
1 filename xx temp;
2 libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
3 proc copy in=xx out=xx;
NOTE: Writing HTML Body file: sashtml.htm
4 run;
ERROR: Libref XX is not assigned.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
      real time 0.38 seconds
      cpu time 0.28 seconds
NOTE: The SAS System stopped processing this step because of errors.
NOTE: Parsing with high validation.
WARNING: XMLMap parser encountered XML issue
Exception class: org.xml.sax.SAXParseException
ID: <null>
Message: schema_reference.4: Failed to read schema document '../../main/resources/Schema/USPatent/Document/PatentBulkData_V8_0.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
Line: 2
Column: 456
Second method:
libname myxml xml 'F:\USPTOPatentExaminationDataSystem\1960-1979-pairbulk-full-20201213-xml\1960.xml';
libname dat 'I:\Data_in_SAS\USPTO';
data dat.subjects;
set myxml.subjects;
run;
proc print data = dat.subjects noobs;
run;
Result:
12 libname myxml xml 'F:\USPTOPatentExaminationDataSystem/1960.xml';
NOTE: Libref MYXML was successfully assigned as follows:
      Engine: XML
      Physical Name: F:\USPTOPatentExaminationDataSystem/1960.xml
13 libname dat 'I:\Data_in_SAS\USPTO';
NOTE: Libref DAT was successfully assigned as follows:
      Engine: V9
      Physical Name: I:\Data_in_SAS\USPTO
14 data dat.subjects;
15 set myxml.subjects;
ERROR: The XML element name <RegisteredPractitionerRegistrationNumber> is too long for a SAS variable name.
ERROR: Encountered during XMLMap parsing at or near line 643, column 67.
ERROR: XML describe error: Internal processing error.
16 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set DAT.SUBJECTS may be incomplete. When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set DAT.SUBJECTS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time 0.05 seconds
      cpu time 0.00 seconds
17 proc print data = dat.subjects noobs;
18 run;
NOTE: No variables in data set DAT.SUBJECTS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds
Could you please give me some advice about this? Many thanks in advance.
That means your XML file is not correct.
What do you want to get from this XML?
It would be better to post the table you want to end up with.
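filename x 'c:\temp\1960.xml' termstr=lf;
data have;
  /* Read the XML as plain text, one line per record */
  infile x truncover length=len;
  input have $varying2000. len;
  /* Each <uscom:ApplicationNumberText> element starts a new group */
  if strip(have) =: '<uscom:ApplicationNumberText' then group+1;
  /* Remove every tag, keeping only the text content */
  want=prxchange('s/\<.+?\>//',-1,have);
  /* Drop lines that contained only tags */
  if not missing(want);
run;
/* One row per group, one column per extracted value */
proc transpose data=have out=want;
  by group;
  var want;
run;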
Couldn't you simply pre-process the file to replace the string "RegisteredPractitionerRegistrationNumber" with a shorter name?
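A minimal sketch of that pre-processing idea, under some assumptions: the input path is the one used in the other reply, the output file name and the replacement name RegPractRegNumber are hypothetical choices, and no line in the file is longer than 32767 bytes. It copies the file line by line and shortens the element name so it fits within the 32-character limit for SAS variable names. Other element names in the file may also be too long, and this does not address the schema warning from the first method.

```sas
/* Input path as in the other reply; output file name is a hypothetical choice */
filename xin  'c:\temp\1960.xml' termstr=lf;
filename xout 'c:\temp\1960_short.xml' termstr=lf;

data _null_;
  length line $32767;
  infile xin lrecl=32767 truncover;
  file xout lrecl=32767;
  input;  /* load the current record into _infile_ */
  /* Shorten the 40-character element name; the new name is hypothetical */
  line = tranwrd(_infile_, 'RegisteredPractitionerRegistrationNumber', 'RegPractRegNumber');
  len = lengthn(line);
  /* Write the line without trailing blanks */
  put line $varying32767. len;
run;

/* Point the XML engine at the rewritten copy instead of the original */
libname myxml xml 'c:\temp\1960_short.xml';
```

Because the name appears in both the opening and the closing tag, TRANWRD changes both and the XML stays well-formed. If the original name is needed later, the resulting variable can be given a descriptive label after import.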