Alexxxxxxx
Pyrite | Level 9

Hello all,

 

How can I import a series of XML files into SAS? A sample of the XML file is attached.

 

I have tried two methods and got the following results.

First method:

filename xx temp;
libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
proc copy in=xx out=xx;
run;
Result:

1    filename xx temp;
2    libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
3    proc copy in=xx out=xx;
NOTE: Writing HTML Body file: sashtml.htm
4    run;
ERROR: Libref XX is not assigned.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
      real time           0.38 seconds
      cpu time            0.28 seconds
NOTE: The SAS System stopped processing this step because of errors.
NOTE: Parsing with high validation.
WARNING: XMLMap parser encountered XML issue
         Exception class: org.xml.sax.SAXParseException
         ID: <null>
         Message: schema_reference.4: Failed to read schema document
         '../../main/resources/Schema/USPatent/Document/PatentBulkData_V8_0.xsd', because
         1) could not find the document; 2) the document could not be read; 3) the root element
         of the document is not <xsd:schema>.
         Line: 2 Column: 456

Second method:

libname myxml xml 'F:\USPTOPatentExaminationDataSystem\1960-1979-pairbulk-full-20201213-xml\1960.xml';
libname dat 'I:\Data_in_SAS\USPTO';
data dat.subjects;
 set myxml.subjects;
run;
proc print data = dat.subjects noobs;
run; 

Result:

12   libname myxml xml 'F:\USPTOPatentExaminationDataSystem/1960.xml';
NOTE: Libref MYXML was successfully assigned as follows:
      Engine:        XML
      Physical Name: F:\USPTOPatentExaminationDataSystem/1960.xml
13   libname dat 'I:\Data_in_SAS\USPTO';
NOTE: Libref DAT was successfully assigned as follows:
      Engine:        V9
      Physical Name: I:\Data_in_SAS\USPTO
14   data dat.subjects;
15    set myxml.subjects;
ERROR: The XML element name <RegisteredPractitionerRegistrationNumber> is too long for a SAS variable
       name.

ERROR: Encountered during XMLMap parsing at or near line 643, column 67.
ERROR: XML describe error: Internal processing error.
16   run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set DAT.SUBJECTS may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set DAT.SUBJECTS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.05 seconds
      cpu time            0.00 seconds


17   proc print data = dat.subjects noobs;
18   run;

NOTE: No variables in data set DAT.SUBJECTS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds





Could you please give me some advice about this? Many thanks in advance.

 

 

1 ACCEPTED SOLUTION
Ksharp
Super User

That means your XML file is not correct.

What do you want to see from this XML?

It would be better if you posted the table you want to get.

 

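/* Read the XML as plain text; termstr=lf assumes LF line endings */
filename x 'c:\temp\1960.xml' termstr=lf;
data have;
  infile x truncover length=len;
  /* read each line at its actual length (up to 2000 characters) */
  input have $varying2000. len;
  /* start a new group at each application number element */
  if strip(have) =: '<uscom:ApplicationNumberText' then group+1;
  /* remove every <...> tag, keeping only the element text */
  want=prxchange('s/\<.+?\>//',-1,have);
  /* keep only the lines that still contain text */
  if not missing(want);
run;

/* one row per group, one column per extracted value */
proc transpose data=have out=want;
  by group;
  var want;
run;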
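
To see what the PRXCHANGE call does to a single line, here is a minimal sketch. The sample line is hypothetical: the element name comes from the code above, but the value 12345678 is made up for illustration.

```sas
data _null_;
  /* hypothetical sample line; the value 12345678 is invented */
  have = '<uscom:ApplicationNumberText>12345678</uscom:ApplicationNumberText>';
  /* -1 means replace every match; .+? is lazy, so each tag is matched separately */
  want = prxchange('s/\<.+?\>//', -1, have);
  /* writes want=12345678 to the log */
  put want=;
run;
```

Note that lines holding only tags become blank and are dropped by the subsetting IF, and that if one line holds several elements their text is joined without a separator.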


2 REPLIES 2
ChrisNZ
Tourmaline | Level 20

Couldn't you simply pre-process the file to change the string "RegisteredPractitionerRegistrationNumber"?
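
A rough sketch of that pre-processing step, under some assumptions: the output file name and the short replacement name PractitionerRegNumber are hypothetical, termstr=lf assumes LF line endings, and no line is longer than 32767 bytes. SAS variable names are limited to 32 characters, and the original element name has 40.

```sas
/* Input path is taken from the question; the output path is hypothetical. */
filename src 'F:\USPTOPatentExaminationDataSystem\1960.xml' termstr=lf lrecl=32767;
filename fix 'F:\USPTOPatentExaminationDataSystem\1960_short.xml' termstr=lf lrecl=32767;

data _null_;
  infile src;
  file fix;
  /* load the current line into the _infile_ buffer without creating variables */
  input;
  /* PractitionerRegNumber is a made-up shorter name (21 characters) */
  _infile_ = tranwrd(_infile_,
                     'RegisteredPractitionerRegistrationNumber',
                     'PractitionerRegNumber');
  /* write the (possibly modified) line to the new file */
  put _infile_;
run;
```

The LIBNAME statement would then point at the new file. Other element names longer than 32 characters, if the file has any, would likely trigger the same error and need the same treatment.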

 

 

