Alexxxxxxx
Pyrite | Level 9

Hello all,

 

May I ask how to import a series of XML files into SAS? (A sample of the XML file is attached.)

 

I have tried two methods and got the following results.

First method:

filename xx temp;
libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
proc copy in=xx out=xx;
run;
Result:

1    filename xx temp;
2    libname xx xmlv2 'F:\USPTOPatentExaminationDataSystem/1960.xml' automap=replace xmlmap=xx;
ERROR: The creation of the XML Mapper file failed.
ERROR: Error in the LIBNAME statement.
3    proc copy in=xx out=xx;
NOTE: Writing HTML Body file: sashtml.htm
4    run;
ERROR: Libref XX is not assigned.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
      real time           0.38 seconds
      cpu time            0.28 seconds
NOTE: The SAS System stopped processing this step because of errors.
NOTE: Parsing with high validation.
WARNING: XMLMap parser encountered XML issue
Exception class: org.xml.sax.SAXParseException
ID: <null>
Message: schema_reference.4: Failed to read schema document '../../main/resources/Schema/USPatent/Document/PatentBulkData_V8_0.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
Line: 2
Column: 456

Second method:

libname myxml xml 'F:\USPTOPatentExaminationDataSystem\1960-1979-pairbulk-full-20201213-xml\1960.xml';
libname dat 'I:\Data_in_SAS\USPTO';
data dat.subjects;
 set myxml.subjects;
run;
proc print data = dat.subjects noobs;
run; 

Result:

12   libname myxml xml 'F:\USPTOPatentExaminationDataSystem/1960.xml';
NOTE: Libref MYXML was successfully assigned as follows:
      Engine:        XML
      Physical Name: F:\USPTOPatentExaminationDataSystem/1960.xml
13   libname dat 'I:\Data_in_SAS\USPTO';
NOTE: Libref DAT was successfully assigned as follows:
      Engine:        V9
      Physical Name: I:\Data_in_SAS\USPTO
14   data dat.subjects;
15    set myxml.subjects;
ERROR: The XML element name <RegisteredPractitionerRegistrationNumber> is too long for a SAS variable
       name.

ERROR: Encountered during XMLMap parsing at or near line 643, column 67.
ERROR: XML describe error: Internal processing error.
16   run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set DAT.SUBJECTS may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set DAT.SUBJECTS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.05 seconds
      cpu time            0.00 seconds


17   proc print data = dat.subjects noobs;
18   run;

NOTE: No variables in data set DAT.SUBJECTS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds





Could you please give me some advice about this? Many thanks in advance.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

That means your XML file is not correct.

What do you want to extract from this XML?

It would be better if you posted the table you want to get.

 

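/* Read the XML as plain text, one line per record (LF line endings). */
filename x 'c:\temp\1960.xml' termstr=lf;
data have;
  infile x truncover length=len;
  input have $varying2000. len;
  /* Each <uscom:ApplicationNumberText> element starts a new group. */
  if strip(have) =: '<uscom:ApplicationNumberText' then group+1;
  /* Strip every tag, keeping only the text content. */
  want=prxchange('s/\<.+?\>//',-1,have);
  /* Keep only the lines that still contain text. */
  if not missing(want);
run;

/* One row per group, one column per text value. */
proc transpose data=have out=want;
  by group;
  var want;
run;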
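
Since the question asks about a series of XML files, the same text-parsing approach can probably be extended with a wildcard fileref. The following is only a sketch under assumptions: that all files sit in c:\temp, use LF line endings, and share the layout of 1960.xml. The names xs, have_all, want_all, fname and source are made up for illustration.

```sas
/* Assumption: every *.xml file in c:\temp has the same layout as 1960.xml. */
filename xs 'c:\temp\*.xml' termstr=lf;

data have_all;
  length fname source $260;
  /* FILENAME= exposes the file currently being read. */
  infile xs truncover length=len filename=fname;
  input have $varying2000. len;
  /* FILENAME= variables are not written to the data set, so copy the value. */
  source = fname;
  if strip(have) =: '<uscom:ApplicationNumberText' then group+1;
  want = prxchange('s/\<.+?\>//', -1, have);
  if not missing(want);
run;

/* NOTSORTED: rows are already grouped in read order, not sorted by SOURCE. */
proc transpose data=have_all out=want_all;
  by group source notsorted;
  var want;
run;
```

Keeping source in the BY statement records which file each transposed row came from. The group counter simply keeps increasing across files.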


2 REPLIES 2
ChrisNZ
Tourmaline | Level 20

Couldn't you simply pre-process the file to shorten the string "RegisteredPractitionerRegistrationNumber"? At 40 characters, it exceeds the 32-character limit on SAS variable names.
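
A minimal sketch of that pre-processing idea, written as a DATA step that copies the file line by line and swaps in a shorter element name. The paths, the output file name 1960_short.xml and the replacement name PracRegNumber are all assumptions chosen for illustration, and the step assumes no line is longer than 32,767 bytes.

```sas
/* Assumed paths; PracRegNumber is a hypothetical short replacement name. */
filename xmlin  'F:\USPTOPatentExaminationDataSystem\1960.xml' termstr=lf;
filename xmlout 'F:\USPTOPatentExaminationDataSystem\1960_short.xml';

data _null_;
  length line $32767;
  infile xmlin truncover length=len lrecl=32767;
  file xmlout lrecl=32767;
  input line $varying32767. len;
  /* Replaces the name in both the opening and the closing tag. */
  line = tranwrd(line, 'RegisteredPractitionerRegistrationNumber', 'PracRegNumber');
  /* Write the line back without padding it to the full variable length. */
  outlen = lengthn(line);
  put line $varying32767. outlen;
run;

/* Then point the XML engine at the rewritten copy. */
libname myxml xml 'F:\USPTOPatentExaminationDataSystem\1960_short.xml';
```

Other element names in the file may also exceed 32 characters, in which case the engine would stop at the next one and the TRANWRD step would need one more replacement per name.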

 

 
