I encountered a very difficult task,
In the local sql server, our IT people stored received XML file (sizes:2k~232k) in sql server xml columns. And I would like to export these files out and concatenate them into one single txt file, or let sas read the xml from each cell separately and parse them out.
I ran into a few questions:
1. How to export those xml files out and store into a single txt file? It seems sas allows only 32k in each cell.
2.How to maintain the length of the xml file in each cell? sas will truncate the length....
Here's what I understand from your post:
1) you either have XML files on disk OR you have some kind of XML information stored in an SQL server database
2) you want the XML files or information to become a single txt file
3) you want to do "convert" the files or information from XML to TXT using SAS
4) you seem to have done some experimentation of some kind because the issue of a 32K size limit is mentioned
5) last, you asked something confusing: "how to maintain the length of the xml FILE in each CELL" (This seems to imply that you have an ENTIRE XML file in one table cell or database cell??? That seems possible, but odd to me. You might have some XML data or data in XML form with XML tags stored in a data base.)
XML is generally stored in an ASCII text file. The file has a .XML extension instead of a .TXT extension. Generally, an XML file conforms to certain rules:
1) the XML file should have an XML processing instruction or identifying tag at the top of the file, something like this:
2) the XML file should have 1 and only 1 root tag
More advanced uses of XML exist -- such as shopping carts or on-line forms -- where you do something on a web page and your information is translated to XML form and then sent to a server program that accepts the XML "stream" and uses it as input to a program. It's possible that the XML info in this case is stored in a data base -- but it should be possible using data base utilities to get the XML from data form into ASCII text form.
Anyway, back to the simpler question of combining XML files, if this XML file is stored on my C: drive in this location: c:\temp\mygifts.xml
<?xml version="1.0" ?>
<NAME> alan </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> CD </GIFTIDEA>
<NAME> bob </NAME>
<AGE> 16 </AGE>
<GIFTIDEA> Games </GIFTIDEA>
<NAME> carla </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> Jewelry </GIFTIDEA>
<NAME> dave </NAME>
<AGE> 14 </AGE>
<GIFTIDEA> iTunes </GIFTIDEA>
AND, if this OTHER file is stored in c:\temp\moregifts.xml:
....Since these are both "regular", non-hierarchical XML files, I can use the SAS XML Libname engine to read the XML files into a single SAS dataset:
libname onefile xml 'c:\temp\mygifts.xml';
libname twofile xml 'c:\temp\moregifts.xml';
libname onefile clear;
libname twofile clear;
proc print data=work.giftidea;
title 'SAS dataset from XML file';
And then, once the files are together in a SAS dataset (4 obs from the first XML file and 3 obs from the second XML file), I could run a PROC EXPORT to export it to CSV or any other kind of file supported by PROC EXPORT. I could even write my own DATA step program to write the SAS dataset to a flat file. For valid, non-hierarchical XML file, I could have read the file with the XML Libname engine and then written it out in one step with the data step, but I preferred to show the intermediate stages along the way. Here's a sample PROC EXPORT:
proc export data=work.giftidea
If you have non-standard or hierarchical XML files, then you may have to use an XMLMAP to "map" from the non-standard XML to SAS column names. I know that if the XML files are in ASCII text files on disk, then SAS can either read them with the SAS XML Libname engine or it can read them with the Libname engine and an XMLMAP.
If your XML is not in identifiable .XML files and/or if you are asking whether SAS can read XML out of the SQL server database directly -- I don't know the answer to that. It seems that Tech Support might be your best resource to answer that question.
Cynthia gave you already lotz of interesting information.
From what you tell I assume that these XML files are stored in SQL in a field with type CLOB.
The first question you have to solve is how to read CLOB data >32K with SAS.
Well: I couldn't find an answer to that. I was thinking about streaming but couldn't find an example. Hopefully someone has an answer to that.
As Cynthia mentioned XML files are text. Would it be possible that your IT guys give you access to the XML files before they load it into SQL - or that they export these XML files later from SQL to a text file? A soon as you have these XML files as text files (that's what an XML file normally is: A text file with extension XML in a defined structure which can be interpreted by an XML parser - i.e. InternetExplorer) you can read/convert/interprete it with SAS. Cynthia pointed you in the right direction for that.
As I understand it this combining of XML files is more a combining of data these XML files contain. In my understanding it wouldn't make sense to concatenate the XML files - what you might want to do is to read all the XML files one after the other and concatenate the resuls into one single SAS dataset (proc append?).