BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Guys,

I encountered a very difficult task,
In the local sql server, our IT people stored received XML file (sizes:2k~232k) in sql server xml columns. And I would like to export these files out and concatenate them into one single txt file, or let sas read the xml from each cell separately and parse them out.

I ran into a few questions:
1. How to export those xml files out and store into a single txt file? It seems sas allows only 32k in each cell.

2.How to maintain the length of the xml file in each cell? sas will truncate the length....


Thanks a lot for your insight, please help.
5 REPLIES 5
deleted_user
Not applicable
any suggestions are welcomed.
Cynthia_sas
SAS Super FREQ
Hi:
Here's what I understand from your post:
1) you either have XML files on disk OR you have some kind of XML information stored in an SQL server database
2) you want the XML files or information to become a single txt file
3) you want to do "convert" the files or information from XML to TXT using SAS
4) you seem to have done some experimentation of some kind because the issue of a 32K size limit is mentioned
5) last, you asked something confusing: "how to maintain the length of the xml FILE in each CELL" (This seems to imply that you have an ENTIRE XML file in one table cell or database cell??? That seems possible, but odd to me. You might have some XML data or data in XML form with XML tags stored in a data base.)

XML is generally stored in an ASCII text file. The file has a .XML extension instead of a .TXT extension. Generally, an XML file conforms to certain rules:
1) the XML file should have an XML processing instruction or identifying tag at the top of the file, something like this:
2) the XML file should have 1 and only 1 root tag

More advanced uses of XML exist -- such as shopping carts or on-line forms -- where you do something on a web page and your information is translated to XML form and then sent to a server program that accepts the XML "stream" and uses it as input to a program. It's possible that the XML info in this case is stored in a data base -- but it should be possible using data base utilities to get the XML from data form into ASCII text form.

Anyway, back to the simpler question of combining XML files, if this XML file is stored on my C: drive in this location: c:\temp\mygifts.xml
[pre]
<?xml version="1.0" ?>
<TABLE>
<MYGIFTS>
<NAME> alan </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> CD </GIFTIDEA>
</MYGIFTS>
<MYGIFTS>
<NAME> bob </NAME>
<AGE> 16 </AGE>
<GIFTIDEA> Games </GIFTIDEA>
</MYGIFTS>
<MYGIFTS>
<NAME> carla </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> Jewelry </GIFTIDEA>
</MYGIFTS>
<MYGIFTS>
<NAME> dave </NAME>
<AGE> 14 </AGE>
<GIFTIDEA> iTunes </GIFTIDEA>
</MYGIFTS>
</TABLE>
[/pre]

AND, if this OTHER file is stored in c:\temp\moregifts.xml:

[pre]
<?xml version="1.0" ?>
<TABLE>
<MORE>
<NAME> edna </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> iTUNES </GIFTIDEA>
</MORE>
<MORE>
<NAME> frank </NAME>
<AGE> 14 </AGE>
<GIFTIDEA> CD </GIFTIDEA>
</MORE>
<MORE>
<NAME> gail </NAME>
<AGE> 15 </AGE>
<GIFTIDEA> Books </GIFTIDEA>
</MORE>
</TABLE>

[/pre]

....Since these are both "regular", non-hierarchical XML files, I can use the SAS XML Libname engine to read the XML files into a single SAS dataset:
[pre]
libname onefile xml 'c:\temp\mygifts.xml';
libname twofile xml 'c:\temp\moregifts.xml';

data work.giftidea;
set onefile.mygifts
twofile.more;
run;

libname onefile clear;
libname twofile clear;

ods listing;
proc print data=work.giftidea;
title 'SAS dataset from XML file';
run;
[/pre]

And then, once the files are together in a SAS dataset (4 obs from the first XML file and 3 obs from the second XML file), I could run a PROC EXPORT to export it to CSV or any other kind of file supported by PROC EXPORT. I could even write my own DATA step program to write the SAS dataset to a flat file. For valid, non-hierarchical XML file, I could have read the file with the XML Libname engine and then written it out in one step with the data step, but I preferred to show the intermediate stages along the way. Here's a sample PROC EXPORT:
[pre]
proc export data=work.giftidea
outfile='c:\temp\flatfile.txt'
dbms=csv replace;
run;
[/pre]

If you have non-standard or hierarchical XML files, then you may have to use an XMLMAP to "map" from the non-standard XML to SAS column names. I know that if the XML files are in ASCII text files on disk, then SAS can either read them with the SAS XML Libname engine or it can read them with the Libname engine and an XMLMAP.

If your XML is not in identifiable .XML files and/or if you are asking whether SAS can read XML out of the SQL server database directly -- I don't know the answer to that. It seems that Tech Support might be your best resource to answer that question.

cynthia
Patrick
Opal | Level 21
Hi Wayne66

Cynthia gave you already lotz of interesting information.
From what you tell I assume that these XML files are stored in SQL in a field with type CLOB.
The first question you have to solve is how to read CLOB data >32K with SAS.
Well: I couldn't find an answer to that. I was thinking about streaming but couldn't find an example. Hopefully someone has an answer to that.

As Cynthia mentioned XML files are text. Would it be possible that your IT guys give you access to the XML files before they load it into SQL - or that they export these XML files later from SQL to a text file? A soon as you have these XML files as text files (that's what an XML file normally is: A text file with extension XML in a defined structure which can be interpreted by an XML parser - i.e. InternetExplorer) you can read/convert/interprete it with SAS. Cynthia pointed you in the right direction for that.

As I understand it this combining of XML files is more a combining of data these XML files contain. In my understanding it wouldn't make sense to concatenate the XML files - what you might want to do is to read all the XML files one after the other and concatenate the resuls into one single SAS dataset (proc append?).

Just guessing! Is it that what you're after?

Cheers, Patrick
As soon as
deleted_user
Not applicable
Patrick,

You just pointed the core and I realize that is eventually what I have to end up with.

Thanks a lot,

Wayne
deleted_user
Not applicable
Cynthia,

Thanks a lot for all your inputs, all these matters.

Patrick described more of my case, and all I have to do now is to have IT get the txt file for me.

Great Thanks,

Wayne

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1654 views
  • 0 likes
  • 3 in conversation