To create datasets by reading XML Metadata

Reply
N/A
Posts: 0

To create datasets by reading XML Metadata

My question is: I have the metadata to create SAS Datasets in XML format instead of CSV FILES. Should I use the XML LIBNAME engine for this purpose? Or how could this be acheived? Is their any other spl options for reading XML files thru SAS.

Thanks,
Priyash
SAS Super FREQ
Posts: 8,719

Re: To create datasets by reading XML Metadata

Hi:
If you worked for a company that made hammers and saws and asked this question, then I would just post the code below. However, since you posted to this forum for SAS and Clinical Trials -- if you are dealing with CDISC specification XML, then you might consider checking out PROC CDISC before you go all the way down the XML Libname Engine road.

If you have this kind of XML data:
[pre]
<?xml version="1.0" encoding="windows-1252" ?>
<TABLE>
<MyTable>
<name> Alvin </name>
<grp> aaa </grp>
<morestuff> 100 </morestuff>
</MyTable>
<MyTable>
<name> Bob </name>
<grp> bbb </grp>
<morestuff> 200 </morestuff>
</MyTable>
</TABLE>
[/pre]

Then, the code below will read the file into a SAS dataset and create a new variable from the MORESTUFF column. Let's say that the above file is saved as:
c:\temp\MyXData.xml -- then this LIBNAME statement would point to the file:
[pre]
libname xdata xml 'c:\temp\MyXData.xml';
[/pre]

Notice that every "set" of rows is contained within the <MyTable> and </MyTable> tags. So the 2 level name to reference the file would be: xdata.MyTable --which you could also find out by running this PROC CONTENTS step:
[pre]
proc contents data=xdata._all_;
title 'Proc Contents on XML Input File';
run;
[/pre]

Why do you need a PROC CONTENTS step?? It will tell you what SAS thinks the name of the TABLE is and what the column names are -- which is handy if you have a very big XML file or one where the people who made the XML file decided to trim out all the white space and the indents.

Anyway, when you run the PROC CONTENTS, you find that SAS can read 3 columns: NAME, GRP and MORESTUFF. So then you use this DATA step program to read the XML data into a SAS dataset and create a new variable:
[pre]
libname xdata xml 'c:\temp\MyXData.xml';

data work.MySASData;
set xdata.MyTable;
newvar = morestuff * 100;
run;

ods html file='c:\temp\readXML.html' style=sasweb;
proc print data=work.MySASData;
title 'To SAS From XML';
var name grp morestuff newvar;
run;
ods html close;

libname xdata clear;
[/pre]

As for your question about spl (special?) options with the LIBNAME engine -- there are more than 1 or 2 options. So this is the point at which I recommend the documentation and some of these links:
http://support.sas.com/rnd/base/xmlengine/index.html
http://support.sas.com/rnd/base/xmlengine/proccdisc/index.html
http://support.sas.com/rnd/base/xmlengine/sxle913/usersguide913.htm (this page has FAQ and some introductory material)

For example, if you have more complex XML, then you might need to use an XML .map file to tell SAS how to traverse through the XML paths in order to build the data set. Or, again, if you are using CDISC data, then you may want to use PROC CDISC to read your files.

cynthia
N/A
Posts: 0

Re: To create datasets by reading XML Metadata

Thanks, Cynthia.
But I guess I confused U with my question. I meant to have the specifications to create the datasets are in XML. So my SAS system should read the spec in XML and get the corresponding input from database. Convert Xml files to SAS datasets is very normal, I didn't mean that. I hope to get thru PROC CDISC to work out.
Thanks again for your prompt solution.
Ask a Question
Discussion stats
  • 2 replies
  • 337 views
  • 0 likes
  • 2 in conversation