BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,

I have a dataset that contains XML data as one of its columns. I would like to extract a data element from it and create a new dataset.

DatasetA
-------------

COL1--COL2
1 -------1234..
2 -------5678..
3 -------9111..
....

I want to extract the element D data for each row...

Using the above example, the result should have the following data

DatasetB
-------------

COL1----COL2
1------------3
2------------7
3------------1
...

Is there any XML statements that can be used in SAS to do this? If not, is there any way that SAS dataset be modified to obtain the above result?

Any suggestions welcome.

Thanks,
Nisha
4 REPLIES 4
Cynthia_sas
Diamond | Level 26
Hi:
You will want to read about the SAS XML Libname Engine. If you have very "regular" or rectangular XML, then you can use the simple form of the XML Libname engine. For example, if you have this file stored in:
c:\temp\mydata.xml

[pre]
<?xml version="1.0" ?>
<TABLE>
<MYDATA>
<NAME> alan </NAME>
<AGE> 15 </AGE>
<ID>10</ID>
<A>111</A>
<B>221</B>
<C>331</C>
<D>441</D>
</MYDATA>
<MYDATA>
<NAME> bob </NAME>
<AGE> 16 </AGE>
<ID>20</ID>
<A>112</A>
<B>222</B>
<C>332</C>
<D>442</D>
</MYDATA>
<MYDATA>
<NAME> carla </NAME>
<AGE> 15 </AGE>
<ID>30</ID>
<A>113</A>
<B>223</B>
<C>333</C>
<D>443</D>
</MYDATA>
<MYDATA>
<NAME> dave </NAME>
<AGE> 14 </AGE>
<ID>40</ID>
<A>114</A>
<B>224</B>
<C>334</C>
<D>444</D>
</MYDATA>
</TABLE>
[/pre]

Then this code will read the MYDATA.XML file using the SAS XML Libname Engine (SXLE):
[pre]
libname xmlin xml 'c:\temp\mydata.xml';

data allvars;
set xmlin.mydata;
run;

data somevars(keep=name id a d);
set xmlin.mydata;
run;

libname xmlin clear;

ods listing;
proc print data=work.allvars;
title 'From SXLE -- get all variables';
run;

proc print data=work.somevars;
title 'From SXLE -- get only some variables';
run;
[/pre]

For more complicated XML data or for XML data files with more hierarchical levels of tags, In order to import these kinds of XML file into SAS, you have 2 choices:
1) transform the XML file into a more standard form XML, as needed by SAS, using a 3rd party XML tool, such as XMLSpy, or an XSL transform, as described here:
http://support.sas.com/rnd/base/xmlengine/sxle82/prod82/index.html
or
2) build an XMLMAP file that tells the SAS XML Libname Engine how to traverse the hierarchical XML tags in order to extract the data for a SAS dataset.

There is a tool called XMLMapper that helps build the Map syntax. XMLMapper (the tool) was first available in SAS 9 and there are some good screenshots and a usage scenario in these papers:
http://www2.sas.com/proceedings/forum2008/099-2008.pdf
http://www2.sas.com/proceedings/sugi29/119-29.pdf

You could also contact Tech Support for more help in this regard.

cynthia
deleted_user
Not applicable
Thanks Cynthia, this is exactly what i was looking for.

But now i receive the following error when i run DATA..SET statements.
"ERROR: File XMLIN.SAMPLE.DATA does not exist."

My XML file is "SAMPLE.xml".

-Nisha
Cynthia_sas
Diamond | Level 26
Hi:
In my example, I did a misleading thing. Sorry about that. Notice that my data file is called "MYDATA.XML" and that -inside- the file, under the <TABLE> tag, my tag that delimits every row is the <MYDATA> tag.

If this had been the <WOMBAT> tag, then my SET statement would have been:
[pre]
set xmlin.wombat;
[/pre]

and -not- XMLIN.MYDATA.

SAS expects the data to be VERY regular. There must be a clear delimiter tag around every row that you want to turn into an observation. I suspect your ERROR message is because of a problem with either the LIBNAME statement or the SET statement.

I suggest you open a track with Tech Support on this issue, as they can look at your XML data file and your SAS code and help you with a resolution.

The problem with posting XML onto the forum is the the < and the > symbols around every tag sometimes cause the posting editor program to truncate a posting.

Again, if you have nonstandard XML -- in other words, if your XML is not regular, like my XML example, then you have to use the XMLMapper to "map" the XML tags to SAS columns.

cynthia
deleted_user
Not applicable
Thanks. It works now without errors. But I have hierarchical XML data, I am going to try now with XML mapper.

-Nisha

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1305 views
  • 0 likes
  • 2 in conversation