BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yura2301
Quartz | Level 8

Hi ,

I need to create XML document from SAS data step,this XML should have simple structure:

<Table>

    <RECORD id=1>

        <Table1>

               //columns tegs1

        </Table1>

        <Table2>

             //columns tegs2

        </Table2>

    </RECORD>

    <RECORD id=2>

        <Table1>

            //columns tegs1

        </Table1>

        <Table2>

             //columns tegs2

        </Table2>

    </RECORD>

...

</Table>

and it can be easy build by hard coding tags in data step:

filename out '...result.xml';

data _null_;

    set sashelp.class end=last;

    file out;

    if _N_=1 then do;

        put '<Table>';

    end;

    str=catt('    <RECORD id=',_N_,'>');put str;

        put '        <Table1>';

            str=catt('            <Name>',Name,'</Name>'); put str;

            str=catt('            <Sex>',Sex,'</Sex>'); put str;           

        put '        </Table1>';

        put '        <Table2>';

            str=catt('            <Age>',Age,'</Age>'); put str;

            str=catt('            <Height>',Height,'</Height>'); put str;   

            str=catt('            <Weight>',Weight,'</Weight>'); put str;               

        put '        </Table2>';

        put'    </RECORD>';

    if last then do;

        put '</Table>';

    end;

run;

:


So result XML should looks like

<Table>

    <RECORD id=1>

        <Table1>

            <Name>Alfred</Name>

            <Sex>M</Sex>

        </Table1>

        <Table2>

            <Age>14</Age>

            <Height>69</Height>

            <Weight>112.5</Weight>

        </Table2>

    </RECORD>

    <RECORD id=2>

        <Table1>

            <Name>Alice</Name>

            <Sex>F</Sex>

        </Table1>

        <Table2>

            <Age>13</Age>

            <Height>56.5</Height>

            <Weight>84</Weight>

        </Table2>

    </RECORD>

...

</Table>


I briefly read about proc template and XML and I found out that it had some issues with grouping etc., so I have small doubds that I whould be able build next XML via it,so my question is if proc template can build next XML, and if yes, if this functionality will work fuster then data step works.

Real source table will have more then 100 columns and thousandth of records, so speed of work is very important.

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Vince28_Statcan
Quartz | Level 8


Well, since you are stuck with SAS9.1.3, I'm affraid you will have to stick to the data step approach. You should be able to use macros and dictionary tables to at least relieve some of the coding burden of your table with 100 columns.

I'm not knowledge enough of proc template to tell you with certainty that it can be achieved or not but realistically though, I don't see how it could be faster than a data step with a single read on the data. I've spent a bit of time looking at proc template documentation yesterday as I had been meaning to learn the basics of it for a long time but could never find a project that required it.It turns out that even if you were to define a tagset specific to your task in proc template and then use proc print to do the output, proc print still has a lot of additionnal processing going on that does not occur in a data step. There are tons of Events that drive the output layout and since proc template is really meant to be a visual output/reporting/graphing tool much more than a flat file output, any custom template reuses the ODS-driven events that are bound to the procedures with output. Anyway in short, even if you did spend the time to build a custom tagset, it should still be slower than the data step solution you've posted originally as it will necessarily involve a lot of additionnal dynamic processing of ODS-driven procedure events.

Here's the line in "What's new in SAS9.2 XML libname engine" discussing the new XMLMap feature for exporting purpose (it worked for imports with the XML libname engine but export was only implemented for engine XML92)

New XMLMap
Functionality

XMLMap functionality has the following enhancements for the XML92 engine nickname:


  • You can now export an XML document from a SAS data set using the XMLMap that
    was created to import the XML document. The XMLMap tells the XML engine how to
    map the SAS format (variables and observations) into the specific XML document
    structure. See Exporting
    XML Documents Using an XMLMap
    .

View solution in original post

10 REPLIES 10
Cynthia_sas
SAS Super FREQ

Hi:

  Does your XML file have a schema? Theoretically, if your XML has a valid schema, then you could use the schema with an XML Map and the XML Libname Engine to write your XML. But the limitation of the XML LIbname Engine is that it is "one table in, one table out" -- so the default output table won't work for you in this case. You'll have to read the documentation on how to use the SAS XML Libname Engine to write custom XML.

  Depending on the version of SAS you have, the engine name would/might change in the LIBNAME statement below and if you are creating custom XML, then you'd need the XMLMAP= option to specify the map.

cynthia

  Here's an example of the code to create a default table using the SAS XML Libname Engine:

libname myclass xml 'c:\temp\myclass.xml';
 

data myclass.students;

  set sashelp.class;

  newvar = height*weight;

run;
 

libname myclass clear;

Yura2301
Quartz | Level 8

Hi Cynthia,

Thanks for reply, can you please describe what you mean about SAS XML limitation("one table in, one table out)?

In my case I have only one input table, and for each row(that is enclosed in separate tag) first subset of table columns should be enclosed in <table1> </table1> tags, and  rest columns(second subset ) should be enclosd in <table2> </table2> tags.

I just wonted to know some best practice in creating such type of XML from SAS data set, becouse I've created such XML for a few minutes using data step (example in post higher) and it ganarates quickly but I feel that it's not right way to form the XML, it has more-less simple structure and probably should be some simple way to also create it using proc template or smth. else, but I didn't succeed in it after brief read proc template functionality.

Thanks!

Vince28_Statcan
Quartz | Level 8

Not that this will make or break the feasability with xml engine or proc template but is there a specific reason to your xml tag naming? The way you are building RECORD1 RECORD2 etc is absolutely counterintuitive to the entire purpose of XML. It would be far more logic to have something such as <RECORD instance=1>subtags</RECORD> for example. That is, using tag attributes to increment such that both reading and writing does not need to do funky string manipulation over tag names to transfer the data around.

Yura2301
Quartz | Level 8

Hi Vince,

You are absolutely right about XML tag naming, I've just accidentally made mistake when trying quickly write example of XML in Data step.I've fixed example data step and result XML in my initial post.

Each SAS table record should be enclosed in separate tag similarly as you described in your post:

<RECORD instance=1>subtags</RECORD>

<RECORD instance=2>subtags</RECORD>

...

<RECORD instance=3>subtags</RECORD>

And source table columns are divided on two parts, first should be enclosed  in <table1> </table1> tags, and  rest columns(second subset ) should be enclosd in <table2> </table2> tags.

Thanks!

Vince28_Statcan
Quartz | Level 8

Hi Yura,

With this change, I believe it can be achieved through the xml libname engine and an appropriately defined .map file. I'll take a look at it during lunch break or if I am waiting on a program running through the day.

Vince28_Statcan
Quartz | Level 8

Alright I got to give it a go earlier than expected

Linked files are those relevant to your objectives. This was done in SAS9.2 and I don't think the xml engine was fully functional before then (at least not the xml92 engine). It further got improved with SAS9.3 if you need to create multiple files but I don't have it installed so you'd need to look at documentation as otherwise, you would need to create 1 libname statement per output file desired (even if they used the same map) whereas I believe from reading about it a while back the SAS9.3 improvements help alleviate that at least for input, maybe not for output though.

please notice the "thisnamemustmatch" both in the sas code and in the map file.

the <TABLE name="thisnamemustmatch"> tag in the map HAS to be the same as the data set output name that you used with your xml engine as otherwise it causes a client crash. You can name it however you want - it does not appear in the output xml anyway.

As to provide some brief indications to adapt it to your desired solutions on how to write the map:

TABLE-PATH is essentially what dictates how often records are created for extraction and thus conversely impacts how the xml output appears. For example, test the same code with the map only altering

<TABLE-PATH syntax="XPath">/Table/RECORD</TABLE-PATH>

to

<TABLE-PATH syntax="XPath">/Table</TABLE-PATH>

and see the difference in the output xml file.

As well, if you have hierarchical data that has columns "carried over" - think a household survey where you have one tag with all household level data and then many subtags with person-level data - If you were to create a person-level table that keeps all the household level data for each record, you'd have to add retain="YES" attribute to all of the COLUMN tags which use household-level data in the map. However, for export which is what your original request was, this is somewhat moot as per the requirement of 2D tables. You would've needed to merge a household table with a person table into a single table with repeated household data for each person before exporting.

Also, I don't think it is possible to use _N_ directly as a column to export. However, unless the existing sortation is absolutely essential to save, it is fairly typical not even to bother with an instance= attribute or the like as it is not required to import the data back into SAS. In particular, you could even create a new rowid at extraction time even if you did not create an instance= attribute using <INCREMENT-PATH> subtags of the <COLUMN> tag in the map. That is, the xml engine is able to mimic a _N_ column at extraction time and since it reads and writes the data sequentially, the numbers would be the same.

Hope this helps and that my comments aren't all too confusing

Vincent

Yura2301
Quartz | Level 8

Hi Vince,

Thanks for your efforts, I'll try it tommorow.

You mentioned that example that you provided made on SAS 9.2, but I'm actually working on SAS 9.1.3, that probably can cause some issue.

But before I made XML maps for querying XMLs loaded by proc metadata so XPATH and XML maps also works on SAS 9.1.3 but offcource can be some difference.

I'll back when will have some results.

Thanks one more time!

Yura2301
Quartz | Level 8

Hi again,

I've just tried and as I wrote higher there is a problem related with the fact that I'm using older version of SAS(SAS 9.1.3), there is no XML92 engine and error throws to the log file:

ERROR: The XML92 engine cannot be found.

ERROR: Error in the LIBNAME statement.

But in general I reviewed XML map and SAS code that you've made and it actually exactly what I need and expect, but I use SAS 9.1.3 and need some workaround, I tried replace XML92 to XML but another message was throwed to the log:

ERROR: XMLMap= has been specified on the XML Libname assignment. The output produced via this option will change in upcoming

       releases. Correct the XML Libname(remove XMLMap= option) and resubmit. Output generation aborted.


So ,as you mentioned on your post, your solution works but only starting from SAS 9.2.

I can probably initially load simple XML from SAS table, then made XML map similarly as you did and query this initial XML using this XML map with XPATH, but it's not optimal, in this case I'll make in fact two path throw the same data, that can make some performance issue(source table is huge and result XML will have a few hundreds mb.).

Anyway thanks!

Vince28_Statcan
Quartz | Level 8


Well, since you are stuck with SAS9.1.3, I'm affraid you will have to stick to the data step approach. You should be able to use macros and dictionary tables to at least relieve some of the coding burden of your table with 100 columns.

I'm not knowledge enough of proc template to tell you with certainty that it can be achieved or not but realistically though, I don't see how it could be faster than a data step with a single read on the data. I've spent a bit of time looking at proc template documentation yesterday as I had been meaning to learn the basics of it for a long time but could never find a project that required it.It turns out that even if you were to define a tagset specific to your task in proc template and then use proc print to do the output, proc print still has a lot of additionnal processing going on that does not occur in a data step. There are tons of Events that drive the output layout and since proc template is really meant to be a visual output/reporting/graphing tool much more than a flat file output, any custom template reuses the ODS-driven events that are bound to the procedures with output. Anyway in short, even if you did spend the time to build a custom tagset, it should still be slower than the data step solution you've posted originally as it will necessarily involve a lot of additionnal dynamic processing of ODS-driven procedure events.

Here's the line in "What's new in SAS9.2 XML libname engine" discussing the new XMLMap feature for exporting purpose (it worked for imports with the XML libname engine but export was only implemented for engine XML92)

New XMLMap
Functionality

XMLMap functionality has the following enhancements for the XML92 engine nickname:


  • You can now export an XML document from a SAS data set using the XMLMap that
    was created to import the XML document. The XMLMap tells the XML engine how to
    map the SAS format (variables and observations) into the specific XML document
    structure. See Exporting
    XML Documents Using an XMLMap
    .
Yura2301
Quartz | Level 8

Hi Vince,

Actually when I read documentations about proc template I also had similar feeling about it, but I've just spent hour on briefly overview proc template possibilities so I couldn't be sure that it can't easily handle my task, that's why I actually asked for help here.

So, if not only me considered that for this task data step is probably ok solution for SAS 9.1.3 - I'll choose this variant:)

Thanks one more time for quick and clever feedback's!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1536 views
  • 3 likes
  • 3 in conversation