<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extract Hidden Data in XML file in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401543#M97439</link>
    <description>&lt;P&gt;XML is just text. Read the file in as text, search for the start/end tags -&amp;gt; &lt;STRONG&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;&amp;lt;creditBureau&amp;gt;&amp;nbsp;&amp;lt;/creditBureau&amp;gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;and keep only the content between the two.&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;I think that's going to require reading the file with an INFILE statement and the _INFILE_ automatic variable instead of the XML mapper, though.&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;I would have thought the XML mapper would pull that out though, since it's not really hidden, it's between tags.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 05 Oct 2017 22:49:50 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2017-10-05T22:49:50Z</dc:date>
    <item>
      <title>Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401537#M97435</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a lot of XML files that have "hidden" credit bureau data that I want to extract.&amp;nbsp; Each XML file represents a customer of ours and contains info like&amp;nbsp;address info/employer info/deal structure info/etc. The credit bureau data is "hidden" in one of the attributes of one of the root elements.&amp;nbsp; This data is extremely useful to my company and I want to be able to extract it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried importing the entire XML file (using XML mapper) into SAS, but the credit bureau data gets truncated because it's much longer than the maximum number of characters allowed.&amp;nbsp; So, I can't even read the data into SAS to try to parse that field out later on.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can someone give me some ideas on how to solve this issue?&amp;nbsp; Is there a way to read this field directly into SAS without the other surrounding data?&amp;nbsp; Not sure how to go about this.&amp;nbsp; Below is an example of the data I'm working with.&amp;nbsp; The raw_xml data is the "hidden" data (in bold), often over 100,000 characters.&amp;nbsp; I removed 99% of the data but just wanted to give you an idea of the structure that I'm working with.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this makes sense.&amp;nbsp; Appreciate&amp;nbsp;any help/advice.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I use Base SAS and xmlv2 to read in the data.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV class="collapsible"&gt;&lt;DIV class="expanded"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;DealDetails&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class="collapsible-content"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-name"&gt;cash_down&lt;/SPAN&gt;="&lt;SPAN 
class="html-attribute-value"&gt;1000.00&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;net_trade&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;0.00&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-name"&gt;updated_by&lt;/SPAN&gt;=""&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;bundle_id&lt;/SPAN&gt;=""&lt;/SPAN&gt;/&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;/DealDetails&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="collapsible"&gt;&lt;DIV class="expanded"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;Customers&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class="collapsible-content"&gt;&lt;DIV class="collapsible"&gt;&lt;DIV class="expanded"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-name"&gt;age&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;28&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;create_date&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;2017-09-30T11:04:47.0000000&lt;/SPAN&gt;"&lt;/SPAN&gt;&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class="collapsible-content"&gt;&lt;DIV class="collapsible"&gt;&lt;DIV class="expanded"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;Reports&amp;gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class="collapsible-content"&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&amp;lt;Report&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;active&lt;/SPAN&gt;="&lt;SPAN 
class="html-attribute-value"&gt;True&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;valid&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;True&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;bureau&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;TU&lt;/SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;STRONG&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-name"&gt;raw_xml&lt;/SPAN&gt;="&lt;SPAN class="html-attribute-value"&gt;&amp;lt;Response Score="700"&amp;gt;&amp;lt;creditBureau&amp;gt;&amp;lt;document&amp;gt;response&amp;lt;/document&amp;gt;&amp;lt;version&amp;gt;2.18&amp;lt;/version&amp;gt;&amp;lt;transactionControl&amp;gt;&amp;lt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;STRONG&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;STRONG&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;STRONG&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;STRONG&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;STRONG&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN 
class="html-attribute-value"&gt;&amp;lt;/printImage&amp;gt;&amp;lt;/product&amp;gt;&amp;lt;/creditBureau&amp;gt;&amp;lt;/Response&amp;gt;"&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;create_date&lt;/SPAN&gt;=""&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;update_date&lt;/SPAN&gt;=""&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;updated_by&lt;/SPAN&gt;="System [No User Available]"&lt;SPAN class="html-attribute-name"&gt;deal_detail_id&lt;/SPAN&gt;=""&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="html-attribute-name"&gt;bundle_id&lt;/SPAN&gt;=""&lt;SPAN&gt;/&amp;gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV class="line"&gt;&lt;SPAN class="html-tag"&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;&lt;SPAN&gt;&amp;lt;/Reports&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 05 Oct 2017 22:26:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401537#M97435</guid>
      <dc:creator>wwelch</dc:creator>
      <dc:date>2017-10-05T22:26:37Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401543#M97439</link>
      <description>&lt;P&gt;XML is just text. Read the file in as text, search for the start/end tags -&amp;gt; &lt;STRONG&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;&amp;lt;creditBureau&amp;gt;&amp;nbsp;&amp;lt;/creditBureau&amp;gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;and keep only the content between the two.&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;I think that's going to require reading the file with an INFILE statement and the _INFILE_ automatic variable instead of the XML mapper, though.&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="html-attribute"&gt;&lt;SPAN class="html-attribute-value"&gt;I would have thought the XML mapper would pull that out though, since it's not really hidden, it's between tags.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2017 22:49:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401543#M97439</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-10-05T22:49:50Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401588#M97461</link>
      <description>You need to extract the value of raw_xml, an attribute of &amp;lt;Report&amp;gt;, save it to a new file, and use the XML mapper on that file. To extract the text, use the method &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt; described.</description>
      <pubDate>Fri, 06 Oct 2017 03:02:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401588#M97461</guid>
      <dc:creator>error_prone</dc:creator>
      <dc:date>2017-10-06T03:02:07Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401924#M97544</link>
      <description>So, I'm a little confused. I can't only read in the raw_xml attribute between the Reports tags? I have 15,000 xml files. I can't manually do this.&lt;BR /&gt;</description>
      <pubDate>Fri, 06 Oct 2017 20:29:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401924#M97544</guid>
      <dc:creator>wwelch</dc:creator>
      <dc:date>2017-10-06T20:29:39Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401945#M97549</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/59372"&gt;@wwelch&lt;/a&gt; wrote:&lt;BR /&gt;I can't only read in the raw_xml attribute between the Reports tags?&amp;nbsp;&lt;BR /&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Why not?&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&amp;nbsp;I have 15,000 xml files. I can't manually do this.&lt;BR /&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;BR /&gt;That's why you write a program. Once you have it working for one file, you can figure out how to do it for all 15,000. I don't recall seeing a 'manual' suggestion anywhere in the answers.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Oct 2017 20:53:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401945#M97549</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-10-06T20:53:35Z</dc:date>
    </item>
    <item>
      <title>Re: Extract Hidden Data in XML file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401991#M97579</link>
      <description>Ok thanks Reeza!!!&lt;BR /&gt;</description>
      <pubDate>Fri, 06 Oct 2017 22:33:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extract-Hidden-Data-in-XML-file/m-p/401991#M97579</guid>
      <dc:creator>wwelch</dc:creator>
      <dc:date>2017-10-06T22:33:39Z</dc:date>
    </item>
  </channel>
</rss>

