I want to be able to see whether the types of data we are sending change over time.
I have a series of records with a range of namespaces that are sent.
Have:
Record  XMLData
1       <keyData siteId="1" id="" lastName="Caser" />
2       <keyData siteId="1" email="do_08@awef.com" />
...
Want:
Record  Type
1       ID
1       lastName
2       ID
2       email
Sometimes I'll have ID, sometimes I'll have ID, Email, and SSN -- it varies by the record.
Does anyone have experience trying to shred out XML using SAS?
The XML libname doesn't work for you?
Or XML Mapper?
http://support.sas.com/documentation/cdl/en/engxml/64990/HTML/default/viewer.htm#titlepage.htm
It would help if you posted more data or an example XML file.
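data have;
  input Record XMLData $50.;
cards;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
;
run;
data want;
  set have;
  /* match each run of word characters that is immediately followed by =" */
  pid=prxparse('/\w+(?==")/o');
  s=1; e=length(xmldata);
  call prxnext(pid,s,e,xmldata,p,l);
  /* one output row per attribute name found */
  do while(p gt 0);
    type=substr(xmldata,p,l); output;
    call prxnext(pid,s,e,xmldata,p,l);
  end;
  drop pid s e p l;
run;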
In your example each line is a set of space-separated words. The last word of each is "/>" and you apparently want a part of the next-to-last word, namely the part to the left of the = sign. You also want an output record with type='ID' for each incoming record number, regardless of content.
This code is not "xml-aware" in any way, but it does the particular task as you have described it:
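data want (keep=record type);
  /* the & modifier lets xmltext contain single embedded blanks */
  input record xmltext &$80.;
  length type $20;
  /* every record gets an ID row, regardless of content */
  type='ID'; output;
  /* the next-to-last word is the final attribute="value" pair */
  next_to_last_word=scan(xmltext,-2,' ');
  /* keep the attribute name, i.e. the part left of the = sign */
  type=scan(next_to_last_word,1,'=');
  output;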
datalines;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
run;
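If a record can carry several attributes (ID, Email, SSN, and so on), the same SCAN idea can be extended to loop over every word between the opening tag and the closing "/>". This is only a rough sketch, not xml-aware either, and it assumes that attribute values never contain blanks. The names want_all, word, and i are made up for illustration. Unlike the step above, it does not force an ID row; it lists whatever attributes are actually present, including siteId. A PROC FREQ afterwards gives a quick count of how often each type appears.

```sas
data want_all (keep=record type);
  input record xmltext & $80.;
  length type $20 word $80;
  /* word 1 is <keyData and the last word is />,            */
  /* so words 2 through n-1 are the attribute="value" pairs */
  /* (assumption: no blanks inside attribute values)        */
  do i = 2 to countw(xmltext, ' ') - 1;
    word = scan(xmltext, i, ' ');
    type = scan(word, 1, '=');   /* attribute name, left of the = sign */
    output;
  end;
datalines;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
;
run;

/* count how often each attribute name occurs */
proc freq data=want_all;
  tables type;
run;
```

If the real data has a send date on each record, adding that variable to the TABLES statement (for example as a crosstab with type) would be one way to watch the mix change over time.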