Pulling all Namespaces from XML

Contributor JS
Contributor
Posts: 38

I want to be able to see whether the types of data we are sending change over time.

 

I have a series of records with a range of namespaces that are sent.

 

Have

Record  XMLData
1       <keyData siteId="1" id="" lastName="Caser" />
2       <keyData siteId="1" email="do_08@awef.com" />
...

 

Want

Record  Type
1       ID
1       lastName
2       ID
2       email

 

Sometimes I'll have ID, sometimes I'll have ID, Email, and SSN -- it varies by the record.

 

Does anyone have experience trying to shred out XML using SAS?

Super User
Posts: 19,770

Re: Pulling all Namespaces from XML

Super User
Posts: 10,018

Re: Pulling all Namespaces from XML

It would be better if you posted some more data or an example XML file.


Super User
Posts: 10,018

Re: Pulling all Namespaces from XML



data have;
input Record XMLData $50.;
cards;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
;
run;

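data want;
 set have;
 /* Match each attribute name: word characters immediately followed by =" */
 pid=prxparse('/\w+(?==")/o');
 s=1;e=length(xmldata);
 call prxnext(pid,s,e,xmldata,p,l);
 /* p is 0 once no further match is found; one row is output per attribute, siteId included */
 do while(p gt 0);
  type=substr(xmldata,p,l);output;
  call prxnext(pid,s,e,xmldata,p,l);
 end;
 drop pid s e p l;
run;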


Trusted Advisor
Posts: 1,018

Re: Pulling all Namespaces from XML

In your example each line is a set of space-separated words.  The last word of each is "/>" and you apparently want a part of the next-to-last word, namely the part to the left of the = sign.  You also want an output record with type='ID' for each incoming record number, regardless of content.

 

This code is not "xml-aware" in any way, but it does the particular task as you have described it:

 

data want (keep=record type);
  input record xmltext &$80.;
  length type $20;

  type='ID'; output;

  next_to_last_word=scan(xmltext,-2,' ');
  type=scan(next_to_last_word,1,'=');
  output;
datalines;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
;
run;
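
Since the original question says the set of attributes varies by record (sometimes ID, Email, and SSN), here is a rough sketch of the same SCAN idea extended to loop over every word instead of only the next-to-last one. It is still not XML-aware, and it assumes that attribute values never contain spaces and that each record fits on one line. The names want_all, word, and i are made up for illustration. Note that it also emits siteId, which you may want to filter out.

```sas
/* Sketch only: one output row per attribute name on each record. */
data want_all (keep=record type);
  input record xmltext &$80.;
  length word $80 type $20;

  /* Word 1 is the element name (<keyData) and the last word is "/>", */
  /* so only the words in between are treated as attributes.          */
  do i = 2 to countw(xmltext, ' ') - 1;
    word = scan(xmltext, i, ' ');
    type = scan(word, 1, '=');   /* text to the left of the = sign */
    output;
  end;
datalines;
1 <keyData siteId="1" id="" lastName="Caser" />
2 <keyData siteId="1" email="do_08@awef.com" />
;
run;
```

If you do not want siteId in the result, an extra condition such as "if type ne 'siteId' then output;" in place of the plain output statement should do it.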

 

 
