Find and select unknown name in XML based character field

Contributor
Posts: 28

Hi Guys,

 

I am stuck (again) and could do with some help please.

 

I have a character variable which is taken from an XML file. Within this variable is a name; I don't know what the name is, or where it is located within the variable. I need to select the name and then place it into a new variable.

 

This is an edited example of the variable (to remove any proprietary data):

 

...2001/XMLSchema"><Test><Overall>98</Overall><Score>98</Score><Source>Data</Source><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;</Data><Cause>&lt;a href="http://...

 

The name is always preceded by <Data> (the only instance in the variable) and is always ended by the "&lt;br/&gt" marker (there are more instances of this marker later in the field).

 

I have been using the code below. It can select the name, but I can't get it to stop selecting after the name, so I end up with all the XML stuff after the name.

 

Can anyone point me in the right direction?

 

Thanks in advance,

 

paul

 

data want;
set have;
iname = index(var,'<Data>');
name = substr(var, iname + 6, '&lt;br/&gt');
drop iname;
run;

 


Accepted Solutions
Solution
‎11-16-2017 11:22 AM
Super User
Posts: 9,886

Re: Find and select unknown name in XML based character field

[ Edited ]
Posted in reply to pandhandj

A quick example:

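data test;
/* sample value, shortened from the question */
x1 = '<Source>Data</Source><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;</Data>';
/* position of the first character after <Data> (the tag is 6 characters long) */
begin = index(x1,'<Data>')+6;
/* position of the first end marker */
end = index(x1,'&lt;br/&gt;');
/* the third argument of substr() is the length of the name */
x2 = substr(x1,begin,end - begin);
run;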

If the reference string for the end (&lt;br/&gt;) appears anywhere before the reference string for the beginning (<Data>), you will have to substr x1 first so that it actually starts at "begin".
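
A minimal sketch of that case, assuming a sample value in which the end marker also occurs ahead of <Data>. The variables rest and len and the declared lengths are illustrative choices, not part of the original example:

```sas
data test;
  /* lengths are assumptions for this sample; size them to the real data */
  length x1 rest $200 x2 $50;
  /* hypothetical value: the end marker also appears before <Data> */
  x1 = '&lt;br/&gt;<Source>Data</Source><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;</Data>';
  /* first character after <Data> */
  begin = index(x1, '<Data>') + 6;
  /* drop everything before the name, so earlier markers no longer count */
  rest = substr(x1, begin);
  /* the name ends one character before the first marker in the remainder */
  len = index(rest, '&lt;br/&gt;') - 1;
  if len > 0 then x2 = substr(rest, 1, len);
run;
```

With this sample value x2 holds TRUMP, DONALD. The FIND function also accepts an optional start position, which could be used in place of the intermediate variable.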

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code


All Replies
Super User
Posts: 9,886

Re: Find and select unknown name in XML based character field

Posted in reply to pandhandj

The third argument to substr() (the length) needs to be a number, so you must calculate that by subtracting the position of the first character of the name from the position of '&lt;br/&gt' in your string.
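
Applied to the code from the question, that could look roughly like the sketch below. The data set and variable names (have, want, var, name) come from the question; tag_pos, name_start, name_len and the length of name are assumptions made for illustration:

```sas
data want;
  set have;
  /* assumed maximum length for the name; adjust to the data */
  length name $100;
  /* position of <Data>, 0 if the tag is missing */
  tag_pos = index(var, '<Data>');
  if tag_pos > 0 then do;
    /* the name starts right after the 6-character tag */
    name_start = tag_pos + 6;
    /* numeric length: position of the first end marker minus the start */
    name_len = index(var, '&lt;br/&gt') - name_start;
    if name_len > 0 then name = substr(var, name_start, name_len);
  end;
  drop tag_pos name_start name_len;
run;
```

The two checks leave name blank when either marker is missing or when the first end marker sits before <Data>, instead of passing an invalid length to substr().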

Contributor
Posts: 28

Re: Find and select unknown name in XML based character field

Posted in reply to KurtBremser

Hi Kurt,

 

Thanks for your answer. Unfortunately I don't know the name, so I can't work out the length.

 

I tried using scan to select individual words from the name, but I also don't know the naming format (first name, last name; or last name, middle name, first name, second name; or any combination), which then gives me problems when I try to recombine the words...

 

paul

 

Contributor
Posts: 28

Re: Find and select unknown name in XML based character field

Posted in reply to pandhandj

Hi Kurt,

 

I have just worked out what you mean!

 

I have the start position of the name, so if I work out the position of the & and then substr from the start to the &, I will be left with the name?

 

I will give that a try,

 

thanks,

 

paul

Contributor
Posts: 28

Re: Find and select unknown name in XML based character field

Posted in reply to KurtBremser

Hi Kurt,

 

I have got it going :)

 

If we ever meet up, a few glasses of your favourite tipple are on me!

 

all the best,

 

paul

☑ This topic is solved.