Hi Guys,
I am stuck (again) and could do with some help please.
I have a character variable which is taken from an XML file. Within this variable is a name; I don't know what the name is, or where it is located within the variable. I need to select the name and then place it into a new variable.
This is an edited example of the variable (to remove any proprietary data):
...2001/XMLSchema"><Test><Overall>98</Overall><Score>98</Score><Source>Data</Source><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/><br/><br/></Data><Cause><a href="http://...
The name is always preceded by <Data> (the only instance in the variable) and is always ended by the "<br/>" marker (there are more instances of this marker later in the field).
I have been using the code below. It can select the name, but I can't get it to stop selecting after the name, so I end up with all the XML after the name.
Can anyone point me in the right direction?
thanks in advance,
paul
data want;
set have;
iname = index(var,'<Data>');
name = substr(var, iname + 6, '<br/>');
drop iname;
run;
A quick example:
data test;
x1 = '<Source>Data</Source><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/><br/><br/></Data>';
begin = index(x1,'<Data>')+6;
end = index(x1,'<br/>');
x2 = substr(x1,begin,end - begin);
run;
If the reference string for the end (<br/>) appears anywhere before the reference string for the beginning (<Data>), you will have to substr x1 first so that it actually starts at "begin".
The third argument to substr() (the length) needs to be a number, so you must calculate that by subtracting the position of the first character of the name from the position of '<br/>' in your string.
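Applied to the data step from the question, the same idea might look like the sketch below. It keeps the dataset and variable names from the original post (have, want, var, name). The helper variables name_start and name_end, the length of 200 for name, and the guards against a missing tag are assumptions added for illustration. Instead of trimming the string first, it passes a start position to find() so that any <br/> located before <Data> is skipped.

```sas
data want;
  set have;
  length name $200;                       /* assumed maximum length of a name */
  /* position of the opening tag; 0 if the tag is not present */
  name_start = index(var, '<Data>');
  if name_start > 0 then do;
    name_start = name_start + 6;          /* first character after '<Data>' */
    /* search for '<br/>' only from name_start onwards */
    name_end = find(var, '<br/>', name_start);
    /* extract only when a closing marker follows a non-empty name */
    if name_end > name_start then
      name = substr(var, name_start, name_end - name_start);
  end;
  drop name_start name_end;
run;
```

Rows where either marker is missing are left with a blank name rather than triggering an invalid-argument note from substr().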
Hi Kurt,
Thanks for your answer. Unfortunately I don't know the name, so I can't work out the length.
I tried using scan to select individual words from the name, but I also don't know the naming format (first name, last name or last name, middle name, first name, second name, or any combination), which then gives me problems when I try to re-combine the words...
paul
Hi Kurt,
I have just worked out what you mean!
I have the start position of the name, so if I work out the position of the & and then substr from the start to the &, I will be left with the name?
I will give that a try,
thanks,
paul
Hi Kurt,
I have got it going.
If we ever meet up, a few glasses of your favourite tipple are on me!
all the best,
paul
