Hi Guys,
I am stuck (again) and could do with some help please.
I have a character variable which is taken from an XML file. Within this variable is a name; I don't know what the name is, or where it is located within the variable. I need to select the name and then place it into a new variable.
This is an edited example of the variable (to remove any proprietary data):
...2001/XMLSchema"><Test><Overall>98</Overall><Score>98</Score><Source>Data</Source><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/><br/><br/></Data><Cause><a href="http://...
The name is always preceded by <Data> (the only instance in the variable) and always ends at the "<br/>" marker (there are more instances of this marker later in the field).
I have been using the code below. It can find the name, but I can't get it to stop selecting after the name, so I end up with all the XML stuff after the name.
Can anyone point me in the right direction?
thanks in advance,
paul
data want;
set have;
iname = index(var,'<Data>');
name = substr(var, iname + 6, '<br/>');
drop iname;
run;
A quick example:
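data test;
  x1 = '<Source>Data</Source><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/><br/><br/></Data>';
  begin = index(x1,'<Data>') + 6;       /* first character after <Data> */
  end = index(x1,'<br/>');              /* position of the first <br/> */
  x2 = substr(x1, begin, end - begin);  /* text between the two markers */
run;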
If the reference string for the end ('<br/>') appears anywhere before the reference string for the beginning ('<Data>'), you will have to substr x1 first so that it actually starts at "begin".
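For the case where a '<br/>' turns up before '<Data>', here is a minimal sketch of one way to handle it. The sample string, the variable len, and the lengths 200 and 50 are made up for illustration; adjust them to your data.

```sas
data test;
  length x1 $200 x2 $50;   /* assumed lengths, for illustration only */
  /* in this made-up string a <br/> deliberately appears before <Data> */
  x1 = '<Note>a<br/>b</Note><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/></Data>';
  begin = index(x1, '<Data>') + 6;   /* first character after <Data> */
  /* look for <br/> only in the part of x1 that starts at "begin" */
  len = index(substr(x1, begin), '<br/>') - 1;
  /* begin > 6 means <Data> was found; len > 0 means a non-empty name */
  if begin > 6 and len > 0 then x2 = substr(x1, begin, len);
run;
```

Because the search for '<br/>' starts at "begin", any earlier '<br/>' is ignored. If either marker is missing, x2 simply stays blank.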
The third argument to substr() (the length) needs to be a number, so you must calculate that by subtracting the position of the first character of the name from the position of '<br/>' in your string.
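Applied to the original step, that could look roughly like the sketch below. The one-row have data set, the variable ibr, and the lengths 200 and 50 are assumptions for illustration, and it assumes the first '<br/>' in var comes after '<Data>'.

```sas
/* hypothetical one-row sample standing in for the real "have" data set */
data have;
  length var $200;
  var = '<Source>Data</Source><Data>TRUMP, DONALD<br/>19xx-xx-xx<br/></Data>';
run;

data want;
  set have;
  length name $50;                    /* assumed maximum name length */
  iname = index(var, '<Data>') + 6;   /* position of the first character of the name */
  ibr = index(var, '<br/>');          /* position of the first <br/> */
  /* the third argument of substr() is a number: the length of the name */
  if iname > 6 and ibr > iname then name = substr(var, iname, ibr - iname);
  drop iname ibr;
run;
```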
Hi Kurt,
thanks for your answer. Unfortunately I don't know the name, so I can't work out the length.
I tried using scan to select individual words from the name, but I also don't know the naming format (first name, last name or last name, middle name, first name, second name, or any combination), which then gives me problems when I try to re-combine the words...
paul
Hi Kurt,
I have just worked out what you mean!
I have the start position of the name, so if I work out the position of the <br/> and then substr from the start to the <br/>, I will be left with the name?
I will give that a try,
thanks,
paul
Hi Kurt,
I have got it going.
If we ever meet up, a few glasses of your favourite tipple are on me!
all the best,
paul