pandhandj
Obsidian | Level 7

Hi Guys,

 

I am stuck (again) and could do with some help please.

 

I have a character variable which is taken from an XML file.  Within this variableis a name, i dont know what the is, or where it is located within th variable.  I nned to select the name and then place it into a new variable.

 

This is an edited example of the variable (to remove any proprietary data):

 

...2001/XMLSchema"><Test><Overall>98</Overall><Score>98</Score><Source>Data</Source><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;</Data><Cause>&lt;a href="http://...

 

The name is always preceded by <Data> (the only instance in the variable) and always ends at the "&lt;br/&gt" marker (there are more instances of this marker later in the field).

 

I have been using the code below. It can select the name, but I can't get it to stop selecting after the name, so I end up with all the XML stuff after the name.

 

Can anyone point me in the right direction?

 

thanks in advance, 

 

paul

 

data want;
set have;
iname = index(var,'<Data>');
name = substr(var, iname + 6, '&lt;br/&gt');
drop iname;
run;

 

Accepted Solutions
Kurt_Bremser
Super User

A quick example:

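data test;
  x1 = '<Source>Data</Source><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;</Data>';
  begin = index(x1,'<Data>')+6;       /* first character after <Data> */
  end = index(x1,'&lt;br/&gt;');      /* position of the first end marker */
  x2 = substr(x1,begin,end - begin);  /* text between the two positions */
run;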

If the reference string for the end (&lt;br/&gt;) appears anywhere before the reference string for the beginning (<Data>), you will have to substr x1 first so that it actually starts at "begin".
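
One possible reading of that last suggestion, as a minimal sketch rather than a definitive implementation: cut the string at "begin" first, then look for the end marker in what is left. The sample value below is made up so that the marker also occurs before <Data>; the variable names rest and len are illustrative and not from the original post.

```sas
data test;
  /* made-up value: the end marker also occurs BEFORE <Data> */
  x1 = '<Note>a&lt;br/&gt;b</Note><Data>TRUMP, DONALD&lt;br/&gt;19xx-xx-xx&lt;br/&gt;</Data>';
  begin = index(x1,'<Data>') + 6;            /* first character after <Data> */
  if begin > 6 then do;                      /* index() returns 0 if <Data> is missing */
    rest = substr(x1, begin);                /* string now starts at the name */
    len = index(rest,'&lt;br/&gt;') - 1;     /* characters before the first marker */
    if len > 0 then x2 = substr(rest, 1, len);
  end;
  drop rest len;
run;
```

With this value, x2 should come out as TRUMP, DONALD, because the earlier marker is no longer part of the string being searched.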


Kurt_Bremser
Super User

The third argument to substr() (the length) needs to be a number, so you must calculate that by subtracting the position of the first character of the name from the position of '&lt;br/&gt' in your string.
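
A hedged sketch of that calculation applied to the data step from the question, assuming the dataset have and the variable var as posted. The names name_start and name_end and the $200 length are assumptions for illustration. It uses find() with a start position instead of index(), a swapped-in variant that only looks for the marker at or after the name.

```sas
data want;
  set have;
  length name $200;                                /* assumed maximum name length */
  name_start = index(var, '<Data>') + 6;           /* first character of the name */
  name_end = find(var, '&lt;br/&gt', name_start);  /* first marker at or after the name */
  /* only extract when both markers were found */
  if name_start > 6 and name_end > name_start then
    name = substr(var, name_start, name_end - name_start);
  drop name_start name_end;
run;
```

The marker is kept in single quotes on purpose: in double quotes SAS would treat &lt and &gt as macro variable references and warn that they cannot be resolved.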

pandhandj
Obsidian | Level 7

Hi Kurt,

 

Thanks for your answer. Unfortunately I don't know the name, so I can't work out the length.

 

I tried using scan to select individual words from the name, but I also don't know the naming format (first name, last name or last name, middle name, first name, second name, or any combination), which then gives me problems when I try to recombine the words...

 

paul

 

pandhandj
Obsidian | Level 7

Hi Kurt,

 

I have just worked out what you mean!

 

I have the start position of the name, so if I work out the position of the & and then substr from the start position up to the &, I will be left with the name?

 

I will give that a try,

 

thanks,

 

paul

pandhandj
Obsidian | Level 7

Hi Kurt,

 

I have got it going :)

 

If we ever meet up, a few glasses of your favourite tipple are on me!

 

all the best,

 

paul
