BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
veerand
Calcite | Level 5

I work with SAS Data Integration Studio and need extract multiple addresses from text field which contains xml. For security reasons my SAS enviroment is quite closed, I cant use PIPE or FINDNTH(). I dont have access to original XML file aswell.
So far I managed with simple functions as SUBSTR and INDEX to shorten original text field and managed to get first address from textfield but as there more addresses I cant find solution for extracting second or sixth or n-th addresses. 
My table looks like and count is how many strings <movementOfGoods> are in text field as how many rows I need for ID. 

veerand_0-1592554119300.png

XML field contains:
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/001</name>
<streetAndNumber>Tähe tn 106a</streetAndNumber>
<postcode>50107</postcode>
<city>Tartu</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/007</name>
<streetAndNumber>Jaama pst 26</streetAndNumber>
<postcode>44311</postcode>
<city>Rakvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/008</name>
<streetAndNumber>Linnase tn 6</streetAndNumber>
<postcode>44208</postcode>
<city>Rakvere vald, Aluvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/010</name>
<streetAndNumber>Kooli tn 10</streetAndNumber>
<postcode>76102</postcode>
<city>Rummu, Harjumaa</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
</temporaryStorageCustomsDecisionInformation>
</decisionRelatedInformation>
<customsDecisionStatus>01</customsDecisionStatus>
<customsDecisionStatusDate>2019-12-04</customsDecisionStatusDate>
<publicationAgreement>
<agreementForPublication>1</agreementForPublication>
</publicationAgreement>
</customsDecisionAuthorisationData>

 

At the end I need table which look like

QUALIFICATION_IDENTIFICATION streetAndNumber,city as address
EETSTEE1000EE-2019-INR53390 Tähe tn 106a, Tartu
EETSTEE1000EE-2019-INR53390 Jaama pst 26, Rakvere
EETSTEE1000EE-2019-INR53390 Linnase tn 6, Rakvere vald, Aluvere
EETSTEE1000EE-2019-INR53390 Kooli tn 10, Rummu, Harjumaa

 

Please help me out

Andres Veer

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;

View solution in original post

4 REPLIES 4
Jagadishkatam
Amethyst | Level 16

Please try the below code , I assumed that the xml data is in a txt file

 

filename sam '~/sample.txt';

proc import datafile=sam out=sample dbms=tab replace;
getnames=no;
run;

data want;
set sample;
retain streetAndNumber;
if prxmatch('m/streetAndNumber/oi',var1)>0 then streetAndNumber=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
if prxmatch('m/city/oi',var1)>0 then cityasaddress=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
QUALIFICATION_IDENTIFICATION='EETSTEE1000EE-2019-INR53390';
if cityasaddress ne '';
drop var1;
run;
Thanks,
Jag
veerand
Calcite | Level 5
tx I'll try, but xml data is in sas table field not as txt file
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;
veerand
Calcite | Level 5
great it worked, thank you very much s_lassen

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 747 views
  • 0 likes
  • 3 in conversation