BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
veerand
Calcite | Level 5

I work with SAS Data Integration Studio and need extract multiple addresses from text field which contains xml. For security reasons my SAS enviroment is quite closed, I cant use PIPE or FINDNTH(). I dont have access to original XML file aswell.
So far I managed with simple functions as SUBSTR and INDEX to shorten original text field and managed to get first address from textfield but as there more addresses I cant find solution for extracting second or sixth or n-th addresses. 
My table looks like and count is how many strings <movementOfGoods> are in text field as how many rows I need for ID. 

veerand_0-1592554119300.png

XML field contains:
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/001</name>
<streetAndNumber>Tähe tn 106a</streetAndNumber>
<postcode>50107</postcode>
<city>Tartu</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/007</name>
<streetAndNumber>Jaama pst 26</streetAndNumber>
<postcode>44311</postcode>
<city>Rakvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/008</name>
<streetAndNumber>Linnase tn 6</streetAndNumber>
<postcode>44208</postcode>
<city>Rakvere vald, Aluvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/010</name>
<streetAndNumber>Kooli tn 10</streetAndNumber>
<postcode>76102</postcode>
<city>Rummu, Harjumaa</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
</temporaryStorageCustomsDecisionInformation>
</decisionRelatedInformation>
<customsDecisionStatus>01</customsDecisionStatus>
<customsDecisionStatusDate>2019-12-04</customsDecisionStatusDate>
<publicationAgreement>
<agreementForPublication>1</agreementForPublication>
</publicationAgreement>
</customsDecisionAuthorisationData>

 

At the end I need table which look like

QUALIFICATION_IDENTIFICATION streetAndNumber,city as address
EETSTEE1000EE-2019-INR53390 Tähe tn 106a, Tartu
EETSTEE1000EE-2019-INR53390 Jaama pst 26, Rakvere
EETSTEE1000EE-2019-INR53390 Linnase tn 6, Rakvere vald, Aluvere
EETSTEE1000EE-2019-INR53390 Kooli tn 10, Rummu, Harjumaa

 

Please help me out

Andres Veer

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;

View solution in original post

4 REPLIES 4
Jagadishkatam
Amethyst | Level 16

Please try the below code , I assumed that the xml data is in a txt file

 

filename sam '~/sample.txt';

proc import datafile=sam out=sample dbms=tab replace;
getnames=no;
run;

data want;
set sample;
retain streetAndNumber;
if prxmatch('m/streetAndNumber/oi',var1)>0 then streetAndNumber=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
if prxmatch('m/city/oi',var1)>0 then cityasaddress=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
QUALIFICATION_IDENTIFICATION='EETSTEE1000EE-2019-INR53390';
if cityasaddress ne '';
drop var1;
run;
Thanks,
Jag
veerand
Calcite | Level 5
tx I'll try, but xml data is in sas table field not as txt file
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;
veerand
Calcite | Level 5
great it worked, thank you very much s_lassen

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 794 views
  • 0 likes
  • 3 in conversation