BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
veerand
Calcite | Level 5

I work with SAS Data Integration Studio and need extract multiple addresses from text field which contains xml. For security reasons my SAS enviroment is quite closed, I cant use PIPE or FINDNTH(). I dont have access to original XML file aswell.
So far I managed with simple functions as SUBSTR and INDEX to shorten original text field and managed to get first address from textfield but as there more addresses I cant find solution for extracting second or sixth or n-th addresses. 
My table looks like and count is how many strings <movementOfGoods> are in text field as how many rows I need for ID. 

veerand_0-1592554119300.png

XML field contains:
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/001</name>
<streetAndNumber>Tähe tn 106a</streetAndNumber>
<postcode>50107</postcode>
<city>Tartu</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/007</name>
<streetAndNumber>Jaama pst 26</streetAndNumber>
<postcode>44311</postcode>
<city>Rakvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/008</name>
<streetAndNumber>Linnase tn 6</streetAndNumber>
<postcode>44208</postcode>
<city>Rakvere vald, Aluvere</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
<movementOfGoods>
<qualifierOfTheIdentification>Z</qualifierOfTheIdentification>
<legalBaseCode>A</legalBaseCode>
<holderEORINumber>
<eORINumber>EEReplacedCodeForSecurityReasons</eORINumber>
</holderEORINumber>
<country>EE</country>
<typeOfLocationCode>B</typeOfLocationCode>
<freeTextDescription>
<name>ReplacedCodeForSecurityReasons/010</name>
<streetAndNumber>Kooli tn 10</streetAndNumber>
<postcode>76102</postcode>
<city>Rummu, Harjumaa</city>
<country>EE</country>
</freeTextDescription>
</movementOfGoods>
</temporaryStorageCustomsDecisionInformation>
</decisionRelatedInformation>
<customsDecisionStatus>01</customsDecisionStatus>
<customsDecisionStatusDate>2019-12-04</customsDecisionStatusDate>
<publicationAgreement>
<agreementForPublication>1</agreementForPublication>
</publicationAgreement>
</customsDecisionAuthorisationData>

 

At the end I need table which look like

QUALIFICATION_IDENTIFICATION streetAndNumber,city as address
EETSTEE1000EE-2019-INR53390 Tähe tn 106a, Tartu
EETSTEE1000EE-2019-INR53390 Jaama pst 26, Rakvere
EETSTEE1000EE-2019-INR53390 Linnase tn 6, Rakvere vald, Aluvere
EETSTEE1000EE-2019-INR53390 Kooli tn 10, Rummu, Harjumaa

 

Please help me out

Andres Veer

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;

View solution in original post

4 REPLIES 4
Jagadishkatam
Amethyst | Level 16

Please try the below code , I assumed that the xml data is in a txt file

 

filename sam '~/sample.txt';

proc import datafile=sam out=sample dbms=tab replace;
getnames=no;
run;

data want;
set sample;
retain streetAndNumber;
if prxmatch('m/streetAndNumber/oi',var1)>0 then streetAndNumber=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
if prxmatch('m/city/oi',var1)>0 then cityasaddress=prxchange('s/(.*>)(.*)(<.*)/$2/oi',-1,var1);
QUALIFICATION_IDENTIFICATION='EETSTEE1000EE-2019-INR53390';
if cityasaddress ne '';
drop var1;
run;
Thanks,
Jag
veerand
Calcite | Level 5
tx I'll try, but xml data is in sas table field not as txt file
s_lassen
Meteorite | Level 14

Here is a way to do it:

data want;
  set have;
  length streetAndNumber city $200;
  pos=1;
  do until(0);
    pos=find(XML,'<movementOfGoods>',pos);
    if pos=0 then leave;
    streetAndNumber=scan(substr(XML,find(XML,'<streetAndNumber>',pos)),2,'><');
    city=scan(substr(XML,find(XML,'<city>',pos)),2,'><');
    output;
    pos=find(XML,'</movementOfGoods>',pos);
    end;
  keep QUALIFICATION_IDENTIFICATION streetAndNumber city;
run;
veerand
Calcite | Level 5
great it worked, thank you very much s_lassen
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1323 views
  • 0 likes
  • 3 in conversation