Convert long string into variable EDIFACT

Contributor
Posts: 65
Hi,

I am dealing with a dataset which has a field containing a so-called EDIFACT message, that is, one long string holding several pieces of information. I want to parse the long text string and split it into SAS variables. The split should be based on the delimiter characters in the string; see the example below.

 

Any ideas or code for a nice and easy solution in SAS?

 

Thanks in advance,

 

 Input data: one long string

 

NAD+PERSONnumber+3456789010'DTM+091:170126:101'NAD+ENTITY+027'NAD+KLA+11'
NAD+YNR+'NAD+MOK+405'TAF+55:45:53:52:51:61:62:63:64:65:85:84:83:82:81:71:72:73:74:75'
TOB+85:OKK:0'TOB+75:OKK:0’TOB+63:MES:4'QTY+DYB:X'

 

I want to create these new variables:

PERSONnumber=3456789010
DTM=091:170126
ENTITY=027
KLA=11
YNR=+
MOK=405
TAF=55:45:53:52:51:61:62:63:64:65:85:84:83:82:81:71:72:73:74:75'
TOB85=OKK:0
TOB75=OKK:0
TOB63= MES:4'QTY+DYB:X'

 


Accepted Solutions
Solution
a week ago
Super User
Posts: 6,844

Re: Convert long string into variable EDIFACT

It looks to me like you can get very close by:

  1. Removing the NAD+ from the input string.
  2. Splitting the string on single quote and period.
data test ;
  /* Sample message, concatenated from pieces for readability */
  str=
   "NAD+PERSONnumber+3456789010'DTM+091:170126:101'"
||"NAD+ENTITY+027'NAD+KLA+11'"
||"NAD+YNR+'NAD+MOK+405'"
||"TAF+55:45:53:52:51:61:62:63:64:65:85:84:83:82:81:71:72:73:74:75'"
||"TOB+85:OKK:0'TOB+75:OKK:0.TOB+63:MES:4'QTY+DYB:X'"
  ;
  length i 8 term $100 ;
  * Remove NAD+ and then split on ticks and periods ;
  str=compress(tranwrd(str,'NAD+',' '),' ');
  * SCAN returns a blank once I exceeds the number of terms, which ends the loop ;
  do i=1 by 1 until (term=' ');
    term=scan(str,i,"'.");
    output;
    put i= term= ;
  end;
  drop str;
run;
i=1 term=PERSONnumber+3456789010
i=2 term=DTM+091:170126:101
i=3 term=ENTITY+027
i=4 term=KLA+11
i=5 term=YNR+
i=6 term=MOK+405
i=7 term=TAF+55:45:53:52:51:61:62:63:64:65:85:84:83:82:81:71:72:73:74:75
i=8 term=TOB+85:OKK:0
i=9 term=TOB+75:OKK:0
i=10 term=TOB+63:MES:4
i=11 term=QTY+DYB:X
i=12 term=
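
Since the tags in a real message will vary, one possible next step is to split each term at its first + into a name and a value, and let PROC TRANSPOSE create the variables. This is only a sketch and not part of the reply above: the dataset names terms and wide and the key variable msg_id are made up for illustration, and the sample string is shortened to the tags that occur once. Repeated tags such as TOB would first need unique names (for example TOB85), because PROC TRANSPOSE rejects duplicate ID values within a BY group.

```sas
data terms;
  length str $400 term $100 name $32 value $100;
  msg_id = 1;  /* hypothetical key identifying the source record */
  str = "NAD+PERSONnumber+3456789010'DTM+091:170126:101'"
     || "NAD+ENTITY+027'NAD+KLA+11'NAD+YNR+'NAD+MOK+405'";
  /* Same cleanup as above: drop the NAD+ prefix */
  str = compress(tranwrd(str, 'NAD+', ' '), ' ');
  do i = 1 to countw(str, "'");
    term = scan(str, i, "'");
    pos  = index(term, '+');
    if pos > 1 then do;
      name  = substr(term, 1, pos - 1);   /* text before the first + */
      value = substr(term, pos + 1);      /* text after the first +  */
      output;
    end;
  end;
  keep msg_id name value;
run;

proc transpose data=terms out=wide(drop=_name_);
  by msg_id;
  id name;      /* variable names come from the message tags */
  var value;
run;
```

With this layout a tag that has no value, such as YNR, ends up as a blank variable rather than a + sign.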



All Replies
Super User
Posts: 19,099

Re: Convert long string into variable EDIFACT

Try SCAN() with the + and ' as your delimiters. 
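
A minimal sketch of that suggestion, assuming a shortened sample string (the variable names str and word are illustrative only):

```sas
data _null_;
  length word $40;
  str = "NAD+KLA+11'NAD+MOK+405'";
  /* Treat both + and the single quote as delimiters */
  do i = 1 to countw(str, "+'");
    word = scan(str, i, "+'");
    put i= word=;
  end;
run;
```

Note that with the default modifiers SCAN treats consecutive delimiters as one, so an empty element such as the one in NAD+YNR+' produces no word of its own, and the boundary between segments (') and elements (+) is no longer visible in the result.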

Contributor
Posts: 65

Re: Convert long string into variable EDIFACT

I used this solution. This works very well. Thanks a lot for your input.

 

PROC Star
Posts: 282

Re: Convert long string into variable EDIFACT

If your values are fairly fixed, here is another approach. In the first step we use PRXCHANGE to match a pattern and extract the part we want. For example, for the variable PERSONnumber:

PERSONnumber =prxchange('s/.+?PERSONnumber(.+?)''.+/$1/i', -1, str);

Here (.+?) is the value we capture: everything after PERSONnumber and before the next single quote. The captured value is $1, and it replaces the whole matched string. The same approach is used for every other variable. In the next step the leading + is removed from each value, unless the value consists only of +, in which case it is kept. I hope the explanation is clear.

 

 

 

data test ;
  /* Build the message from pieces so that no quoted literal spans source lines */
  str=
   "NAD+PERSONnumber+3456789010'DTM+091:170126:101'NAD+ENTITY+027'NAD+KLA+11'"
||"NAD+YNR+'NAD+MOK+405'TAF+55:45:53:52:51:61:62:63:64:65:85:84:83:82:81:71:72:73:74:75'"
||"TOB+85:OKK:0'TOB+75:OKK:0'TOB+63:MES:4'QTY+DYB:X'"
  ;
run;

data test2(drop=str);
  set test;
  /* Each call replaces the whole string with the text captured after the tag */
  PERSONnumber=prxchange('s/.+?PERSONnumber(.+?)''.+/$1/i', -1, str);
  DTM=prxchange('s/.+?DTM(.+?)''.+/$1/i', -1, str);
  ENTITY=prxchange('s/.+?ENTITY(.+?)''.+/$1/i', -1, str);
  KLA=prxchange('s/.+?KLA(.+?)''.+/$1/i', -1, str);
  YNR=prxchange('s/.+?YNR(.+?)''.+/$1/i', -1, str);
  MOK=prxchange('s/.+?MOK(.+?)''.+/$1/i', -1, str);
  TAF=prxchange('s/.+?TAF(.+?)''.+/$1/i', -1, str);
  TOB85=prxchange('s/.+?TOB\+85\:(.+?)''.+/$1/i', -1, str);
  TOB75=prxchange('s/.+?TOB\+75\:(.+?)''.+/$1/i', -1, str);
  TOB63=prxchange('s/.+?TOB\+63\:(.+)/$1/i', -1, str);
run;

data test3;
  set test2;
  array vars{*} _character_;
  /* Strip a leading +, but keep a value that is only + (YNR) and leave
     values without a leading + (the TOB variables) untouched */
  do i=1 to dim(vars);
    if vars{i} ne '+' and vars{i} =: '+' then vars{i} = substr(vars{i},2);
  end;
  drop i;
run;
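
One thing to watch with this pattern: when the regular expression does not match, PRXCHANGE returns the source string unchanged, so a missing tag would leave the whole message in the variable. A possible alternative, sketched here with a shortened sample string and illustrative variable names (rx, KLA), is to parse the pattern with PRXPARSE and read the capture buffer with PRXPOSN, which leaves the variable blank when the tag is absent:

```sas
data _null_;
  length KLA $40;
  str = "NAD+KLA+11'NAD+MOK+405'";
  /* Capture the text between KLA+ and the next single quote */
  rx = prxparse("/KLA\+(.*?)'/i");
  if prxmatch(rx, str) then KLA = prxposn(rx, 1, str);
  put KLA=;
run;
```

Because the + after the tag is part of the pattern here, the second step that strips the leading + is not needed for this variant.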

 

 

Contributor
Posts: 65

Re: Convert long string into variable EDIFACT

My values are NOT fixed. The string and its content are variable, so somehow I need to find out how many variables there are.

 

Any smart way of doing that?

 

I will try the solutions later. They look very interesting.

 

Thanks. 

Contributor
Posts: 65

Re: Convert long string into variable EDIFACT

The content and the number and names of the variables will be different. So somehow I need to use the text separators in the original string to name the output variables in my final dataset.

Respected Advisor
Posts: 4,132

Re: Convert long string into variable EDIFACT

@ANLYNG

Instead of trying to write your own parser, you could also search the Internet to see whether something already exists that does this job for you, and then spend the time figuring out how to interface with such a parser.

A quick Internet search suggests that such parsers exist, e.g. for converting EDIFACT to XML. If so, you could then use the SAS XMLV2 engine together with automap to read such an XML file.

☑ This topic is solved.
