Alexxxxxxx
Pyrite | Level 9

Dear all,

I want to find every string enclosed in delimiters such as (), [], {}, <>, or quotes (for example <BR>, [FONT], {BODY}, 'A', "JUICE") and split it out into a new variable.

(For example, for 'JUICE<BR>apple<footer>', I expect a blank to be added between 'JUICE' and 'apple'.)

However, the value

HARDY(FRNS.)'A'

is not processed correctly by my code.

I expect to get

name             COMPANY_NAME_inB  COMPANY_NAME_noB
HARDY(FRNS.)'A'  FRNS.             HARDY
HARDY(FRNS.)'A'  A                 HARDY

However, I only get

name             COMPANY_NAME_inB  COMPANY_NAME_noB
HARDY(FRNS.)'A'  A                 HARDY(FRNS.)

Could you please give me some suggestions?

data have ;
  infile datalines truncover;
  input name $100.;
  datalines;
JUICE<BR>apple[footer] 
juice <BR> apple 
juice<BODY> 'apple'
<figure> "juice" LTD
HARDY(FRNS.)'A'
HAFSLUND 'B' (XSQ)
;

data want;
   set have;
   RegExID = prxparse('/<\w*>|\(\w*\)|\[\w*\]|\(\w*\)|"\w*"|''\w*''/');
   start=1;
   stop=length(name);
   call prxnext(RegExID, start, stop, name, pos, length);
   do while (pos > 0);
      COMPANY_NAME_inB = substr(name, pos+1, length-2);
      COMPANY_NAME_noB = prxchange('s/<\w*>|\(\w*\)|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1, name);
      output;
      call prxnext(RegExID, start, stop, name, pos, length);
   end;
   drop RegExID pos length start stop;
run;


proc print data=want;
run;

 

 

 

ChrisNZ
Tourmaline | Level 20

Like this?

data WANT;
  set HAVE;
  RegExID = prxparse('/<[^>]*>|\([^\)]*\)|\[[^\]]*\]|"[^"]*"|''[^'']*''/');
  START=1;
  STOP=length(NAME);
  call prxnext(RegExID, START, STOP, NAME, POS, LENGTH);
  do while (POS > 0);
    COMPANY_NAME_inB = substr(NAME, POS+1, LENGTH-2);
    COMPANY_NAME_noB = prxchange('s/<[^>]*>|\([^\)]*\)|\[[^\]]*\]|"[^"]*"|''[^'']*''/ /', -1, NAME);
    output;                                 
    call prxnext(RegExID, START, STOP, NAME, POS, LENGTH);
  end;
  drop RegExID POS LENGTH START STOP;
run;

Obs  name                    COMPANY_NAME_inB  COMPANY_NAME_noB
  1  JUICE<BR>apple[footer]  BR                JUICE apple
  2  JUICE<BR>apple[footer]  footer            JUICE apple
  3  juice <BR> apple        BR                juice apple
  4  juice<BODY> 'apple'     BODY              juice
  5  juice<BODY> 'apple'     apple             juice
  6  <figure> "juice" LTD    figure            LTD
  7  <figure> "juice" LTD    juice             LTD
  8  HARDY(FRNS.)'A'         FRNS.             HARDY
  9  HARDY(FRNS.)'A'         A                 HARDY
 10  HAFSLUND 'B' (XSQ)      B                 HAFSLUND
 11  HAFSLUND 'B' (XSQ)      XSQ               HAFSLUND
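
A note on why the original pattern skips (FRNS.), as far as I can tell: \w matches only letters, digits, and the underscore, so \(\w*\) stops at the dot and never reaches the closing parenthesis. A negated class such as [^\)]* accepts any character up to the closing delimiter. The step below is a minimal sketch for checking this on a single value; the variable names pos_word and pos_negated are made up for illustration and are not part of the code above.

```sas
data _null_;
  name = "HARDY(FRNS.)'A'";
  /* \w excludes the dot, so the parenthesised part is not matched */
  pos_word    = prxmatch('/\(\w*\)/', name);
  /* [^\)]* allows any character except the closing parenthesis */
  pos_negated = prxmatch('/\([^\)]*\)/', name);
  put pos_word= pos_negated=;
run;
```

The log should show pos_word=0 pos_negated=6: no match for the \w version, and a match starting at the opening parenthesis for the negated class. The same reasoning applies to the other delimiters, which is why each alternative in the pattern above uses its own negated class.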

 
