Alexxxxxxx
Pyrite | Level 9

Dear all,

 

 

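I am trying to find every string enclosed in delimiters such as (), [], {}, <>, or quotes (for example <BR>, [FONT], {BODY}, 'A', "JUICE") and extract each one into a new variable.

In the original value, each delimited string should be replaced by a blank (for example, for 'JUICE<BR>apple<footer>' I expect a blank between 'JUICE' and 'apple').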

 

However, the value

HARDY(FRNS.)'A'

is not processed correctly by my code.

I expect to get 

name             COMPANY_NAME_inB  COMPANY_NAME_noB
HARDY(FRNS.)'A'  FRNS.             HARDY
HARDY(FRNS.)'A'  A                 HARDY

However, I only get 

name             COMPANY_NAME_inB  COMPANY_NAME_noB
HARDY(FRNS.)'A'  A                 HARDY(FRNS.)

 

Could you please give me some suggestions?

data have ;
  infile datalines truncover;
  input name $100.;
  datalines;
JUICE<BR>apple[footer] 
juice <BR> apple 
juice<BODY> 'apple'
<figure> "juice" LTD
HARDY(FRNS.)'A'
HAFSLUND 'B' (XSQ)
;

data want;
   set have;
   RegExID = prxparse('/<\w*>|\(\w*\)|\[\w*\]|\(\w*\)|"\w*"|''\w*''/');
   start=1;
   stop=length(name);
   call prxnext(RegExID, start, stop, name, pos, length);
   do while (pos > 0);
      COMPANY_NAME_inB = substr(name, pos+1, length-2);
      COMPANY_NAME_noB = prxchange('s/<\w*>|\(\w*\)|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1, name);
      output;
      call prxnext(RegExID, start, stop, name, pos, length);
   end;
   drop RegExID pos length start stop;
run;


proc print data=want;
run;

 

 

 

ChrisNZ
Tourmaline | Level 20

Like this?

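data WANT;
  set HAVE;
  /* Negated character classes match anything up to the closing delimiter, not just word characters */
  RegExID = prxparse('/<[^>]*>|\([^\)]*\)|\[[^\]]*\]|"[^"]*"|''[^'']*''/');
  START=1;
  STOP=length(NAME);
  call prxnext(RegExID, START, STOP, NAME, POS, LENGTH);
  do while (POS > 0);
    /* Drop the opening and closing delimiter from the matched text */
    COMPANY_NAME_inB = substr(NAME, POS+1, LENGTH-2);
    /* Replace every delimited string in NAME with a blank */
    COMPANY_NAME_noB = prxchange('s/<[^>]*>|\([^\)]*\)|\[[^\]]*\]|"[^"]*"|''[^'']*''/ /', -1, NAME);
    output;  /* one output row per match */
    call prxnext(RegExID, START, STOP, NAME, POS, LENGTH);
  end;
  drop RegExID POS LENGTH START STOP;
run;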
Obs  name                    COMPANY_NAME_inB  COMPANY_NAME_noB
  1  JUICE<BR>apple[footer]  BR                JUICE apple
  2  JUICE<BR>apple[footer]  footer            JUICE apple
  3  juice <BR> apple        BR                juice apple
  4  juice<BODY> 'apple'     BODY              juice
  5  juice<BODY> 'apple'     apple             juice
  6  <figure> "juice" LTD    figure            LTD
  7  <figure> "juice" LTD    juice             LTD
  8  HARDY(FRNS.)'A'         FRNS.             HARDY
  9  HARDY(FRNS.)'A'         A                 HARDY
 10  HAFSLUND 'B' (XSQ)      B                 HAFSLUND
 11  HAFSLUND 'B' (XSQ)      XSQ               HAFSLUND
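
For anyone wondering why the original pattern skipped (FRNS.): as far as I can tell, the cause is that \w only matches letters, digits and underscore, so the period inside the parentheses stops \(\w*\) from matching, while [^\)]* accepts any character up to the closing parenthesis. A minimal sketch to check this on the problem value (the variable names pos_word and pos_class are made up for this illustration):

```sas
/* Sketch only: compare the two parenthesis patterns on the problem value */
data _null_;
  length name $100;
  name = "HARDY(FRNS.)'A'";
  /* \w* cannot cross the period, so no match is expected here (0) */
  pos_word  = prxmatch('/\(\w*\)/', name);
  /* [^\)]* allows any character except the closing parenthesis */
  pos_class = prxmatch('/\([^\)]*\)/', name);
  put pos_word= pos_class=;
run;
```

The log should show pos_word=0 and pos_class=6, the latter being the position of the opening parenthesis. Note that neither posted pattern handles {} yet; if you need that, an extra alternative such as \{[^\}]*\} would presumably have to be added to both the prxparse and prxchange patterns.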

 
