Alexxxxxxx
Pyrite | Level 9

Dear all,

 

How can I replace the text enclosed in (), [], {}, ' ', or " " with the same number of Xs

for the following words,

data HAVE;
input NAME :& $800.;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
run; 

I expect to get 

NAME | Expect
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY | ALLCELLS BIOLOGICAL TECHNOLOGY (XXXX) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.) | ALLEGRO ASIA TECHNOLOGY (X.X.X.)
ALLEN BROTHERS 'FITTINGS' | ALLEN BROTHERS 'XXXXXXXX'
ALLEN R. NELSON ENGINEERING (1997) | ALLEN R. NELSON ENGINEERING (XXXX)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING) | ALLENS MESHCO (XXXXXXXXXXX & XXXXXXXXXXXXX)
ALLERGY THERAPEUTICS {UK} | ALLERGY THERAPEUTICS {XX}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION | ALLES GOED [XXX] LIMITED, TRADING AS PIPE & PLANT INSULATION

Could you please give me some suggestions about this?

 

thanks in advance.

ACCEPTED SOLUTION

SASKiwi
PROC Star

This is pretty ugly, but it appears to work. It uses the SUBSTR function on the left side of an assignment, which overwrites part of a character variable in place.

 

data HAVE;
input NAME :& $800.;
length char1 char2 $1;
do chars = '()', "''", '{}', '[]';
  char1 = substr(chars,1,1);
  char2 = substr(chars,2,1);
  if findc(NAME,char1) and findc(NAME,char2) then 
    substr(name, findc(NAME,char1) + 1, findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1)
    = repeat('X', findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1) ; 
end;
put name = ;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;
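
The expected output in the question leaves punctuation and blanks inside the delimiters untouched (for example `(X.X.X.)`), whereas the code above overwrites everything between them with X. Below is a minimal sketch of one way to get closer to the expected result by combining the SUBSTR approach with the PRXCHANGE idea that appears elsewhere in this thread. It assumes the HAVE data set from the question already exists; the names WANT, EXPECT, PAIR, POS1 and POS2 are illustrative and not from the original posts. It only handles the first pair of each delimiter type in a row, and it would mistake an apostrophe (as in O'BRIEN) for an opening quote.

```sas
data WANT;
  set HAVE;
  length EXPECT $800 PAIR $2;
  EXPECT = NAME;
  /* Each PAIR holds an opening and a closing delimiter */
  do PAIR = '()', '[]', '{}', "''", '""';
    POS1 = findc(EXPECT, char(PAIR, 1));            /* opening delimiter */
    POS2 = findc(EXPECT, char(PAIR, 2), POS1 + 1);  /* closing delimiter after it */
    /* Proceed only if both exist with at least one character between them */
    if 0 < POS1 < POS2 - 1 then
      /* Mask letters and digits only; punctuation and blanks are kept */
      substr(EXPECT, POS1 + 1, POS2 - POS1 - 1) =
        prxchange('s/[A-Za-z0-9]/X/', -1,
                  substr(EXPECT, POS1 + 1, POS2 - POS1 - 1));
  end;
  drop PAIR POS1 POS2;
run;
```

Because each letter or digit is replaced by exactly one X, the masked text has the same length as the original and fits back into the same SUBSTR span.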

 


8 REPLIES
Ksharp
Super User

If there is only one delimited pattern in each observation, this is easy.

 

data HAVE;
input NAME :& $800.;
call scan(name,2,p,l,'()[]{}''"','m');
temp=prxchange('s/\w/X/i',-1,substr(name,p,l));
substr(name,p,l)=temp;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;

proc print;run;
Alexxxxxxx
Pyrite | Level 9
Thanks for your suggestion
ChrisNZ
Tourmaline | Level 20

Other options:

data WANT1;
  set HAVE;
  TMP=NAME;
  do while(1);
    NEW=prxchange('s/ (\[.*?) [^X] (.*?\]) /$1X$2/x', -1, TMP);
    NEW=prxchange('s/ ( {.*?) [^X] (.*?} ) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (\(.*?) [^X] (.*?\)) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (''.*?) [^X] (.*?'') /$1X$2/x', -1, NEW);
    if TMP=NEW then leave ;
    else TMP=NEW;
  end;
  keep NAME NEW;
run;
         
data WANT2; 
  set HAVE;                              
  do PAIRS = '()', "''", '{}', '[]';
    POS1=findc(NAME,char(PAIRS,1));
    POS2=findc(NAME,char(PAIRS,2),POS1+1);     
    if 0 < POS1 < POS2-1 then 
      substr(NAME, POS1 + 1, POS2 - POS1 -1) = repeat('X', POS2 - POS1 - 2) ; 
  end;
run;    
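
A note on WANT2: the span being overwritten is POS2 - POS1 - 1 characters long, yet REPEAT is given POS2 - POS1 - 2. That is because REPEAT(string, n) returns the original string followed by n further copies, so n + 1 copies in total. A small sketch to confirm this behaviour (the variable names A and LEN are illustrative):

```sas
data _null_;
  length A $10;
  A = repeat('X', 3);   /* original 'X' plus 3 repeats gives 'XXXX' */
  LEN = length(A);      /* 4 */
  put A= LEN=;
run;
```

Passing one extra to REPEAT, as in the accepted solution, does no harm, because SUBSTR on the left side of the assignment only overwrites the number of characters given in its third argument.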

 

ChrisNZ
Tourmaline | Level 20

@Ksharp Do you know why the DO WHILE loop is needed in my example, and why the -1 argument of prxchange is not enough on its own?
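
A likely explanation, offered as a sketch and not as a reply from the thread: with -1, PRXCHANGE replaces every non-overlapping match in a single left-to-right pass. Each match of the patterns in WANT1 runs from the opening delimiter to the closing one, so it consumes the whole delimited group while replacing only one character in it. The next character can only be reached by running the substitution again on the result. The demonstration below uses a made-up value and a simplified version of the parenthesis pattern (no x flag); PASS, TMP and NEW are illustrative names.

```sas
data _null_;
  length TMP NEW $40;
  TMP = 'ACME (SHAN) CO';
  do PASS = 1 to 10;
    /* One call masks a single non-X character per bracketed group */
    NEW = prxchange('s/(\(.*?)[^X](.*?\))/$1X$2/', -1, TMP);
    put PASS= NEW=;
    if NEW = TMP then leave;   /* nothing left to mask */
    TMP = NEW;
  end;
run;
```

The log should show the group changing one character at a time, (XHAN), (XXAN), (XXXN), (XXXX), followed by one final pass that changes nothing and ends the loop.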
