Alexxxxxxx
Pyrite | Level 9

Dear all,

How can I replace the text enclosed in (), [], {}, ' ' or " " with the same number of Xs, for names such as the following?

data HAVE;
input NAME :& $800.;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
run; 

I expect to get 

NAME | Expect
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY | ALLCELLS BIOLOGICAL TECHNOLOGY (XXXX) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.) | ALLEGRO ASIA TECHNOLOGY (X.X.X.)
ALLEN BROTHERS 'FITTINGS' | ALLEN BROTHERS 'XXXXXXXX'
ALLEN R. NELSON ENGINEERING (1997) | ALLEN R. NELSON ENGINEERING (XXXX)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING) | ALLENS MESHCO (XXXXXXXXXXX & XXXXXXXXXXXXX)
ALLERGY THERAPEUTICS {UK} | ALLERGY THERAPEUTICS {XX}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION | ALLES GOED [XXX] LIMITED, TRADING AS PIPE & PLANT INSULATION

Could you please give me some suggestions about this?

Thanks in advance.

Accepted Solutions
SASKiwi
PROC Star

This is pretty ugly but it appears to work. It uses the SUBSTR function on the left-hand side of an assignment, which overwrites part of a character variable in place.

 

data HAVE;
input NAME :& $800.;
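length char1 char2 $1;
/* Loop over each opening/closing delimiter pair */
do chars = '()', "''", '{}', '[]';
  char1 = substr(chars,1,1);  /* opening delimiter */
  char2 = substr(chars,2,1);  /* closing delimiter */
  /* If both delimiters are present, overwrite the text between them in place */
  if findc(NAME,char1) and findc(NAME,char2) then 
    substr(name, findc(NAME,char1) + 1, findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1)
    = repeat('X', findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1) ; 
end;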
put name = ;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;
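
Note that this step overwrites everything between the delimiters, so (D.B.A.) becomes (XXXXXX) rather than the (X.X.X.) shown in the expected output, and it does not look for double quotes. Below is a minimal sketch of a variant that masks only letters and digits. It assumes a HAVE data set holding the unmodified NAME values, as created in the question, and at most one delimited span per delimiter pair in each name. WANT, PAIRS, POS1 and POS2 are illustrative names, not from the original post.

```sas
data WANT;
  set HAVE;
  length PAIRS $2;
  /* Delimiter pairs to scan for, including double quotes */
  do PAIRS = '()', '[]', '{}', "''", '""';
    POS1 = findc(NAME, char(PAIRS, 1));            /* opening delimiter */
    POS2 = findc(NAME, char(PAIRS, 2), POS1 + 1);  /* first closing delimiter after it */
    /* Require an opening delimiter and at least one character before the closing one */
    if 0 < POS1 < POS2 - 1 then
      substr(NAME, POS1 + 1, POS2 - POS1 - 1) =
        prxchange('s/[A-Za-z0-9]/X/', -1, substr(NAME, POS1 + 1, POS2 - POS1 - 1));
  end;
  drop PAIRS POS1 POS2;
run;
```

A name that happens to contain two apostrophes, such as a possessive, would also be masked between them, so the quote pairs may need to be dropped for some data.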

 

Ksharp
Super User

If there is only one delimited pattern per observation, this is easy.

 

data HAVE;
input NAME :& $800.;
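/* Find the 2nd "word" when brackets and quotes are the delimiters: P = start, L = length */
call scan(name,2,p,l,'()[]{}''"','m');
/* Mask every word character (letter, digit, underscore) in that span */
temp=prxchange('s/\w/X/i',-1,substr(name,p,l));
substr(name,p,l)=temp;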
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;

proc print;run;
Alexxxxxxx
Pyrite | Level 9
Thanks for your suggestion
ChrisNZ
Tourmaline | Level 20

Other options:

data WANT1;
  set HAVE;
  TMP=NAME;
  do while(1);
    NEW=prxchange('s/ (\[.*?) [^X] (.*?\]) /$1X$2/x', -1, TMP);
    NEW=prxchange('s/ ( {.*?) [^X] (.*?} ) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (\(.*?) [^X] (.*?\)) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (''.*?) [^X] (.*?'') /$1X$2/x', -1, NEW);
    if TMP=NEW then leave ;
    else TMP=NEW;
  end;
  keep NAME NEW;
run;
         
data WANT2; 
  set HAVE;                              
  do PAIRS = '()', "''", '{}', '[]';
    POS1=findc(NAME,char(PAIRS,1));
    POS2=findc(NAME,char(PAIRS,2),POS1+1);     
    if 0 < POS1 < POS2-1 then 
      substr(NAME, POS1 + 1, POS2 - POS1 -1) = repeat('X', POS2 - POS1 - 2) ; 
  end;
run;    
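
The two SUBSTR-based steps in this thread pass different counts to REPEAT (the span length in the accepted solution, the span length minus one in WANT2) yet mask the same characters. REPEAT(string, n) returns the string followed by n further copies, so n+1 copies in total, and SUBSTR on the left of an assignment fills only the requested number of positions. Here is a small sketch to check this in the log. The variable names are illustrative.

```sas
data _null_;
  length S $10;
  A = repeat('X', 3);                /* 'X' plus 3 more copies */
  N = length(A);
  S = '(ABCD)';
  substr(S, 2, 4) = repeat('X', 4);  /* 5 X's supplied, only 4 positions replaced */
  put A= N= S=;                      /* expected in the log: A=XXXX N=4 S=(XXXX) */
run;
```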

 

ChrisNZ
Tourmaline | Level 20

@Ksharp Do you know why the DO WHILE loop is needed in my example, and why the -1 (replace all matches) argument in prxchange is not enough?
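
One way to see why: with -1, PRXCHANGE replaces every non-overlapping match in a single pass, but each of these patterns matches the whole span from the opening delimiter to the closing one and changes just one character in it. After a replacement the search resumes past the closing delimiter, so one call masks only one character per span, and the loop repeats the call until nothing changes. A minimal sketch on a single literal follows. ONCE is an illustrative name.

```sas
data _null_;
  length ONCE $40;
  /* One pass only: the match covers all of [PTY], so just the P is replaced */
  ONCE = prxchange('s/(\[.*?)[^X](.*?\])/$1X$2/', -1, 'ALLES GOED [PTY] LIMITED');
  put ONCE=;   /* expected in the log: ONCE=ALLES GOED [XTY] LIMITED */
run;
```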
