Dear all,
How can I replace the text enclosed in (), [], {}, ' ' or " " with the same number of Xs
in the following names?
data HAVE;
input NAME :& $800.;
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
run;
I expect to get
NAME | Expect |
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY | ALLCELLS BIOLOGICAL TECHNOLOGY (XXXX) COMPANY |
ALLEGRO ASIA TECHNOLOGY (D.B.A.) | ALLEGRO ASIA TECHNOLOGY (X.X.X.) |
ALLEN BROTHERS 'FITTINGS' | ALLEN BROTHERS 'XXXXXXXX' |
ALLEN R. NELSON ENGINEERING (1997) | ALLEN R. NELSON ENGINEERING (XXXX) |
ALLENS MESHCO (PROPRIETARY & MANUFACTURING) | ALLENS MESHCO (XXXXXXXXXXX & XXXXXXXXXXXXX) |
ALLERGY THERAPEUTICS {UK} | ALLERGY THERAPEUTICS {XX} |
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION | ALLES GOED [XXX] LIMITED, TRADING AS PIPE & PLANT INSULATION |
Could you please give me some suggestions on how to do this?
Thanks in advance.
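This is pretty ugly, but it appears to work. It uses the SUBSTR function on the left side of an assignment to overwrite part of a character variable in place. Note that it replaces every character between the delimiters, punctuation and blanks included.
data HAVE;
  input NAME :& $800.;
  length char1 char2 $1;
  /* each two-character value holds an opening and a closing delimiter */
  do chars = '()', "''", '{}', '[]';
    char1 = substr(chars,1,1);
    char2 = substr(chars,2,1);
    /* REPEAT returns one X more than requested; SUBSTR keeps only the requested length */
    if findc(NAME,char1) and findc(NAME,char2) then
      substr(name, findc(NAME,char1) + 1, findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1)
        = repeat('X', findc(NAME,char2, findc(NAME,char1)+1) - findc(NAME,char1)-1) ;
  end;
  put name = ;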
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;
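To see the left-side SUBSTR behaviour in isolation, here is a minimal sketch on a made-up string (the variable s and its value are only for illustration). It assumes the documented behaviour that REPEAT('X', n) returns n+1 copies and that the left-side SUBSTR only overwrites the length you ask for.

```sas
data _null_;
  length s $20;
  s = 'ABC (DEF) GHI';
  /* REPEAT('X', 3) returns 'XXXX' (n+1 copies),          */
  /* but only 3 characters starting at position 6 change  */
  substr(s, 6, 3) = repeat('X', 3);
  put s=;   /* s=ABC (XXX) GHI */
run;
```

The parentheses at positions 5 and 9 are left untouched because the replaced span starts one character after the opening delimiter and stops one before the closing one.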
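If each observation contains only one delimited string, this is easy: CALL SCAN returns the position and length of the second "word" (the text between the delimiters), and PRXCHANGE turns every word character in it into an X.
data HAVE;
  input NAME :& $800.;
  /* p and l receive the position and length of the 2nd word; */
  /* the 'm' modifier keeps empty words in the count           */
  call scan(name,2,p,l,'()[]{}''"','m');
  /* \w matches letters, digits and underscore, so '.', '&' and blanks are kept */
  temp=prxchange('s/\w/X/i',-1,substr(name,p,l));
  substr(name,p,l)=temp;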
cards;
ALLCELLS BIOLOGICAL TECHNOLOGY (SHAN) COMPANY
ALLEGRO ASIA TECHNOLOGY (D.B.A.)
ALLEN BROTHERS 'FITTINGS'
ALLEN R. NELSON ENGINEERING (1997)
ALLENS MESHCO (PROPRIETARY & MANUFACTURING)
ALLERGY THERAPEUTICS {UK}
ALLES GOED [PTY] LIMITED, TRADING AS PIPE & PLANT INSULATION
;
run;
proc print;run;
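If it helps, here is a small sketch of what CALL SCAN hands back for one of the sample names. The delimiter list and the 'm' modifier are the same as above; the DATA _NULL_ step and the PUT statement are only there for inspection.

```sas
data _null_;
  name = 'ALLERGY THERAPEUTICS {UK}';
  /* word 1 is everything before '{', word 2 is the text inside the braces */
  call scan(name, 2, p, l, '()[]{}''"', 'm');
  put p= l=;            /* p=23 l=2 */
  inside = substr(name, p, l);
  put inside=;          /* inside=UK */
run;
```

Because only the second word is requested, a name with two delimited strings would have only the first one masked.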
Other options:
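data WANT1;
  set HAVE;
  TMP=NAME;
  /* each pass turns one more character per delimited group into an X; */
  /* [^X] matches any character other than X, punctuation included     */
  do while(1);
    NEW=prxchange('s/ (\[.*?) [^X] (.*?\]) /$1X$2/x', -1, TMP);
    NEW=prxchange('s/ ( {.*?) [^X] (.*?} ) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (\(.*?) [^X] (.*?\)) /$1X$2/x', -1, NEW);
    NEW=prxchange('s/ (''.*?) [^X] (.*?'') /$1X$2/x', -1, NEW);
    /* stop once a full pass changes nothing */
    if TMP=NEW then leave ;
    else TMP=NEW;
  end;
  keep NAME NEW;
run;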
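data WANT2;
  set HAVE;
  do PAIRS = '()', "''", '{}', '[]';
    /* POS1 = opening delimiter, POS2 = first closing delimiter after it */
    POS1=findc(NAME,char(PAIRS,1));
    POS2=findc(NAME,char(PAIRS,2),POS1+1);
    /* REPEAT(c, n) returns n+1 copies, hence the -2 */
    if 0 < POS1 < POS2-1 then
      substr(NAME, POS1 + 1, POS2 - POS1 -1) = repeat('X', POS2 - POS1 - 2) ;
  end;
run;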
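The FINDC versions overwrite everything between the delimiters, so (D.B.A.) becomes (XXXXXX) rather than the (X.X.X.) shown in the expected output. A possible variant, only a sketch, combines the FINDC lookup with the \w replacement from the CALL SCAN answer. WANT3 and NEW are names made up for this example, the '""' pair is added because the question mentions double quotes, and HAVE is assumed to be the unmodified data set from the question.

```sas
data WANT3;
  set HAVE;
  length NEW $800;
  NEW = NAME;
  do PAIRS = '()', "''", '{}', '[]', '""';
    /* POS1 = opening delimiter, POS2 = first closing delimiter after it */
    POS1 = findc(NEW, char(PAIRS, 1));
    POS2 = findc(NEW, char(PAIRS, 2), POS1 + 1);
    /* only act when both delimiters exist with at least one character between them */
    if 0 < POS1 < POS2 - 1 then
      substr(NEW, POS1 + 1, POS2 - POS1 - 1) =
        prxchange('s/\w/X/', -1, substr(NEW, POS1 + 1, POS2 - POS1 - 1));
  end;
  keep NAME NEW;
run;
```

Like WANT2, this handles at most one group per delimiter type in each name, and a lone apostrophe is left alone because no closing quote is found.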
@Ksharp Do you know why the DO WHILE loop is needed in my example, and why the -1 (replace all matches) argument of PRXCHANGE is not enough on its own?
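One way to look into this is to run a single PRXCHANGE call on a short test string and print the result. The string and the variable names ONCE and TWICE below are made up for this check.

```sas
data _null_;
  length ONCE TWICE $20;
  /* one call with -1: the only match spans the whole group "(SHAN)" */
  ONCE  = prxchange('s/ (\(.*?) [^X] (.*?\)) /$1X$2/x', -1, 'A (SHAN) B');
  put ONCE=;    /* ONCE=A (XHAN) B */
  /* a second call picks up the next non-X character */
  TWICE = prxchange('s/ (\(.*?) [^X] (.*?\)) /$1X$2/x', -1, ONCE);
  put TWICE=;   /* TWICE=A (XXAN) B */
run;
```

This suggests that each match consumes the whole delimited group and the search then resumes after the closing delimiter, so one call changes at most one character per group. The -1 argument repeats the substitution for further non-overlapping matches; it does not rescan text that has already been matched.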