Hi experts,
the data set I have bunch of post-fixes in the end of the company, example (post-fixes):
(canada) |
(india) |
ab |
ag |
bv |
co |
co ltd |
gmbh |
inc |
india |
limited |
llc |
ltd |
mbh |
nv |
plc |
private limited |
pte ltd |
pty ltd |
pvt ltd |
s.a |
sa |
sl |
spa |
srl |
usa |
corp |
sro |
a/s |
what I want to achieve is, remove those from the comapnies names.
the postfixes and actual comapnies names, their are a single blank within. example:
180 Medical Inc
1ST Bio Therapeutics Inc
1ST Bio Therapeutics INC
1abtik ag
Camurus AB
Korning AG
PlasmaChem GmbH
PlasmaChem GMBH
how do I remove those post-fixes from original data set?
and some post fixes are written upper-case, and some are in lower-case, some are mixed.
Hi @jimmychoi
Here is an attempt to do this. Does that meet your expectations?
data word_list;
input word $20.;
if findc(word,"/") then word=tranwrd(word,"/",".");
datalines;
(canada)
(india)
ab
ag
bv
co
co ltd
gmbh
inc
india
limited
llc
ltd
mbh
nv
plc
private limited
pte ltd
pty ltd
pvt ltd
s.a
sa
sl
spa
srl
usa
corp
sro
a/s
;
run;
data have;
input company $80.;
datalines;
180 Medical Inc
1ST Bio Therapeutics Inc
1ST Bio Therapeutics INC
1abtik ag
Camurus AB
Korning AG
PlasmaChem GmbH
PlasmaChem GMBH
;
run;
proc sql noprint;
select quote(trim(word),"'")
into:word_list separated by ","
from word_list
order by countw(word) desc;
quit;
data want;
set have;
company = tranwrd(company,"/",".");
array _list(&sqlobs) $ _temporary_ (&word_list);
do i=1 to dim(_list);
if find(company,_list(i),"i")>0 then do;
company = prxchange(cats("s/(\(?\b",_list(i),"\b\)?\s*)$//i"),-1,company);
end;
end;
drop i;
run;
Hi @jimmychoi
Here is an attempt to do this. Does that meet your expectations?
data word_list;
input word $20.;
if findc(word,"/") then word=tranwrd(word,"/",".");
datalines;
(canada)
(india)
ab
ag
bv
co
co ltd
gmbh
inc
india
limited
llc
ltd
mbh
nv
plc
private limited
pte ltd
pty ltd
pvt ltd
s.a
sa
sl
spa
srl
usa
corp
sro
a/s
;
run;
data have;
input company $80.;
datalines;
180 Medical Inc
1ST Bio Therapeutics Inc
1ST Bio Therapeutics INC
1abtik ag
Camurus AB
Korning AG
PlasmaChem GmbH
PlasmaChem GMBH
;
run;
proc sql noprint;
select quote(trim(word),"'")
into:word_list separated by ","
from word_list
order by countw(word) desc;
quit;
data want;
set have;
company = tranwrd(company,"/",".");
array _list(&sqlobs) $ _temporary_ (&word_list);
do i=1 to dim(_list);
if find(company,_list(i),"i")>0 then do;
company = prxchange(cats("s/(\(?\b",_list(i),"\b\)?\s*)$//i"),-1,company);
end;
end;
drop i;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.