Hello,
I am attempting to remove certain strings from company names in my data. That is, I would like to remove things like 'LTD', 'CO', 'INC', 'LLC', etc. I'm also looking to get rid of leading and trailing blanks and punctuation. So, I am able to do this for one use case as you see below. But I don't want to keep writing new lines for each tranwrd as I am unable to put a list of strings in tranwrd. What I'd like to do is have some sort of loop that sends a list of strings through the tranwrd function (applying to the same field - company_name). Like define a list of strings as a macro variable and send that through a loop within a macro or something like that.
data MyCompanies; set CompanyListing; Company_Name=tranwrd(strip(compress(CompanyName,,"p")),"LTD",''); run;
Like this?
data MYCOMPANIES;
* set COMPANYLISTING;
COMPANYNAME='Incredible Company Ltd.';
COMPANY_NAME=prxchange('s/(\b(LTD|CO|INC|LLC)\b)//i',1,left(compress(COMPANYNAME,,'p')));
putlog COMPANY_NAME=;
run;
COMPANY_NAME=Incredible Company
Like this?
data MYCOMPANIES;
* set COMPANYLISTING;
COMPANYNAME='Incredible Company Ltd.';
COMPANY_NAME=prxchange('s/(\b(LTD|CO|INC|LLC)\b)//i',1,left(compress(COMPANYNAME,,'p')));
putlog COMPANY_NAME=;
run;
COMPANY_NAME=Incredible Company
@JediApprentice wrote:
What I'd like to do is have some sort of loop that sends a list of strings through the tranwrd function (applying to the same field - company_name). Like define a list of strings as a macro variable and send that through a loop within a macro or something like that.
data MyCompanies; set CompanyListing; Company_Name=tranwrd(strip(compress(CompanyName,,"p")),"LTD",''); run;
You would use an ARRAY, not a macro here. Use a temporary array to store the words you want to remove and then use a loop to remove them. You aren't factoring in variable CaSE though, so you may want to do that as well, eg. ltD, LTD, ltd, Ltd, Limited are all different and TRANWRD will not replace them.
data mycomp;
set listing;
array _remove(*) _temporary_ $ ('LTD', 'INC', 'CO');
do i=1 to dim(_remove);
company_name = tranwrd(strip(compress(companyName, ,"p")), _remove(i), '');
end;
run;
Hi @Reeza , when I try this on my data, I get this error (the structure is the same as yours)
array rmve_strs(*) _temporary_ $ ('LTD','LLC','INC','CORPORATION','COMPANY','LP');
_
22
200
ERROR: The non-variable based array rmve_strs has been defined with zero elements.
ERROR 22-322: Syntax error, expecting one of the following: (, ;.
ERROR 200-322: The symbol is not recognized and will be ignored.
1. It needs a dimension apparently so replace the * with the number of elements
2. Move the $ before the keyword _TEMPORARY_
@JediApprentice wrote:
Hi @Reeza , when I try this on my data, I get this error (the structure is the same as yours)
array rmve_strs(*) _temporary_ $ ('LTD','LLC','INC','CORPORATION','COMPANY','LP'); _ 22 200 ERROR: The non-variable based array rmve_strs has been defined with zero elements. ERROR 22-322: Syntax error, expecting one of the following: (, ;. ERROR 200-322: The symbol is not recognized and will be ignored.
@Reeza Hmm it runs without errors or warnings but still doesn't seem to be working, not removing those strings. The only thing it removes is the punctuation. In the output it shows the value of i as 7 all the way down, like it's stuck on one iteration outside the dimensions of the array. There is an info message in the log, not sure if it's relevant: Character variables have defaulted to a length of 200 at the places given by: (Line):(Column). Truncation can result. That is for company_name.
this is the current code:
data test;
set CompanyListing;
array rmve_strs(6) $ _temporary_ ('LTD','LLC','INC','CORPORATION','COMPANY','LP');
do i=1 to dim(rmve_strs);
company_name = tranwrd(strip(compress(CompanyName, ,"p")), rmve_strs(i), '');
end;
run;
Just like @ChrisNZ said, but taking care of spaces and punctuation. Adjust to your needs:
data MYCOMPANIES;
do COMPANYNAME = 'Big Corp., Local Division', 'Incredible Company Ltd.';
COMPANYNAME=prxchange('s/\s*\b(LTD|CO|INC|LLC|CORP)\b\.?\s*//io', 1, COMPANYNAME);
putlog COMPANYNAME=;
end;
run;
My understanding is that OP wants
I.B.M.
to become
IBM
and
Big Corp., Local Division
to become
Big Local Division
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.