Dear all,
How can I clean the company name in the below table, namely, delete LTD LTD. LIMITED PLC PLC.?
table 1
data company_name;
infile datalines truncover;
input city $100.;
datalines;
ALTERNATIVE ASSET OPPS PCC LTD
ALTADIS SA
ALTANA AG
ALTERA CORPORATION
ALTERIAN LIMITED
ALTERNATIVE INVESTMENTS STRATEGIES LIMITED
ALTERNATIVE LIQUIDITY FUND LTD
DAISY CORPORATE SERVICES TRADING LTD
ALTIN AG
ALTITUDE GROUP PLC
ALTONA ENERGY PLC
ALTRIA GROUP, INC.
ALTUS RESOURCE CAPITAL LTD
ALTUS STRATEGIES PLC
ALTYN PLC
ALUSUISSE GROUP AG
ALUSUISSE GROUP AG
ALVIS PLC
AM N.V.
AMADEUS GLOBAL TRAVEL DISTRIBUTION, S.A.
ABSA GROUP LTD
AMALGAMATED METAL CORPORATION PLC
AMARA MINING PLC
ALT.ASST.OPPS.PCC LTD.
ALTADIS SA
ALTANA AG
ALTERA CORP.
ALTERIAN PLC.
ALT.INV.STGIS.LTD.
ALT.INV.STGIS.LTD.
ALTERNATIVE LQY.LTD.
ALTERNATIVE NETWKS.PLC.
ALTIFUND CAPITAL
ALTIFUND INCOME
ALTIN AG
ALTITUDE GROUP PLC.
ALTONA ENERGY PLC.
ALTRIA GROUP INCO.
ALTUS RESOURCE CAP.LTD.
;
run;
logic:
1splite each observation based on the blank.
2delete the character if 1)that character is the last character of the company name, and 2) equals 'LTD' 'LTD.' 'LIMITED' 'PLC' 'PLC.'.
3merge the rest of them together
for example,
for company_name '
ALTRIA GROUP, INC.
in the first step, it should be split into three variables, which like
ALTRIA | GROUP, | INC. |
in the second step, the 'PLC' which is the last character of the company_name variable should be deleted, like
ALTRIA | GROUP |
in the third step, the rest character should be merged together, like
ALTRIA GROUP, |
could you please give me some suggestion about this.
thanks in advance.
prxchange() is ideal for this:
data company_name;
infile datalines truncover;
input name $100.;
datalines;
ALTERNATIVE ASSET OPPS PCC LTD
ALTADIS SA
ALTANA AG
ALTERA CORPORATION
ALTERIAN LIMITED
ALTERNATIVE INVESTMENTS STRATEGIES LIMITED
ALTERNATIVE LIQUIDITY FUND LTD
DAISY CORPORATE SERVICES TRADING LTD
ALTIN AG
ALTITUDE GROUP PLC
ALTONA ENERGY PLC
ALTRIA GROUP, INC.
ALTUS RESOURCE CAPITAL LTD
ALTUS STRATEGIES PLC
ALTYN PLC
ALUSUISSE GROUP AG
ALUSUISSE GROUP AG
ALVIS PLC
AM N.V.
AMADEUS GLOBAL TRAVEL DISTRIBUTION, S.A.
ABSA GROUP LTD
AMALGAMATED METAL CORPORATION PLC
AMARA MINING PLC
ALT.ASST.OPPS.PCC LTD.
ALTADIS SA
ALTANA AG
ALTERA CORP.
ALTERIAN PLC.
ALT.INV.STGIS.LTD.
ALT.INV.STGIS.LTD.
ALTERNATIVE LQY.LTD.
ALTERNATIVE NETWKS.PLC.
ALTIFUND CAPITAL
ALTIFUND INCOME
ALTIN AG
ALTITUDE GROUP PLC.
ALTONA ENERGY PLC.
ALTRIA GROUP INCO.
ALTUS RESOURCE CAP.LTD.
;
data revised_company_name;
set company_name;
new_name = prxChange("s/,?\s*(incorporated|inco|inc|corporation|corp|limited|ltd|plc|ag|s\.a|sa)\.?\s*$//io",1,name);
run;
proc print; run;
see if this works plz
data company_name;
infile datalines truncover;
input city $100.;
datalines;
ALTERNATIVE ASSET OPPS PCC LTD
ALTADIS SA
ALTANA AG
ALTERA CORPORATION
ALTERIAN LIMITED
ALTERNATIVE INVESTMENTS STRATEGIES LIMITED
ALTERNATIVE LIQUIDITY FUND LTD
DAISY CORPORATE SERVICES TRADING LTD
ALTIN AG
ALTITUDE GROUP PLC
ALTONA ENERGY PLC
ALTRIA GROUP, INC.
ALTUS RESOURCE CAPITAL LTD
ALTUS STRATEGIES PLC
ALTYN PLC
ALUSUISSE GROUP AG
ALUSUISSE GROUP AG
ALVIS PLC
AM N.V.
AMADEUS GLOBAL TRAVEL DISTRIBUTION, S.A.
ABSA GROUP LTD
AMALGAMATED METAL CORPORATION PLC
AMARA MINING PLC
ALT.ASST.OPPS.PCC LTD.
ALTADIS SA
ALTANA AG
ALTERA CORP.
ALTERIAN PLC.
ALT.INV.STGIS.LTD.
ALT.INV.STGIS.LTD.
ALTERNATIVE LQY.LTD.
ALTERNATIVE NETWKS.PLC.
ALTIFUND CAPITAL
ALTIFUND INCOME
ALTIN AG
ALTITUDE GROUP PLC.
ALTONA ENERGY PLC.
ALTRIA GROUP INCO.
ALTUS RESOURCE CAP.LTD.
;
run;
data want;
set company_name;
call scan(city, -1, position, length,' ');
if scan(city,-1,' ') in ('LTD' ,'LTD.' ,'LIMITED' ,'PLC', 'PLC.') then substr(city,position)=' ';
drop position length;
run;
or just without blank ' ' charlist specification to handle
ALT.INV.STGIS.LTD.
ALT.INV.STGIS.LTD.
where delim is not blank
scan with default charlist as opposed to previous post of mine that did with ' '
data want;
set company_name;
call scan(city, -1, position, length);
if scan(city,-1) in ('LTD' ,'LTD.' ,'LIMITED' ,'PLC', 'PLC.') then substr(city,position)=' ';
drop position length;
run;
prxchange() is ideal for this:
data company_name;
infile datalines truncover;
input name $100.;
datalines;
ALTERNATIVE ASSET OPPS PCC LTD
ALTADIS SA
ALTANA AG
ALTERA CORPORATION
ALTERIAN LIMITED
ALTERNATIVE INVESTMENTS STRATEGIES LIMITED
ALTERNATIVE LIQUIDITY FUND LTD
DAISY CORPORATE SERVICES TRADING LTD
ALTIN AG
ALTITUDE GROUP PLC
ALTONA ENERGY PLC
ALTRIA GROUP, INC.
ALTUS RESOURCE CAPITAL LTD
ALTUS STRATEGIES PLC
ALTYN PLC
ALUSUISSE GROUP AG
ALUSUISSE GROUP AG
ALVIS PLC
AM N.V.
AMADEUS GLOBAL TRAVEL DISTRIBUTION, S.A.
ABSA GROUP LTD
AMALGAMATED METAL CORPORATION PLC
AMARA MINING PLC
ALT.ASST.OPPS.PCC LTD.
ALTADIS SA
ALTANA AG
ALTERA CORP.
ALTERIAN PLC.
ALT.INV.STGIS.LTD.
ALT.INV.STGIS.LTD.
ALTERNATIVE LQY.LTD.
ALTERNATIVE NETWKS.PLC.
ALTIFUND CAPITAL
ALTIFUND INCOME
ALTIN AG
ALTITUDE GROUP PLC.
ALTONA ENERGY PLC.
ALTRIA GROUP INCO.
ALTUS RESOURCE CAP.LTD.
;
data revised_company_name;
set company_name;
new_name = prxChange("s/,?\s*(incorporated|inco|inc|corporation|corp|limited|ltd|plc|ag|s\.a|sa)\.?\s*$//io",1,name);
run;
proc print; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.