Alexxxxxxx
Pyrite | Level 9

Hello,

 

How can I exclude the 'place_name' from a company name?

For example, given the following company names:

NAME
454 LIFE SCIENCES, A ROCHE COMPANY
A & A MATERIAL CORPORATION
A A HISTROM CORPORATION
A & A MANUFACTURING COMPANY
A. AHLSTROM / A FINNISH CORPORATION
A. STUCKI COMPANY, A DELAWARE CORPORATION
AB VOLVO, A SWEDISH BODY CORPORATE
ACCESS DATA CORPORATION A BROADRIDGE COMPANY
AMERITECH PLASTICS INCORPORATED (A DELAWARE CORPORATION)
ANEMOSTAT PRODUCTS DIVISION, DYNAMICS CORP. OF AMERCA, A NEW YORK CORPORATION
ANEST IWATA CORPORATION (A JAPANESE CORPORATION)
ANHUI HE AN INFORMATION TECHNOLOGY COMPANY
ANHUI HUAI AN CHEMICAL GROUP COMPANY
ANHUI JIN AN KANG BIOTECHNOLOGY COMPANY

I would like to:

1. change phrases like 'A JAPANESE CORPORATION' to 'CORP', and

2. delete phrases like '(A JAPANESE CORPORATION)' entirely, because they are enclosed in '( )'.

However, some of the similar-looking phrases in the example do not contain a place name (for example, 'A ROCHE COMPANY' and 'A MATERIAL CORPORATION'), so I tried the following code:

data want;
set have;	
NAME=prxchange('s/(\.|\s)AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\sCOMPANY$/ CO/',-1,cat(strip(compbl(NAME))));
NAME=prxchange('s/\(AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\sCOMPANY\)/ /',-1,cat(strip(compbl(NAME))));


NAME=prxchange('s/(\.|\s|\()AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\s(BODY\s)?CORPORAT(ION|E)\)?/ CORP/',-1,cat(strip(compbl(NAME))));
NAME=prxchange('s/\(AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\s(BODY\s)?CORPORAT(ION|E)\)/ /',-1,cat(strip(compbl(NAME))));

NAME=prxchange('s/(\.|\s|\()AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\s(BODY\s)?LIMITED\sCOMPANY$/ CO/',-1,cat(strip(compbl(NAME))));
NAME=prxchange('s/\(AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\s(BODY\s)?LIMITED\sCOMPANY\)/ /',-1,cat(strip(compbl(NAME))));

NAME=prxchange('s/(\.|\s|\()AN?\s(BELGIAN|BRITISH|BVI|CA(LIFORNIA)?|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|HONG\sKONG|GERMAN|IRISH|ISRAEL|JAPAN(ESE)?|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sHAMPSHIRE|NEW\sYORK|NEVADA|NJ|OREGON|PENNSYLVANIA|RHODE\sISLANDS?|RICHMOND|SINGAPORE|SPANISH|SWEDISH|SWISS|TEXAS|UTAH|VIRGINIA|WASHINGTON)\s(BODY\s)?LIMITED\sLIABILITY\sCOMPANY$/ LLC/',-1,cat(strip(compbl(NAME))));
NAME=prxchange('s/\(AN?\s(BELGIAN|BRITISH|BVI|CALIFORNIA|CAYMAN\sISLAND|DELAWARE|FINNISH|FRENCH|GERMAN|JAPANESE|KENTUCKY|LOUISIANA|MACHINERYNJ|NEW\sYORK|RHODE\sISLANDS?|SINGAPORE|SWEDISH|SWISS|UTAH)\s(BODY\s)?LIMITED\sLIABILITY\sCOMPANY\)/ /',-1,cat(strip(compbl(NAME))));

run;

However, the code takes too long to run.

I have a large amount of data and expect to handle many types of company suffix (such as 'COMPANY', 'LIMITED COMPANY', 'CORPORATION', 'CORPORATE', 'COOPERATIVE', 'LIMITED LIABILITY COMPANY').

I expect to get a result like this:

NAME | New_Name | changed
454 LIFE SCIENCES, A ROCHE COMPANY | 454 LIFE SCIENCES, A ROCHE COMPANY |
A & A MATERIAL CORPORATION | A & A MATERIAL CORPORATION |
A A HISTROM CORPORATION | A A HISTROM CORPORATION |
A & A MANUFACTURING COMPANY | A & A MANUFACTURING COMPANY |
A. AHLSTROM / A FINNISH CORPORATION | A. AHLSTROM CORP | 1
A. STUCKI COMPANY, A DELAWARE CORPORATION | A. STUCKI COMPANY CORP | 1
AB VOLVO, A SWEDISH BODY CORPORATE | AB VOLVO CORP | 1
ACCESS DATA CORPORATION A BROADRIDGE COMPANY | ACCESS DATA CORPORATION A BROADRIDGE COMPANY |
AMERITECH PLASTICS INCORPORATED (A DELAWARE CORPORATION) | AMERITECH PLASTICS INCORPORATED |
ANEMOSTAT PRODUCTS DIVISION, DYNAMICS CORP. OF AMERCA, A NEW YORK CORPORATION | ANEMOSTAT PRODUCTS DIVISION, DYNAMICS CORP. OF AMERCA CORP | 1
ANEST IWATA CORPORATION (A JAPANESE CORPORATION) | ANEST IWATA CORPORATION | 1
ANHUI HE AN INFORMATION TECHNOLOGY COMPANY | ANHUI HE AN INFORMATION TECHNOLOGY COMPANY |
ANHUI HUAI AN CHEMICAL GROUP COMPANY | ANHUI HUAI AN CHEMICAL GROUP COMPANY |
ANHUI JIN AN KANG BIOTECHNOLOGY COMPANY | ANHUI JIN AN KANG BIOTECHNOLOGY COMPANY |

Could you please give me some suggestions?

Is there any way to simplify the code?

Thanks in advance.

1 REPLY
ChrisNZ
Tourmaline | Level 20

1. Why do you use the CAT function?

2. Call the COMPBL function once, up front, rather than inside every PRXCHANGE call.

3. Regular expressions are very powerful but expensive to run, so some slowness is expected.

4. Function INDEX is very fast. Test the string before calling the PRXCHANGE function:
  if index(NAME,'COMPANY') then NAME=prxchange(...

5. Try adding an o after the last slash of the regular expression so that it is compiled only once.

6. Can you use 1 instead of -1 as the number of replacements?
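
The points above can be combined into something like the sketch below. It is untested against your real data and only covers the CORPORATION/CORPORATE case. The place list is shortened, and the macro variable `places` and the `rx_` variables are names made up for this illustration. HAVE, NAME, New_Name and changed follow the question. Compiling the patterns with PRXPARSE on the first row is used here in place of the o suffix; both aim at compiling once.

```sas
/* Small sample taken from the question */
data have;
  infile datalines truncover;
  input NAME $char100.;
  datalines;
454 LIFE SCIENCES, A ROCHE COMPANY
A. AHLSTROM / A FINNISH CORPORATION
AB VOLVO, A SWEDISH BODY CORPORATE
ANEST IWATA CORPORATION (A JAPANESE CORPORATION)
ANHUI HUAI AN CHEMICAL GROUP COMPANY
;
run;

/* Assumption: keep the place list in one place (shortened here) */
/* so that it is not repeated in every pattern                   */
%let places = BELGIAN|BRITISH|DELAWARE|FINNISH|JAPAN(?:ESE)?|NEW\sYORK|SWEDISH;

data want;
  set have;
  length New_Name $100;
  retain rx_paren rx_corp;

  /* Compile each pattern once, on the first row only */
  if _n_ = 1 then do;
    /* '(A <place> CORPORATION)' in parentheses: delete it */
    rx_paren = prxparse("s/\s*\(AN?\s(?:&places)\s(?:BODY\s)?CORPORAT(?:ION|E)\)//");
    /* trailing ', A <place> CORPORATION': replace with ' CORP' */
    rx_corp  = prxparse("s/[\s,\/]*\sAN?\s(?:&places)\s(?:BODY\s)?CORPORAT(?:ION|E)$/ CORP/");
  end;

  /* Clean up blanks once */
  New_Name = compbl(strip(NAME));

  /* Cheap INDEX test before calling the regular expressions */
  if index(New_Name, 'CORPORAT') then do;
    New_Name = prxchange(rx_paren, 1, strip(New_Name));
    New_Name = prxchange(rx_corp,  1, strip(New_Name));
  end;

  /* 1 if the name was modified, 0 otherwise */
  changed = (New_Name ne compbl(strip(NAME)));

  drop rx_:;
run;
```

STRIP is applied to the source string so that the `$` anchor is not defeated by trailing blanks. The other suffixes (COMPANY, LIMITED COMPANY, LIMITED LIABILITY COMPANY) could be added as further pattern pairs reusing the same `&places` list, each guarded by its own INDEX test.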
