Dear all,
When I use the following codes
data HAVE;
input NAME :& $500.;
if find(NAME,' CO ') then do;
if prxmatch('/(.*) CO\s?[\(.*\)|\[.*\]|\{.*\}|''.*''|".*"](.*)/',NAME ) then do;
NAME_B=prxchange('s/(.*) CO\s?([(\(.*\))|(\[.*\])|(\{.*\})|(''.*'')|(".*")]+)(.*)/\1 CO \2/',1,NAME );
NAME_address=prxchange('s/(.*) CO\s?([(\(.*\))|(\[.*\])|(\{.*\})|(''.*'')|(".*")]+)(.*)/\3/',1,NAME );
end;
else do;
NAME_B=substr(NAME,1,find(NAME,' CO ')+2);
NAME_address=strip(substr(NAME,find(NAME,' CO ')+4, length(NAME)));
end;
end;
cards;
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY)
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY'
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY"
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' A UK CO
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" A UK LTD
3M CO (1986) A LTD IN UK
3M CO 'US' A UK CO
3M CO "UK" A UK LTD
3M CO A UK LTD
run;
I got
NAME | NAME_B | NAME_address |
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) | 3M CO ( | MINNESOTA MINING AND MANUFACTURING COMPANY) |
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' | 3M CO ' | MINNESOTA MINING AND MANUFACTURING COMPANY' |
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" | 3M CO " | MINNESOTA MINING AND MANUFACTURING COMPANY" |
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK | 3M CO ( | MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK |
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' A UK CO | 3M CO ' | MINNESOTA MINING AND MANUFACTURING COMPANY' A UK CO |
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" A UK LTD | 3M CO " | MINNESOTA MINING AND MANUFACTURING COMPANY" A UK LTD |
3M CO (1986) A LTD IN UK | 3M CO ( | 1986) A LTD IN UK |
3M CO 'US' A UK CO | 3M CO ' | US' A UK CO |
3M CO "UK" A UK LTD | 3M CO " | UK" A UK LTD |
3M CO A UK LTD | 3M CO | A UK LTD |
However, I expect to get
NAME | NAME_B | NAME_address |
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) | 3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) | |
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' | 3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' | |
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" | 3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" | |
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK | 3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) | A LTD IN UK |
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' A UK CO | 3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' | A UK CO |
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" A UK LTD | 3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" | A UK LTD |
3M CO (1986) A LTD IN UK | 3M CO (1986) | A LTD IN UK |
3M CO 'US' A UK CO | 3M CO 'US' | A UK CO |
3M CO "UK" A UK LTD | 3M CO "UK" | A UK LTD |
3M CO A UK LTD | 3M CO | A UK LTD |
I expect to 1.separate the name variable into two different variables (which are NAME_B, NAME_address) based on the company suffix 'CO' and 2. the string among (), [], {}, ' ', " " should be put in the NAME_B variable.
Could you please give me some suggestions about this?
thanks in advance.
See if this helps.
data HAVE(drop= start LENGTH prx_id);
input NAME :& $500.;
Length NAME_B $500. NAME_address $100.;
if _n_=1 then prx_id=prxparse("/A\s[UL][KT][\sD][\w\s][\w]+\s?(UK)*/");
retain prx_id;
call prxsubstr(prx_id,NAME,START,LENGTH);
if START > 0 then do;
NAME_B=substrn(NAME,1,start-1);
NAME_address=substrn(NAME,start,LENGTH);
end;
else do;
NAME_B=Name;
NAME_address='';
end;
cards;
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY)
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY'
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY"
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK
3M CO 'MINNESOTA MINING AND MANUFACTURING COMPANY' A UK CO
3M CO "MINNESOTA MINING AND MANUFACTURING COMPANY" A UK LTD
3M CO (1986) A LTD IN UK
3M CO 'US' A UK CO
3M CO "UK" A UK LTD
3M CO A UK LTD
run;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: