I have a column of names. How do I remove everything in parentheses, including the parentheses themselves, and all punctuation (e.g., . , / and the like)?
Here's the file with observation number and name.
1 ACCU TEST SYSTEMS INC
2 ADDVANTAGE TECHNOLOGIES GROUP INC (AEY)
3 AIRXPANDERS INC
4 ALFI, INC. (ALF, ALFIW)
5 AMERICAN PAIN AND STRESS INC
6 AMERICAN TELEDATA CORP.
7 BIONOVO INC
8 BRENCO INTERNATIONAL INC.
9 CANOO INC. (GOEV, GOEVW)
10 CAREER EDUCATION CORP (PRDO)
11 CODI CORP
12 COMPUTER STORE INC (THE )
13 CPI LTD (COMMUNICATIONS & POWER INDUSTRIES)
14 CYREN LTD. (CYRN)
15 DIPLOMAT ELECTRONICS CORP
16 DIXIE EQUIPMENT COMPANY, INC.
17 DUTCH GOLD RESOURCES INC
18 ELECTRIC LAST MILE SOLUTIONS, INC. (ELMS, ELMSW)
19 ELECTRONIC CIGARETTES INTERNATIONAL GROUP, LTD.
20 ELECTRONIC GAME CARD INC
21 EMPIRE POST MEDIA, INC. (EMPM)
22 ENERGROUP TECHNOLOGIES
23 GLOBAL ENVIRONMENTAL LABORATORIES, INC.
24 GLOBE PHOTOS, INC. (GBPT)
25 GREENROSE HOLDING CO INC.
26 HURRICANE CREEK PARTNERS LTD.
27 INTEGRATED NETWORK CORP.
28 INTERNATIONAL HERITAGE, INC.
29 IVDS INTERACTIVE ACQUISITION PARTNERS
30 JACKS INC
31 LEXAGENE HOLDINGS INC. (LXXGF)
32 LIQOUR BARN NORTHERN CALIF. INC
33 LOGOS SCIENTIFIC, INC.
34 LUXEYARD, INC.
35 MAD CATZ INTERACTIVE INC
36 MARK BENSKIN & CO. INC
37 MEDIA CENTRAL, INC.
38 META MATERIALS INC. (MMAT)
39 MOMENTOUS ENTERTAINMENT GROUP INC
40 NDIVISION INC. (NDVN)
41 NEUROLOGIX INC/DE
42 PROTECH, INC. D.B.A. OMEGA TEST SYSTEMS
43 REDBOX ENTERTAINMENT INC. (RDBX, RDBXW)
44 REGENT ENTERPRISE INC.
45 RODAC CORP
46 RPC INC (RES)
47 SHAPEWAYS HOLDINGS, INC. (SHPWQ)
48 STEALTH TECHNOLOGIES, INC. (STTH, STTHD)
49 SULPHCO INC
50 SUNWORKS, INC. (SUNW)
51 TOYS PLUS INC
52 TRANSBIOTEC, INC. (IMLE, SOBR, IMLED)
53 TRANSWITCH CORP /DE
54 UNIQUE FABRICATING, INC. (UFAB)
55 VISION COMMUNICATIONS CORP
56 VSC INC (VULCAN STEEL COMPANY)
57 WEBER INC. (WEBR)
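From another post, it looks like this would work for the parenthesized part (it leaves the other punctuation in place):
data want;
set have;
/* Delete every "(...)" group; -1 means replace all matches */
want = prxchange('s/\(([^\)]+)\)//i', -1, yournames);
drop yournames;
run;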
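To cover the punctuation half of the question as well, here is a minimal sketch that chains three steps: the regex removes the parenthesized groups, COMPRESS with the 'p' modifier removes punctuation, and COMPBL collapses the leftover blanks. The names `have` and `yournames` are taken from the answer above; the three sample rows are copied from the question. The COMPRESS and COMPBL steps are an addition, not part of the quoted post.

```sas
/* Sample rows copied from the question */
data have;
  informat yournames $80.;
  input obs yournames &;
  datalines;
4 ALFI, INC. (ALF, ALFIW)
12 COMPUTER STORE INC (THE )
41 NEUROLOGIX INC/DE
;
run;

data want;
  set have;
  length want $80;
  /* Step 1: delete every "(...)" group, including the parentheses */
  want = prxchange('s/\([^)]*\)//', -1, yournames);
  /* Step 2: the 'p' modifier adds punctuation marks to the removal list */
  want = compress(want, , 'p');
  /* Step 3: trim the ends and collapse runs of blanks to one blank */
  want = compbl(strip(want));
  drop yournames;
run;

proc print data=want; run;
```

With this approach "ALFI, INC. (ALF, ALFIW)" becomes "ALFI INC". Note that deleting a slash joins its neighbors, so "INC/DE" becomes "INCDE"; if the words should stay separate, replace punctuation with a blank instead of deleting it.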
So row 21 reads
EMPIRE POST MEDIA, INC. (EMPM)
and you want
EMPIRE POST MEDIA INC
Am I understanding correctly?
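I always do this crazy thing with nested compress calls, but someone will have a much slicker way:
data test;
nm='some / company name (ABC)';
/* Keep only the text before the first parenthesis */
nmsub=strip(scan(nm,1,'()'));
/* Inner compress leaves just the characters that are not letters, digits, or blanks; outer compress removes them */
nmsub=compress(nmsub,compress(lowcase(nmsub),'abcdefghijklmnopqrstuvwxyz0123456789 '));
run;
proc print data=test; run;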
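One possibly slicker variant of the same idea is to give COMPRESS a keep list instead of building the removal list from the data. This is only a sketch: it reuses the `test` example above, and the `length` statement and the COMPBL call are additions for illustration.

```sas
data test;
  nm = 'some / company name (ABC)';
  length nmsub $40;
  /* Keep only the text before the first parenthesis, as above */
  nmsub = strip(scan(nm, 1, '()'));
  /* 'k' keeps the listed characters instead of removing them;     */
  /* 'a' adds letters and 'd' adds digits to the list, and the     */
  /* second argument adds the blank                                */
  nmsub = compress(nmsub, ' ', 'kad');
  /* Collapse the double blank left where the slash was */
  nmsub = compbl(nmsub);
run;

proc print data=test; run;
```

Here nmsub ends up as "some company name". Like the original, the SCAN step keeps only what precedes the first parenthesis, so any text after a parenthesized group is dropped.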
Here is another way:
data raw;
informat text $80.;
input text &;
cards;
abc(123)
(123)abc(pqr)
;
data want;
set raw;
/* .*? is a lazy match, so each "(...)" group is removed separately */
want = prxchange('s/\(.*?\)//i', -1, text);
run;