wlierman
Lapis Lazuli | Level 10

I have a question about how to remove only the apostrophes from proper names on a row-by-row basis. There may be more than one apostrophe to remove in a single value.

 

 

For example, I have a "place of employment" field whose entries are names of businesses. The same business name is not entered consistently, so I see entries such as:

 

FREDS PHARMACY

FRED'S PHARMACY

OREGON CITY FRED MEYER

OREGON CITY'S FRED MEYER

OREGON CITY FRED MEYER'S

JIMS GLADSTONE SUBARU CARS AND TRUCKS

JIM'S GLADSTONE SUBARU CAR'S AND TRUCK'S

 

and so on.

 

What I want is the apostrophes removed, so that the business names above look like:

 

FREDS PHARMACY

FREDS PHARMACY

OREGON CITY FRED MEYER

OREGON CITYS FRED MEYER

OREGON CITY FRED MEYERS

JIMS GLADSTONE SUBARU CARS AND TRUCKS

JIMS GLADSTONE SUBARU CARS AND TRUCKS

 

I don't care about duplicates since each row is a unique contact.

 

I have slightly over 30,000 non-duplicated rows to check and then remove any apostrophes. 

 

Thank you for your help.

 

 

6 REPLIES
Reeza
Super User
How do you distinguish an apostrophe in a proper name from any other apostrophe?

You can use COMPRESS() to remove all apostrophes from a value, but removing them only from proper names isn't an easy task.
ballardw
Super User

The COMPRESS function removes the characters you list in its second argument from a string.

 

data example;
   x="JIM'S GLADSTONE SUBARU CAR'S AND TRUCK'S";
   y=compress(x,"'");
run;

I put double quotes around the apostrophe in the COMPRESS call for legibility, and I create a new variable so you can compare the results with the original.
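To confirm that no apostrophes are left, one option is to count them before and after with COUNTC. This is only a minimal sketch: the data set name `check` and the variables `before` and `after` are made up for illustration.

```sas
data check;
   length x $60;
   x = "JIM'S GLADSTONE SUBARU CAR'S AND TRUCK'S";
   y = compress(x, "'");        /* remove every apostrophe */
   before = countc(x, "'");     /* apostrophes in the original value */
   after  = countc(y, "'");     /* apostrophes left after COMPRESS */
   put x= / y= / before= after=;  /* write the values to the log */
run;
```

For this value the log should show before=3 and after=0.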

wlierman
Lapis Lazuli | Level 10

I applied the compress code.

 

I received the following result in the log

7121  Data SASCDC_2.Arias_NAICS_Classify_H;
7122     *Retain Contact_Person_ID Place_of_Employment NAICS_Sector Sector_Type Type_Firm
7122! Other_Categories;
7123     P_O_E = COMPRESS(Place_of_employment, "'");
7124  Set SASCDC_2.Arias_NAICS_Classify_H;
ERROR: Variable Place_of_employment has been defined as both character and numeric.
7125  run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      7123:21
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set SASCDC_2.ARIAS_NAICS_CLASSIFY_H may be incomplete.  When this step was
         stopped there were 0 observations and 8 variables.
WARNING: Data set SASCDC_2.ARIAS_NAICS_CLASSIFY_H was not replaced because this step was
         stopped.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Place_of_employment is a character variable. Does the COMPRESS function convert it to numeric?

 

 

Reeza
Super User
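Move the assignment to after the SET statement. Before the SET statement the variable doesn't exist yet, so SAS creates Place_of_employment as numeric when it first sees it inside COMPRESS (hence the numeric-to-character conversion note at 7123:21). The SET statement then brings in a character variable with the same name, which triggers the error.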
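A small self-contained reproduction may make the ordering issue easier to see. The data sets `work.demo`, `work.wrong` and `work.right` are hypothetical, with a single made-up row. It is probably best to submit the steps one at a time, because an error in one step can affect how later steps run in a batch session.

```sas
/* Hypothetical one-row input with a character variable */
data work.demo;
  length Place_of_employment $40;
  Place_of_employment = "FRED'S PHARMACY";
run;

/* Wrong order: the variable is first seen in the assignment, so SAS
   defines it as numeric; SET then supplies it as character, which
   gives "defined as both character and numeric" */
data work.wrong;
  P_O_E = compress(Place_of_employment, "'");
  set work.demo;
run;

/* Right order: SET defines the variable as character first */
data work.right;
  set work.demo;
  P_O_E = compress(Place_of_employment, "'");
run;
```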
Tom
Super User (accepted solution)

You put the cart before the horse.

data SASCDC_2.Arias_NAICS_Classify_H;
  set SASCDC_2.Arias_NAICS_Classify_H;
  P_O_E = COMPRESS(Place_of_employment, "'");
run;
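If you would rather not overwrite the source table while testing, a variant of the step above writes to a separate data set and lists the rows that changed. This is a sketch only: the output name `work.poe_clean` is hypothetical, and the spot check assumes you just want to eyeball the affected rows.

```sas
data work.poe_clean;                       /* hypothetical output data set */
  set SASCDC_2.Arias_NAICS_Classify_H;     /* SET first, so the variable is read as character */
  P_O_E = compress(Place_of_employment, "'");
run;

/* Spot check: print only the rows where an apostrophe was removed */
proc print data=work.poe_clean;
  where Place_of_employment ne P_O_E;
  var Place_of_employment P_O_E;
run;
```

Once the results look right, the same assignment can be run against the original data set as in the step above.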

 

wlierman
Lapis Lazuli | Level 10

Thank you for the help.

I tend to put the cart before the horse quite often!

 

 
