Replacing whole string IF that string contains a certain sub string.

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Replacing whole string IF that string contains a certain sub string.

I have a string variable called degree that contains student's majors. My problem is that the majors were written in so they are not uniform.

 

For example:  for history majors, some say "history" and some say "mod-hist". Some say "stat" while some say "statistics". Some are capitalized, some are not.

 

I need to make all of the majors have a uniform label. I want all of the history majors to have "history" as their major. All of the stats majors to have "statistics" as their major. (etc).

 

What is the most simple way to do this? 

 

I was trying to find somewhere to search for a substring such as "hist" and then replace that WHOLE string value (not just the substring) with "history" but I haven't found a way to do that. 


Accepted Solutions
Solution
‎10-12-2017 03:53 PM
PROC Star
Posts: 1,800

Re: Replacing whole string IF that string contains a certain sub string.

Posted in reply to UCFtigers2017

Like this?


if find(DEGREE, 'hist', 'i') then DEGREE='History';

View solution in original post


All Replies
Solution
‎10-12-2017 03:53 PM
PROC Star
Posts: 1,800

Re: Replacing whole string IF that string contains a certain sub string.

Posted in reply to UCFtigers2017

Like this?


if find(DEGREE, 'hist', 'i') then DEGREE='History';
Super User
Posts: 20,236

Re: Replacing whole string IF that string contains a certain sub string.

Posted in reply to UCFtigers2017

 

I would start by making everything Proper or Upper Case. 

Then remove all the obvious values that are already ok. 

Then make your rules and recode things. It may mean a lot of IF/THEN statements. 

 

Look at the character functions in the documentation (search SAS functions by category) to see useful functions such as COMPRESS, COMBL, FIND and INDEX. 

 

Respected Advisor
Posts: 4,186

Re: Replacing whole string IF that string contains a certain sub string.

Posted in reply to UCFtigers2017

@UCFtigers2017

I would be doing something similar to what @Reeza suggests:

1. compbl() and propcase() your source variable

2. create a table with distinct values

3. use these values to create a format (i.e. export the table to Excel, add your standardized labels, re-import the Excel and create a format out of it)

4. apply the format to the original values (with compl() and propcase() applied first) to standardize the strings.

Super User
Posts: 11,578

Re: Replacing whole string IF that string contains a certain sub string.

Posted in reply to UCFtigers2017

You may have to very carefully consider the order of your processing and the actual desired target(s).

ExampleS: Art History,Secondary Education- History emphasis

Is Art History an Art degree or History or possibly Humanities? If it is Art then you would have to search and replace 'Art' before your "hist" search AND the replacement value could not contain the word History. Otherwise your other search would change it from Art to History.

And perhaps you have some majors that don't actually include the 'key' value.

 

Is Business Statistics a Business degree or a Statistics degree (or program or major or concentration or ??).

 

I might take a different approach then "if / then /else" by creating a list of all of the unique values and creating a format or even an Informat, to allow use of Just and Upcase options for comparison,

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 91 views
  • 0 likes
  • 5 in conversation