Help using Base SAS procedures

Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Contributor
Posts: 71

I have a dataset which contains multiple character responses. Below are four examples covering two responses: within each pair the interpretation is the same, but because of differences in capitalization and punctuation, each variant would have to be re-coded separately. Multiple questions have the same issue, and I am trying to find ways to save time and also have code that makes all the data uniform.

I wanted to clean up the data prior to doing any re-coding by making all the responses uppercase and deleting any punctuation. Is there a SAS function that can do this?

Example of differences in format of responses, but interpretation is the same

I agree.

I Agree

I don't know

I Don't Know.

I appreciate any help you can give. Thank you.


Accepted Solutions
07-30-2012 09:31 AM
Respected Advisor
Posts: 3,156

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Hi, this may get you started:

data have;
  input var $20.;
  /* 'kda' keeps only digits and letters; upcase() then standardizes the case */
  new_var=upcase(compress(var,,'kda'));
  put new_var=;
  cards;
I agree.
I Agree
I don't know
I Don't Know.
;
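
Since the question mentions several variables with the same problem, here is a minimal sketch of one way to apply the same cleanup to every character variable at once. The dataset names (have, want), the variable names (q1, q2), and the array name (responses) are hypothetical, and the extra 's' modifier is my own assumption that you would want to keep the spaces between words.

```sas
/* Hypothetical sample data: two question variables, '|' as the delimiter */
data have;
    infile datalines dlm='|' truncover;
    input q1 :$20. q2 :$20.;
    datalines;
I agree.|I Don't Know.
I Agree|I don't know
;

data want;
    set have;
    /* _character_ picks up every character variable defined so far (q1, q2) */
    array responses {*} _character_;
    do i = 1 to dim(responses);
        /* k = keep, d = digits, a = letters, s = space characters;
           everything else (punctuation) is dropped, then the text is uppercased */
        responses{i} = upcase(compress(responses{i}, , 'kdas'));
    end;
    drop i;
run;
```

Note that this overwrites the original values. If you need to keep them, a second array of new variables would be one alternative.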

Haikuo



All Replies

Contributor
Posts: 71

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Hi Hai.kuo,

Thank you for your answer. It was very helpful. I was curious: does 'kda' refer to letters in the responses 'I agree' and 'I don't know'? Thanks again!

Respected Advisor
Posts: 3,156

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Hi,

'kda' tells compress() to Keep only Digits (numbers) and Alphabetic characters (letters) and to get rid of anything else, including spaces and punctuation.
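
To make the modifiers concrete, here is a small sketch that runs two modifier combinations side by side on the sample responses. The dataset name (demo) and the variable names (letters_digits, keep_spaces) are made up for illustration; the 's' modifier, which adds space characters to the kept list, is an optional variation and was not part of the original suggestion.

```sas
data demo;
    length var letters_digits keep_spaces $20;
    infile datalines truncover;
    input var $20.;
    /* k = keep the listed characters instead of removing them,
       d = add digits to the list, a = add alphabetic characters */
    letters_digits = upcase(compress(var, , 'kda'));
    /* adding s also keeps space characters, so the words stay separated */
    keep_spaces = upcase(compress(var, , 'kdas'));
    put letters_digits= keep_spaces=;
    datalines;
I agree.
I Agree
I don't know
I Don't Know.
;
```

With 'kda' the two spellings of each answer should collapse to IAGREE and IDONTKNOW; with 'kdas' they should collapse to I AGREE and I DONT KNOW. Either form gives one uniform value per response to re-code.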

Haikuo

PROC Star
Posts: 1,167

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Check out DataFlux. It is a SAS product specifically designed to improve data quality by doing things like standardizing responses.

Tom

Contributor
Posts: 71

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

Hi Tom,

Is this a product that costs extra on top of the Base SAS license? Thanks.

PROC Star
Posts: 1,167

Re: Re-coding contents of variables to be in all lowercase/uppercase and delete punctuation

I'm afraid it is, and I believe it's fairly expensive. Really, it's only an option if your organization already has it, or alternatively, if your organization does enough data cleansing to make it worth licensing for the whole organization.

For your one requirement, it would be overkill.

Best,

Tom
