HyunJee
Fluorite | Level 6

I have a dataset that contains multiple character responses. Below are four examples that come down to two responses as far as interpretation goes, but because of differences in capitalization and punctuation, each variant would have to be recoded separately. Multiple questions have the same issue, and I am looking for a way to save time and have code that makes all the data uniform.

I want to clean up the data before doing any recoding by making all the responses uppercase and deleting any punctuation. Is there a SAS function that can do this?

Examples of responses that differ in format but have the same interpretation:

I agree.

I Agree

I don't know

I Don't Know.

I appreciate any help you can give. Thank you.

1 ACCEPTED SOLUTION

Haikuo
Onyx | Level 15

Hi, this may get you started:

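data have;
  input var $20.;
  /* 'k' keeps (rather than removes) the listed classes: 'd' = digits, 'a' = letters. */
  /* Punctuation and blanks are dropped, then upcase() converts the rest to uppercase. */
  new_var=upcase(compress(var,,'kda'));
  put new_var=;
cards;
I agree.
I Agree
I don't know
I Don't Know.
;
run;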

Haikuo
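Since the question mentions several questions with the same issue, the same expression could be applied to a group of variables with an array. This is only a sketch: the data set names survey and survey_clean and the variables q1 and q2 are hypothetical stand-ins, not names from the original post.

```sas
/* Hypothetical sample data: two character response variables, '|' delimited */
data survey;
  infile datalines dlm='|' truncover;
  input q1 :$20. q2 :$20.;
  datalines;
I agree.|I don't know
I Agree|I Don't Know.
;
run;

/* Clean every response variable in place with one loop */
data survey_clean;
  set survey;
  array resp{*} $ q1-q2;       /* list all response variables here */
  do i = 1 to dim(resp);
    /* keep only digits and letters, then convert to uppercase */
    resp{i} = upcase(compress(resp{i}, , 'kda'));
  end;
  drop i;
run;
```

After this step, both rows should hold the same values, so a single recode rule per response is enough.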


6 REPLIES 6
HyunJee
Fluorite | Level 6

Hi Hai.kuo,

Thank you for your answer. It was very helpful. I was curious, does the 'kda' refer to the letters in the responses of the 'I agree' and 'I don't know'? Thanks again!

Haikuo
Onyx | Level 15

Hi,

'kda' is a list of modifiers, not letters from the responses. It tells compress() to Keep only Digits (numbers) and Alphabetic characters (letters) and to get rid of anything else, including blanks.
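To see the effect of the modifiers, here is a small sketch that compares 'kda' with 'kdas'. The 's' modifier adds space characters to the list of characters to keep, so the blanks between words survive. The variable names letters_digits and keep_spaces are made up for this illustration.

```sas
data _null_;
  length var letters_digits keep_spaces $20;
  input var $20.;
  /* 'k' = keep the listed classes, 'd' = digits, 'a' = letters */
  letters_digits = upcase(compress(var, , 'kda'));
  /* adding 's' also keeps space characters, so word breaks are preserved */
  keep_spaces = upcase(compress(var, , 'kdas'));
  put letters_digits= keep_spaces=;
  datalines;
I agree.
I Don't Know.
;
run;
/* Log: letters_digits=IAGREE keep_spaces=I AGREE       */
/*      letters_digits=IDONTKNOW keep_spaces=I DONT KNOW */
```

Either form makes the responses uniform. Which one to use depends on whether the recoded values should stay readable as separate words.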

Haikuo

TomKari
Onyx | Level 15

Check out DataFlux. It is a SAS product specifically designed to improve data quality by doing things like standardizing responses.

Tom

HyunJee
Fluorite | Level 6

Hi Tom,

Is this a product that costs extra, in addition to purchasing Base SAS? Thanks.

TomKari
Onyx | Level 15

I'm afraid it is, and I believe it's fairly expensive. Really, it's only an option if your organization already has it, or alternatively, if your organization does enough data cleansing to make it worth licensing for the whole organization.

For your one requirement, it would be overkill.

Best,

Tom
