BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fengyuwuzu
Pyrite | Level 9

in my data, the gender has values like Male, M, m, male, Female, FEMALE, F, f.

When i use proc freq, they are all listed as different values.

I can recode them using if condition, but is there a simpler way to recode them unifomly?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Another possibility:  don't recode.  Create a format to translate:

 

proc format;

value $gender 'M', 'm', 'Male', 'male' = 'Male'

'F', 'f', 'Female', 'female' = 'Female';

run;

 

proc freq data=have;

tables gender;

format gender $gender.;

run;

 

This leaves open the possibility of not changing your data, but still grouping the values for reporting purposes. It's a decent approach for cleaning data, as the PROC FREQ output displays values that occur in the data but are not accounted for using the format. 

 

If you did want to change the values, you could still apply the format:

 

data want;

set have;

sex = put(gender, $gender.);

run;

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Hi @fengyuwuzu,

 

You could use character functions instead. Example:

data want;
set have;
length sex $1;
sex=lowcase(char(gender,1));
drop gender;
rename sex=gender;
run;
fengyuwuzu
Pyrite | Level 9
Thank you, This also works!
FreelanceReinh
Jade | Level 19

@fengyuwuzu: When using the formatting approach you may want to use the order=formatted option in PROC FREQ statements, so as to ensure a consistent ordering of the two categories in the output. (Otherwise, the sort order would depend on the data because, for example, 'F'<'M', but 'M'<'f'.)

fengyuwuzu
Pyrite | Level 9
Thank you for the tip, FreelanceReinhard.
Astounding
PROC Star

Another possibility:  don't recode.  Create a format to translate:

 

proc format;

value $gender 'M', 'm', 'Male', 'male' = 'Male'

'F', 'f', 'Female', 'female' = 'Female';

run;

 

proc freq data=have;

tables gender;

format gender $gender.;

run;

 

This leaves open the possibility of not changing your data, but still grouping the values for reporting purposes. It's a decent approach for cleaning data, as the PROC FREQ output displays values that occur in the data but are not accounted for using the format. 

 

If you did want to change the values, you could still apply the format:

 

data want;

set have;

sex = put(gender, $gender.);

run;

ballardw
Super User

If I'm building the data set I often take @Astounding's idea one step earlie with an INFORMAT to read all of the values into my desired coding so later on all of the values a uniform.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 4389 views
  • 4 likes
  • 4 in conversation