Hi SAS Community, I'm often working with datasets, where many of the variables are stored as text. For example, I'll have columns like "Gender" or various Yes/No questions that are stored as strings (e.g., Male/Female, Yes/No). For my analysis, I'd like to convert these text variables into numeric format (e.g., 1 = Male, 2 = Female; 1 = Yes, 0 = No) and apply corresponding labels for better readability during analysis. In the beginning I used plain if statements, but recently I found a better way using formats: data mydata;
input id $ foo $ bar $;
datalines;
a Yes No
d No Typo
f No Yes
;
run;
proc format;
value $parse_yes_no
'Yes' = '1'
'No' = '0'
other = '2';
value yes_no
1 = 'Yes'
0 = 'No'
2 = 'Other';
run;
data mydata_numeric;
set mydata;
foo_num = input(put(foo, $parse_yes_no.), 8.);
bar_num = input(put(bar, $parse_yes_no.), 8.);
drop foo bar;
rename foo_num=foo bar_num=bar;
run;
data mydata_labeled;
set mydata_numeric;
format foo yes_no. bar yes_no.;
run; While this works, I’m sure there must be a more efficient way to do this, especially if there are a lot of columns to convert. Is there a more efficient way to convert text fields to numeric values and apply labels, perhaps using arrays or macros to streamline this process for many variables? What would be the best practice in such a scenario to maintain clean and readable code while ensuring flexibility for different datasets? Any insights, ideas, or suggestions would be greatly appreciated! Thank you!
... View more