Hi, I have about 190 variables to change from character variable to numeric variable.
Since they are questionnaire data, original data in Excel contains text like: 1-not satisfied, 2-likely not satisfied,....,5-very satisfied. I used compress to extract the number only for the health variable and used the traditional way to convert char to num (please see below code). Is there a faster way to do it?
data allnew;
set all;
HealthN = compress(Health, '-', 'A');
newHealthN = input(HealthN,1.0);
drop Health HealthN;
rename newHealthN = Health;
run;
The previous person who worked on this project wrote this code:
/*extract the numbers from categorical variables- to remove format-change to numeric*/
data allnew;
set all;
array sect[*] $ Health Health_later SECT_I_5--SECT_I_38 SECT_IV_a--SECT_VIII_13 Sleep_Quality;
array newsect[*] Health Health_later SECT_I_5--SECT_I_38 SECT_IV_a--SECT_VIII_13 Sleep_Quality;
do i=1 to dim(sect);
newsect[i]=input(substr(sect[i],1,1),1.);
end;
drop i;
run;
However, log did not say the right thing:
Numeric values have been converted to character values at the places given by:
(Line):(Column).
Will you be creating an SDTM domain for this data? I ask as SDTM are normalised data structures, and if I had to deal with this type of data the first thing I would do in most cases is to normalise it, this makes most programming tasks simpler in the long run. So have:
<id variables> QUESTION RESPONSE
Q1 1-Happy
Q2 3-Unhappy
...
Then you can simply do:
numeric_response=input(scan(response,1,"-"),best.);
Now to answer your question, your reading data in from Excel (which is unstructured), and from the warning message we can see that it is trying to convert numeric data to chraracter, so I would assume that something on your import is going wrong and one of the variables is nuemeric rather than text.
Please see the proc contents output. They are labeled as char instead of numeric.
SubjectID | Char | 5 | $5. | $5. | SubjectID |
VISIT | Char | 4 | $4. | $4. | VISIT |
InterviewerID | Num | 8 | InterviewerID | ||
TodaysDate | Num | 8 | DATE9. | DATE9. | TodaysDate |
Quality_Of_Life | Char | 2 | $2. | $2. | Quality_Of_Life |
Curr_Quality_Of_Life | Char | 2 | $2. | $2. | Curr_Quality_Of_Life |
Health | Char | 11 | $11. | $11. | Health |
Health_Later | Char | 36 | $36. | $36. | Health_Later |
SECT_I_5 | Char | 18 | $18. | $18. | SECT_I_5 |
SECT_I_6 | Char | 11 | $11. | $11. | SECT_I_6 |
SECT_I_7 | Char | 18 | $18. | $18. | SECT_I_7 |
SECT_I_8 | Char | 24 | $24. | $24. | SECT_I_8 |
SECT_I_9 | Char | 24 | $24. | $24. | SECT_I_9 |
SECT_I_10 | Char | 24 | $24. | $24. | SECT_I_10 |
SECT_I_11 | Char | 24 | $24. | $24. | SECT_I_11 |
SECT_I_12 | Char | 24 | $24. | $24. | SECT_I_12 |
SECT_I_13 | Char | 24 | $24. | $24. | SECT_I_13 |
SECT_I_14 | Char | 24 | $24. | $24. | SECT_I_14 |
SECT_I_15 | Char | 24 | $24. | $24. | SECT_I_15 |
SECT_I_16 | Char | 24 | $24. | $24. | SECT_I_16 |
SECT_I_17 | Char | 24 | $24. | $24. | SECT_I_17 |
SECT_I_18 | Char | 24 | $24. | $24. | SECT_I_18 |
SECT_I_19 | Char | 24 | $24. | $24. | SECT_I_19 |
SECT_I_20 | Char | 24 | $24. | $24. | SECT_I_20 |
SECT_I_21 | Char | 24 | $24. | $24. | SECT_I_21 |
SECT_I_22 | Char | 24 | $24. | $24. | SECT_I_22 |
SECT_I_23 | Char | 24 | $24. | $24. | SECT_I_23 |
SECT_I_24 | Char | 24 | $24. | $24. | SECT_I_24 |
SECT_I_25 | Char | 24 | $24. | $24. | SECT_I_25 |
SECT_I_26 | Char | 24 | $24. | $24. | SECT_I_26 |
SECT_I_27 | Char | 5 | $5. | $5. | SECT_I_27 |
SECT_I_28 | Char | 5 | $5. | $5. | SECT_I_28 |
SECT_I_29 | Char | 5 | $5. | $5. | SECT_I_29 |
SECT_I_30 | Char | 5 | $5. | $5. | SECT_I_30 |
SECT_I_31 | Char | 5 | $5. | $5. | SECT_I_31 |
SECT_I_32 | Char | 5 | $5. | $5. | SECT_I_32 |
SECT_I_33 | Char | 5 | $5. | $5. | SECT_I_33 |
SECT_I_34 | Char | 18 | $18. | $18. | SECT_I_34 |
SECT_I_35 | Char | 18 | $18. | $18. | SECT_I_35 |
SECT_I_36 | Char | 18 | $18. | $18. | SECT_I_36 |
SECT_I_37 | Char | 18 | $18. | $18. | SECT_I_37 |
SECT_I_38 | Char | 22 | $22. | $22. | SECT_I_38 |
SECT_II_B | Char | 2 | $2. | $2. | SECT_II_B |
SECT_II_C_A | Num | 8 | SECT_II_C_A | ||
SECT_II_C_B | Num | 8 | SECT_II_C_B | ||
SECT_II_C_C | Num | 8 | SECT_II_C_C | ||
SECT_III_1 | Char | 1 | $1. | $1. | SECT_III_1 |
SECT_III_2 | Char | 1 | $1. | $1. | SECT_III_2 |
SECT_III_3 | Char | 1 | $1. | $1. | SECT_III_3 |
SECT_III_4 | Char | 1 | $1. | $1. | SECT_III_4 |
SECT_III_5 | Char | 1 | $1. | $1. | SECT_III_5 |
SECT_III_6 | Char | 1 | $1. | $1. | SECT_III_6 |
SECT_III_7 | Char | 1 | $1. | $1. | SECT_III_7 |
SECT_III_8 | Char | 1 | $1. | $1. | SECT_III_8 |
SECT_III_9 | Char | 1 | $1. | $1. | SECT_III_9 |
SECT_III_10 | Char | 1 | $1. | $1. | SECT_III_10 |
SECT_IV_a | Char | 15 | $15. | $15. | SECT_IV_a |
SECT_IV_b | Char | 15 | $15. | $15. | SECT_IV_b |
SECT_IV_c | Char | 15 | $15. | $15. | SECT_IV_c |
SECT_IV_d | Char | 15 | $15. | $15. | SECT_IV_d |
SECT_IV_e | Char | 15 | $15. | $15. | SECT_IV_e |
SECT_IV_f | Char | 15 | $15. | $15. | SECT_IV_f |
SECT_IV_g | Char | 15 | $15. | $15. | SECT_IV_g |
SECT_IV_h | Char | 15 | $15. | $15. | SECT_IV_h |
SECT_IV_i | Char | 15 | $15. | $15. | SECT_IV_i |
SECT_IV_j | Char | 15 | $15. | $15. | SECT_IV_j |
SECT_IV_k | Char | 15 | $15. | $15. | SECT_IV_k |
SECT_IV_l | Char | 15 | $15. | $15. | SECT_IV_l |
SECT_IV_m | Char | 15 | $15. | $15. | SECT_IV_m |
SECT_IV_n | Char | 15 | $15. | $15. | SECT_IV_n |
SECT_IV_o | Char | 15 | $15. | $15. | SECT_IV_o |
SECT_IV_p | Char | 15 | $15. | $15. | SECT_IV_p |
SECT_IV_q | Char | 15 | $15. | $15. | SECT_IV_q |
SECT_IV_r | Char | 15 | $15. | $15. | SECT_IV_r |
SECT_IV_s | Char | 15 | $15. | $15. | SECT_IV_s |
SECT_IV_t | Char | 15 | $15. | $15. | SECT_IV_t |
Phys_Well_1 | Char | 14 | $14. | $14. | Phys_Well_1 |
Phys_Well_2 | Char | 14 | $14. | $14. | Phys_Well_2 |
Phys_Well_3 | Char | 14 | $14. | $14. | Phys_Well_3 |
Phys_Well_4 | Char | 14 | $14. | $14. | Phys_Well_4 |
Phys_Well_5 | Char | 14 | $14. | $14. | Phys_Well_5 |
Phys_Well_6 | Char | 14 | $14. | $14. | Phys_Well_6 |
Phys_Well_7 | Char | 14 | $14. | $14. | Phys_Well_7 |
SOCIAL_Well_8 | Char | 14 | $14. | $14. | SOCIAL_Well_8 |
SOCIAL_Well_9 | Char | 14 | $14. | $14. | SOCIAL_Well_9 |
SOCIAL_Well_10 | Char | 14 | $14. | $14. | SOCIAL_Well_10 |
SOCIAL_Well_11 | Char | 14 | $14. | $14. | SOCIAL_Well_11 |
SOCIAL_Well_12 | Char | 14 | $14. | $14. | SOCIAL_Well_12 |
SOCIAL_Well_13 | Char | 14 | $14. | $14. | SOCIAL_Well_13 |
SOCIAL_Well_14 | Char | 14 | $14. | $14. | SOCIAL_Well_14 |
EMOT_Well_15 | Char | 14 | $14. | $14. | EMOT_Well_15 |
EMOT_Well_16 | Char | 14 | $14. | $14. | EMOT_Well_16 |
EMOT_Well_17 | Char | 14 | $14. | $14. | EMOT_Well_17 |
EMOT_Well_18 | Char | 14 | $14. | $14. | EMOT_Well_18 |
EMOT_Well_19 | Char | 14 | $14. | $14. | EMOT_Well_19 |
EMOT_Well_20 | Char | 14 | $14. | $14. | EMOT_Well_20 |
FUNCT_Well_21 | Char | 14 | $14. | $14. | FUNCT_Well_21 |
FUNCT_Well_22 | Char | 14 | $14. | $14. | FUNCT_Well_22 |
FUNCT_Well_23 | Char | 14 | $14. | $14. | FUNCT_Well_23 |
FUNCT_Well_24 | Char | 14 | $14. | $14. | FUNCT_Well_24 |
FUNCT_Well_25 | Char | 14 | $14. | $14. | FUNCT_Well_25 |
FUNCT_Well_26 | Char | 14 | $14. | $14. | FUNCT_Well_26 |
FUNCT_Well_27 | Char | 14 | $14. | $14. | FUNCT_Well_27 |
ADDL_CONC_28 | Char | 14 | $14. | $14. | ADDL_CONC_28 |
ADDL_CONC_29 | Char | 14 | $14. | $14. | ADDL_CONC_29 |
ADDL_CONC_30 | Char | 14 | $14. | $14. | ADDL_CONC_30 |
ADDL_CONC_31 | Char | 14 | $14. | $14. | ADDL_CONC_31 |
ADDL_CONC_32 | Char | 14 | $14. | $14. | ADDL_CONC_32 |
ADDL_CONC_33 | Char | 14 | $14. | $14. | ADDL_CONC_33 |
ADDL_CONC_34 | Char | 14 | $14. | $14. | ADDL_CONC_34 |
ADDL_CONC_35 | Char | 14 | $14. | $14. | ADDL_CONC_35 |
ADDL_CONC_36 | Char | 14 | $14. | $14. | ADDL_CONC_36 |
ADDL_CONC_37 | Char | 14 | $14. | $14. | ADDL_CONC_37 |
ADDL_CONC_38 | Char | 14 | $14. | $14. | ADDL_CONC_38 |
ADDL_CONC_39 | Char | 14 | $14. | $14. | ADDL_CONC_39 |
ADDL_CONC_40 | Char | 14 | $14. | $14. | ADDL_CONC_40 |
ADDL_CONC_41 | Char | 14 | $14. | $14. | ADDL_CONC_41 |
ADDL_CONC_42 | Char | 14 | $14. | $14. | ADDL_CONC_42 |
ADDL_CONC_43 | Char | 14 | $14. | $14. | ADDL_CONC_43 |
ADDL_CONC_44 | Char | 14 | $14. | $14. | ADDL_CONC_44 |
ADDL_CONC_45 | Char | 14 | $14. | $14. | ADDL_CONC_45 |
ADDL_CONC_46 | Char | 14 | $14. | $14. | ADDL_CONC_46 |
ADDL_CONC_47 | Char | 14 | $14. | $14. | ADDL_CONC_47 |
ADDL_CONC_48 | Char | 14 | $14. | $14. | ADDL_CONC_48 |
ADDL_CONC_49 | Char | 14 | $14. | $14. | ADDL_CONC_49 |
ADDL_CONC_50 | Char | 14 | $14. | $14. | ADDL_CONC_50 |
ADDL_CONC_51 | Char | 14 | $14. | $14. | ADDL_CONC_51 |
ADDL_CONC_52 | Char | 14 | $14. | $14. | ADDL_CONC_52 |
ADDL_CONC_53 | Char | 14 | $14. | $14. | ADDL_CONC_53 |
ADDL_CONC_54 | Char | 14 | $14. | $14. | ADDL_CONC_54 |
ADDL_CONC_55 | Char | 14 | $14. | $14. | ADDL_CONC_55 |
FATG_ENRGY_56 | Char | 14 | $14. | $14. | FATG_ENRGY_56 |
FATG_ENRGY_57 | Char | 14 | $14. | $14. | FATG_ENRGY_57 |
FATG_ENRGY_58 | Char | 14 | $14. | $14. | FATG_ENRGY_58 |
FATG_ENRGY_59 | Char | 14 | $14. | $14. | FATG_ENRGY_59 |
FATG_ENRGY_60 | Char | 14 | $14. | $14. | FATG_ENRGY_60 |
FATG_ENRGY_61 | Char | 14 | $14. | $14. | FATG_ENRGY_61 |
FATG_ENRGY_62 | Char | 14 | $14. | $14. | FATG_ENRGY_62 |
FATG_ENRGY_63 | Char | 14 | $14. | $14. | FATG_ENRGY_63 |
FATG_ENRGY_64 | Char | 14 | $14. | $14. | FATG_ENRGY_64 |
FATG_ENRGY_65 | Char | 14 | $14. | $14. | FATG_ENRGY_65 |
FATG_ENRGY_66 | Char | 14 | $14. | $14. | FATG_ENRGY_66 |
FATG_ENRGY_67 | Char | 14 | $14. | $14. | FATG_ENRGY_67 |
FATG_ENRGY_68 | Char | 14 | $14. | $14. | FATG_ENRGY_68 |
SECT_VI_1 | Char | 14 | $14. | $14. | SECT_VI_1 |
SECT_VI_2 | Char | 14 | $14. | $14. | SECT_VI_2 |
SECT_VI_3 | Char | 14 | $14. | $14. | SECT_VI_3 |
SECT_VI_4 | Char | 14 | $14. | $14. | SECT_VI_4 |
SECT_VI_5 | Char | 14 | $14. | $14. | SECT_VI_5 |
SECT_VI_6 | Char | 14 | $14. | $14. | SECT_VI_6 |
SECT_VI_7 | Char | 14 | $14. | $14. | SECT_VI_7 |
SECT_VI_8 | Char | 14 | $14. | $14. | SECT_VI_8 |
SECT_VI_9 | Char | 14 | $14. | $14. | SECT_VI_9 |
SECT_VI_10 | Char | 14 | $14. | $14. | SECT_VI_10 |
SECT_VI_11 | Char | 14 | $14. | $14. | SECT_VI_11 |
SECT_VI_12 | Char | 14 | $14. | $14. | SECT_VI_12 |
SECT_VI_13 | Char | 14 | $14. | $14. | SECT_VI_13 |
SECT_VI_14 | Char | 14 | $14. | $14. | SECT_VI_14 |
SECT_VI_15 | Char | 14 | $14. | $14. | SECT_VI_15 |
SECT_VI_16 | Char | 14 | $14. | $14. | SECT_VI_16 |
SECT_VI_17 | Char | 14 | $14. | $14. | SECT_VI_17 |
SECT_VI_18 | Char | 14 | $14. | $14. | SECT_VI_18 |
SECT_VI_19 | Char | 14 | $14. | $14. | SECT_VI_19 |
SECT_VI_20 | Char | 14 | $14. | $14. | SECT_VI_20 |
SECT_VII_1 | Char | 22 | $22. | $22. | SECT_VII_1 |
SECT_VII_2 | Char | 22 | $22. | $22. | SECT_VII_2 |
SECT_VII_3 | Char | 22 | $22. | $22. | SECT_VII_3 |
SECT_VII_4 | Char | 22 | $22. | $22. | SECT_VII_4 |
SECT_VII_5 | Char | 22 | $22. | $22. | SECT_VII_5 |
SECT_VII_6 | Char | 22 | $22. | $22. | SECT_VII_6 |
SECT_VII_7 | Char | 22 | $22. | $22. | SECT_VII_7 |
SECT_VII_8 | Char | 22 | $22. | $22. | SECT_VII_8 |
SECT_VII_9 | Char | 22 | $22. | $22. | SECT_VII_9 |
SECT_VII_10 | Char | 22 | $22. | $22. | SECT_VII_10 |
SECT_VIII_1 | Char | 22 | $22. | $22. | SECT_VIII_1 |
SECT_VIII_2 | Char | 22 | $22. | $22. | SECT_VIII_2 |
SECT_VIII_3 | Char | 22 | $22. | $22. | SECT_VIII_3 |
SECT_VIII_4 | Char | 22 | $22. | $22. | SECT_VIII_4 |
SECT_VIII_5 | Char | 22 | $22. | $22. | SECT_VIII_5 |
SECT_VIII_6 | Char | 22 | $22. | $22. | SECT_VIII_6 |
SECT_VIII_7 | Char | 22 | $22. | $22. | SECT_VIII_7 |
SECT_VIII_8 | Char | 22 | $22. | $22. | SECT_VIII_8 |
SECT_VIII_9 | Char | 22 | $22. | $22. | SECT_VIII_9 |
SECT_VIII_10 | Char | 22 | $22. | $22. | SECT_VIII_10 |
SECT_VIII_11 | Char | 22 | $22. | $22. | SECT_VIII_11 |
SECT_VIII_12 | Char | 22 | $22. | $22. | SECT_VIII_12 |
SECT_VIII_13 | Char | 22 | $22. | $22. | SECT_VIII_13 |
BedTime | Num | 8 | BedTime | ||
Fall_To_sleep_Minutes | Num | 8 | Fall_To_sleep_Minutes | ||
GetUpTime | Num | 8 | GetUpTime | ||
Sleep_Hours | Num | 8 | Sleep_Hours | ||
Sleep_Quality | Char | 13 | $13. | $13. | Sleep_Quality |
SECT_IX_6 | Char | 1 | $1. | $1. | SECT_IX_6 |
SECT_IX_7 | Char | 1 | $1. | $1. | SECT_IX_7 |
SECT_IX_8 | Char | 1 | $1. | $1. | SECT_IX_8 |
Your code:
array sect[*] $ Health Health_later SECT_I_5--SECT_I_38 SECT_IV_a--SECT_VIII_13 Sleep_Quality;
array newsect[*] Health Health_later SECT_I_5--SECT_I_38 SECT_IV_a--SECT_VIII_13 Sleep_Quality;
do i=1 to dim(sect);
newsect[i]=input(substr(sect[i],1,1),1.);
newsect and sect both refer to the same variables. So Health is a character variable, you are trying to input() the text from Health back into Health, but Health is still a character variable, hence the numeric response from input() is being converted back to character. Have you second array use new variables:
array newsect[*] num_Health num_Health_later num_SECT_I_5...;
Typically when I get values like 1-not satisfied, 2-likely not satisfied I read the data with custom informats to convert to numeric at reading after saving the Excel as CSV so I know what I have. And then build a display format if needed. A sort of example:
proc format library=work; invalue satisfaction '1-not satisfied' =1 '2-likely not satisfied'=2 '5-very satisfied' =5 ; Value satisfaction 1 = 'Not satisfied' 2 = 'Likely not satisfied' 5 = 'Very satisfied' ; run;
And then in the data step to read the data I use informat statements to associate the Satisfaction informat with the values.
Since surveys often use a number of standard response categories, and the same organization the same categories in multiple surveys, then this becomes a considerable time saver.
Note that you can actually include additional pieces such as special missing to handle responses such as "don't know" "no opinion" or "refused" with the special missing values of .A through .Z.
Adapt LIBNAME and MEMNAME to your needs, as add excluded variables and unmark it, using next code:
%macro str2num(vx);
&vx.N = input(scan(&vx,1,'-'),best.);
%mend str2num;
data test(keep=name);
set sashelp.vcolumn(where=(libname='SASHELP' and memname='AACOMP' and
type='char' /* and name not in (...excluded names ...) */
)) end=eof;
if _N_ = 1 then
call execute('data want; set sashelp.aacomp;');
call execute('%str2num('||strip(name)||'); DROP '||name||';' );
if eof then call execute('RUN;');
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.