I am analyzing the Breast Cancer Wisconsin (Diagnostic) Data Set. To run a logistic regression, I need to recode the values of the diagnosis variable from M and B to 0 and 1.
I tried the code below, but it fails with an error:
proc format;
  invalue infmt 'B' = 1
                'M' = 0;
run;
data BreastCancer_1;
  set Learn.Oirginal (rename = (cancer_diagnosis=diagnosis));
  cancer_diagnosis = input(diagnosis,infmt.);
  drop diagnosis;
run;
Error log:
78 data BreastCancer_1;
79 set Learn.Oirginal (rename = (cancer_diagnosis=diagnosis));
ERROR: Variable cancer_diagnosis is not on file LEARN.OIRGINAL.
ERROR: Invalid DROP, KEEP, or RENAME option on file LEARN.OIRGINAL.
80 cancer_diagnosis = input(diagnosis,infmt.);
81 drop diagnosis;
82 run;
Old Dataset - Learn.Oirginal
New Dataset - BreastCancer_1
Old Variable - diagnosis
New Variable - cancer_diagnosis
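
One possible reading of the log: the RENAME= data set option takes pairs in the form old-name=new-name and is applied while Learn.Oirginal is being read, so rename = (cancer_diagnosis=diagnosis) asks SAS to find a variable called cancer_diagnosis that does not exist there yet. Since the INPUT function already creates cancer_diagnosis from diagnosis, the rename may not be needed at all. A minimal sketch, assuming diagnosis is a character variable that holds only 'B' or 'M':

```sas
/* Numeric informat mapping the character codes to 0/1
   (same mapping as in the code above) */
proc format;
  invalue infmt 'B' = 1
                'M' = 0;
run;

data BreastCancer_1;
  /* No RENAME= option: diagnosis already exists on Learn.Oirginal */
  set Learn.Oirginal;
  /* Assumption: diagnosis is character and contains only 'B' or 'M' */
  cancer_diagnosis = input(diagnosis, infmt.);
  drop diagnosis;
run;

/* Optional check: cancer_diagnosis should show only 0 and 1 */
proc freq data=BreastCancer_1;
  tables cancer_diagnosis;
run;
```

If diagnosis turns out to contain other values, or is stored with different casing, the informat would need extra entries for those values.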