I have a dataset which has initially got CHAR values in one of its columns. Also there are '--' and bank spaces. I would like to denote '--' as 9999 and complete bank spaces as 0. And then I want to conditionally format the remaining values: from CHAR to numeric. Here's my code. But the code converts all 9999s and 0s too. Please help!
proc import datafile='C:\...\EU_ER.csv'
dbms=csv replace out=eu_er;
getnames=yes;
guessingrows=max;
run;
data eu_er;
set eu_er;
if Surv_M='--' then Surv_M=9999;
if Surv_M=' ' then Surv_M=0;
SurvM=input(Surv_M, best32.);
if SurvM ^= 9999 OR SurvM ^= 0 then do;
format SurvM 32.4;
end;
drop Surv_M;
rename SurvM=Surv_M;
run;
Why would you want to convert your MISSING values into valid numbers? That would make it impossible for SAS to know that it shouldn't use those values when calculating mean,min,max etc for your variable? Leave them as MISSING values. You can use special missing values if you want to distinguish between hyphens and blanks.
Here is code that converts the blanks into missing and hyphens into the special missing .A.
data eu_er_fixed;
set eu_er;
length Surv_M_number 8;
format Suv_M_number 32.4 ;
if Surv_M='--' then Surv_M_number=.a;
else if Surv_M=' ' then Surv_M_number=.;
else Surv_M_number = input(Surv_M,32.);
drop Surv_M;
rename Surv_M_number=Surv_M;
run;
Formats are rules for converting values into text. Informats are rules for converting text into values.
You might be able to make this easier with a user defined informat.
proc format ;
invalue survey '--'=.a ' '=. ;
run;
data eu_er_fixed;
set eu_er;
length Surv_M_number 8;
format Suv_M_number 32.4 ;
Surv_M_number = input(Surv_M,survey32.);
drop Surv_M;
rename Surv_M_number=Surv_M;
run;
format is a compile time statement and not at execution time statement
associating a format to a variable is one thing while using a format(in a put function at execution time) assigning the formatted values for the new char/num var respectively is another thing
@novinosrin, what solution would like to suggest? Much thanks.
Sure if you could provide us a sample data that we can copy paste to our sas environment, and your required output for the input sample, one of us in the community will provide you the code tested using the sample you give us plz
Mention your requirement logic in a couple of sentences along with your sample than pics
Why would you want to convert your MISSING values into valid numbers? That would make it impossible for SAS to know that it shouldn't use those values when calculating mean,min,max etc for your variable? Leave them as MISSING values. You can use special missing values if you want to distinguish between hyphens and blanks.
Here is code that converts the blanks into missing and hyphens into the special missing .A.
data eu_er_fixed;
set eu_er;
length Surv_M_number 8;
format Suv_M_number 32.4 ;
if Surv_M='--' then Surv_M_number=.a;
else if Surv_M=' ' then Surv_M_number=.;
else Surv_M_number = input(Surv_M,32.);
drop Surv_M;
rename Surv_M_number=Surv_M;
run;
Formats are rules for converting values into text. Informats are rules for converting text into values.
You might be able to make this easier with a user defined informat.
proc format ;
invalue survey '--'=.a ' '=. ;
run;
data eu_er_fixed;
set eu_er;
length Surv_M_number 8;
format Suv_M_number 32.4 ;
Surv_M_number = input(Surv_M,survey32.);
drop Surv_M;
rename Surv_M_number=Surv_M;
run;
@d6k5d3 wrote:
What is survey32.?
The specification for what INFORMAT to use with the INPUT() function call. It was defined in the PROC FORMAT step.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.