Hi,
I need help in replacing the value 'I don't know how tall I am' from height column as well as 'I don't know how much I weigh' from the weight column with the 0 .
Data Have
policy_no height weight
1 I don't know how tall I am 60
2 1.5 I don't know how tall I am
Data Want
policy_no height Weight
1 0 60
2 1.5 0
To answer your specific question, even though I think this is not a good thing to do:
data have2;
set have;
if height='I don''t know how tall I am' then height='0';
if weight='I don''t know how tall I am' then weight='0';
run;
To do the calculation of means, you would have to convert these to numeric variables as stated by @ballardw , but it really doesn't make sense to set these to zero, when you should be setting these to missing.
Does it really say tall when weight is missing?
In your data set, are the variables HEIGHT and WEIGHT character or numeric? Please run PROC CONTENTS on this data set and show us. Also, if either of these variables have had a format applied, please identify which formats are used.
Are you intending to do any analysis or summaries with the "height" and "weight" as actual numeric values?
If so then you need create new variables
Perhaps something like:
data want; set have; numheight= input(height,?? 5.); if missing(numheight) then numheight=0; numweight= input(weight,?? 5.); if missing(numweight) then numweight=0; run;
INPUT is the way to create numeric values from text. The ?? suppresses the invalid data messages you would normally see when attempting to read "I don't know my height" into a numeric value. The IF assumes the reason for missing is something like your text entries and sets 0.
Or change how you read the data into SAS to address such things earlier.
Hi, yes im intending to replace them with mean value for analysis
Then set them as MISSING, not a valid value like zero.
Make new numeric variables to store the actual numbers.
Alphabetic List of Variables and Attributes # Variable Type Len Format Informat Label 1 ExternalReferenceId Char 255 $255. $255. ExternalReferenceId 2 Height Char 255 $255. 3 Weight Char 255 $255.
Why are those fields length $255? The longest string that the normal SAS informat can read is 32 characters.
Here is a step that will create numeric variables HT and WT from your existing strings. Anything that is not a valid number is converted to missing.
data want;
set have;
ht = input(left(height),32.);
wt = input(left(weight),32.);
run;
If you want to suppress the warnings about invalid data you can use the ?? prefix on the informat.
It's important to understand how SAS handles missing values, not understanding could lead to results you are not expecting:
See Understanding How SAS Handles Missing Values
Here's a simple example that helps explain it:
data _null_ ;
/* Notice how the mean of 0,1,2,3 = 1.5 that's because SAS calculates the mean as (0+1+2+3)/4 = 6/4 = 1.5 */
avg=mean(0,1,2,3) ;
put "Mean of 0,1,2,3 = " avg ;
/* Notice how the mean of .,1,2,3 = 2 that's because SAS calculates the mean as (1+2+3)/3 = 6/3 = 2 */
/* Basically SAS doesn't include missing values when using functions */
avg=mean(.,1,2,3) ;
put "Mean of .,1,2,3 = " avg ;
run ;
To answer your specific question, even though I think this is not a good thing to do:
data have2;
set have;
if height='I don''t know how tall I am' then height='0';
if weight='I don''t know how tall I am' then weight='0';
run;
To do the calculation of means, you would have to convert these to numeric variables as stated by @ballardw , but it really doesn't make sense to set these to zero, when you should be setting these to missing.
Does it really say tall when weight is missing?
data have;
input policy_no height $&100. weight $&100.;
datalines;
1 I don't know how tall I am 60
2 1.5 I don't know how tall I am
;
run;
data have2;
set have;
if anyalpha(height) then height='0';
if anyalpha(weight) then weight='0';
run;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: