I have a dataset that has character part and the numeric part. I do not need the character part of the variable in the final analysis. How do I remove some part of the observation (i.e. the character part) and keep the rest (i.e. the numeric part).
For example in the dataset below, how do I program SAS so that 'cm' are removed from the 'Height' variable and only the numeric value is remaining. Same thing for Weight.
ID Age Height Weight
1001 15 163cm 140Lb
1002 21 180 145lb
1003 23 154cm 165
1004 19 132 134lb
1008 21 129cm 134lb
1009 14 113cm 123
1010 9 100 100
Any help in this matter will be highly appreciated.
Thanks
Sapkota
To get height and weight as numeric variables, you need to use input function in addition to compress
data want;
set have;
height_num=input(compress(height,,'kd'),best.);
weight_num=input(compress(weight,,'kd'),best.);
run;
Look at the compress function. You can't change a character variable to numeric so you'll need to create a new variable. If you really need it to be the same variable name, rename the height variable before the conversion in the set statement.
height_num=compress(height, 'kd');
height=compress(height,,'kd');
To get height and weight as numeric variables, you need to use input function in addition to compress
data want;
set have;
height_num=input(compress(height,,'kd'),best.);
weight_num=input(compress(weight,,'kd'),best.);
run;
Why 'kd' used?
I am not getting..
It means keep digit and seems to be a SAS option for the compress function. That is quite handy instead having to specify the actual characters.
k and d are two modifiers in compress function that will keep digits in a string and throw away all the rest.
Thank you for the helpful responses.
I have one related question. How do I delete the part of the numeric value from one observation which is entirely numeric. For instant in the following example, in the 'Height' variable, if I would like to delete the last two digits of all the observation (so that Height of ID 1001 would be12 instead of 1230 and Height of ID 1002 would be 14 instead of 1435), how do I do it?
ID Age Height
1001 10 1230
1002 12 1435
1003 13 1453
1004 15 1432
1005 11 1786
1006 12 1234
Any help will be highly appreciated. Thanks
Sapkota
Try this for the desired output.
data want;
set have;
new_height=int(divide(height,100));
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.