I have a data set called "BASKETBALL_ST" and it looks something like this:
Age Points Assists Rebounds Steals
21 12.1 7.2 7.4 1.2
26 20.3 7.2 . 1.2
21 10.0 . 7.4 .
19 8.4 10.1 . 2.3
I want to run a proc means for it, but the variables with missing values (Assists, Rebounds, Steals) become character values, so they're not showing up on the proc means output.
How can I delete ONLY the cells with missing (not the whole row)? Age and points are numeric, so they are showing in the proc means report, as desired. However, I can't figure out how to delete the missing values without deleting an entire row. This is my code so far:
proc import datafile="C:\Users\John\Downloads\SASLearn\SASLearnOct\BASKETBALL_ST.xls" dbms=xls out=work.BASKETBALL_ST replace; data work.cleanBASKETBALL_ST; set work.BASKETBALL_ST; if Assists=. then Assists=0; if Rebounds=. then Rebounds=0; if Rebounds=. then Rebounds=0; run; proc means data=work.cleanBASKETBALL_ST n mean median std min max range maxdec=2; run;
You cannot change the type of an existing variable.
So make new variables instead.
data cleanBASKETBALL_ST;
set BASKETBALL_ST;
_1 = input(assists,32.);
_2 = input(rebounds,32.);
_3 = input(steals),32.);
rename _1=Assists _2=Rebounds _3=Steals
Assists=Assists_char Rebounds=Rebounds_char Steals=Steals_char
;
run;
Take a look at your XLS file and try to figure out why it has character values in some of the cells in the columns that you want to be numbers. Where did it come from? Can you get the data as a CSV file instead? With a text file, such as a CSV file, you can write your own data step to read in the data lines. That way you will have complete control over how the variables are defined.
Thanks for the reply! I just saw it after revising my question a bit. I appreciate the guidance on the conversion, but I thought it would be a better report if the missing cells (not the entire row) were deleted.
@unwashedhelimix wrote:
Thanks for the reply! I just saw it after revising my question a bit. I appreciate the guidance on the conversion, but I thought it would be a better report if the missing cells (not the entire row) were deleted.
I do not know what you mean by that comment. There is no need to delete observations (or values). Can you explain what you meant more clearly?
If you have data as you posted (with numeric variables) SAS will handle it properly.
data have;
input Age Points Assists Rebounds Steals;
cards;
21 12.1 7.2 7.4 1.2
26 20.3 7.2 . 1.2
21 10 . 7.4 .
19 8.4 10.1 . 2.3
;
For example the mean values will be calculated using only the non-missing values.
The MEANS Procedure Variable N Mean Std Dev Minimum Maximum ----------------------------------------------------------------------------- Age 4 21.7500000 2.9860788 19.0000000 26.0000000 Points 4 12.7000000 5.2883520 8.4000000 20.3000000 Assists 3 8.1666667 1.6743158 7.2000000 10.1000000 Rebounds 2 7.4000000 0 7.4000000 7.4000000 Steals 3 1.5666667 0.6350853 1.2000000 2.3000000 -----------------------------------------------------------------------------
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.