Hello,
I haven't attempted any code for this yet because I'm unsure where to start. My question is: how can I use SAS to identify which observations have missing values? For example, if each observation is supposed to have first name, last name, and age variables, but some observations are missing at least one of these values, how can I get SAS to tell me who is missing a value and how many observations are affected?
An example to illustrate @Kurt_Bremser's response. Try running:
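data df;
  infile cards;
  /* temp and speed are fixed-width character fields; each measure is read starting at a fixed column */
  input temp $4.
        speed $7.
        @14 measure1
        @18 measure2
        @23 measure3
        @28 measure4;
  list; /* write each input line to the log */
datalines;
cold slow    .   2.7  6.6  3.1
warm medium  4.2 5.1  7.9  9.1
hot  fast    9.4 11.0 .    6.8
cool         .   .    9.1  8.9
cool medium  6.1 4.3  12.2 3.7
     slow    .   2.9  3.3  1.7
     slow    .   2.9  3.3  1.7
;;;;
run;
proc print data=df;
run;
/* Collapse every value into one of two categories: Missing or Not Missing */
proc format;
  value $missfmt ' '='Missing' other='Not Missing';
  value missfmt . ='Missing' other='Not Missing';
run;
/* One-way frequency table per variable, counting missing vs. non-missing */
proc freq data=df;
  format _CHARACTER_ $missfmt.;
  tables _CHARACTER_ / missing missprint nocum nopercent;
  format _NUMERIC_ missfmt.;
  tables _NUMERIC_ / missing missprint nocum nopercent;
run;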
SAS permits the use of _CHARACTER_ to denote all character variables in the data set. Likewise, _NUMERIC_ denotes all numeric variables.
For new users, SAS documentation includes the document, SAS 9.4 Language Reference: Concepts. For working with missing values see:
You can use the MISSING() function, for example:
if missing(variablename) then do;
... some commands ...
end;
proc freq shows you the number of missing values in the output.
The missing() function can be used for testing while you process your dataset in a data step.
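As a minimal sketch of the DATA step approach, assuming a hypothetical data set named people with the variables first_name, last_name, and age from the original question (the names and sample values are made up for illustration), a subsetting IF with missing() could keep only the incomplete observations:

```sas
/* Hypothetical sample data; names and values are illustrative only */
data people;
    infile datalines dsd truncover;
    input first_name :$20. last_name :$20. age;
    datalines;
Ann,Lee,34
,Smith,41
Raj,,29
Mia,Chen,
;
run;

/* Keep only observations where at least one of the three values is missing */
data incomplete;
    set people;
    if missing(first_name) or missing(last_name) or missing(age);
run;

proc print data=incomplete;
run;
```

With these sample rows, incomplete should hold the three observations that lack a value. missing() works for both character and numeric variables, so the same test can be applied to names and to age.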
To get a count of missing values for a single observation, use (in a DATA step):
n_missing = cmiss(of _all_);
To get a more useful report about which variables are missing, you need to supply a little more information ... what should be included in the report? What if there are 100 variables with a missing value? What variable identifies the observation so you know which observation to check later?
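One possible shape for such a report is sketched below. It assumes a hypothetical key variable id that identifies each observation, and that only first_name, last_name, and age need checking; all of these names, and the data set people, are illustrative rather than taken from real data.

```sas
/* Hypothetical input; id is an assumed key variable */
data people;
    infile datalines dsd truncover;
    input id first_name :$20. last_name :$20. age;
    datalines;
1,Ann,Lee,34
2,,Smith,41
3,Raj,,
;
run;

data missing_report;
    set people;
    length missing_vars $60;
    /* CMISS accepts character and numeric arguments; listing the   */
    /* variables explicitly limits the count to the fields of interest */
    n_missing = cmiss(first_name, last_name, age);
    /* Build a comma-separated list of the variables that are missing */
    if missing(first_name) then missing_vars = catx(', ', missing_vars, 'first_name');
    if missing(last_name)  then missing_vars = catx(', ', missing_vars, 'last_name');
    if missing(age)        then missing_vars = catx(', ', missing_vars, 'age');
    /* Report only the observations with at least one missing value */
    if n_missing > 0;
    keep id n_missing missing_vars;
run;

proc print data=missing_report noobs;
run;
```

With many variables, an array loop using vname() would likely scale better than one IF statement per variable; the explicit version is used here only to keep the sketch short.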