Hi I am trying to figure out why the resulting table (final table) contains zero counts of variables that are defined to be missing (empty). Here below is the example of my code:
data sample; do i = 1 to 1000; output; end; run; data tmp; set sample; a = .; b = ''; c = 'text'; run; proc contents data=tmp Position noprint out=see (keep = LIBNAME MEMNAME NAME TYPE LENGTH ENGINE MEMTYPE NOBS); run; data _null_; set see (keep=name); n = _N_; call symput('nrows',compress(n)); run; proc sql noprint; select distinct name into: var1 - : var&nrows from SEE; quit; data final_table; LENGTH NAME $50.; NAME = ''; NOBS_MISSING = .; run; data final_table; set final_table (obs=0); run; %macro loop_(); %do i = 1 %to &nrows; proc sql; create table &&var&i. as select compress("&&var&i.") as NAME, count(&&var&i.) as NOBS_MISSING from TMP where &&var&i. IS MISSING; quit; data final_table; set final_table &&var&i.; run; %end; %mend; %loop_()
I thought it should return nobs_missing = 1000 for missing variables, i.e. "a" and "b" (no matter if char or int format). Any suggestions? Thanks
What is the question exactly?
I suspect that you are getting impacted by using the WHERE clause.
create table &&var&i. as
select compress("&&var&i.") as NAME
, count(&&var&i.) as NOBS_MISSING
from TMP
where &&var&i. IS MISSING
;
If no data meets the WHERE condition then there is nothing to return so you end up with zero rows instead of one row with a count of zero.
Try using MISSING() function, or a CASE statement that generates 1 or 0 values, so that you can use the SUM() aggregate function instead of COUNT()+WHERE.
create table &&var&i. as
select compress("&&var&i.") as NAME
, sum(missing(&&var&i.)) as NOBS_MISSING
from TMP
group by 1
;
Also COUNT(X) will not count missing values of X. So if you only give it missing values you will always get a count of zero. You should use count(*) or count(1). Run this example.
proc sql ;
select count(*) as count_of_records
, count(age) as count_of_age
, count(' ') as count_of_missing
from sashelp.class
;
quit;
I have run your code and it seems that
where &&var&i. IS MISSING;
does not work properly with sas sql.
I have replaced the sql step with next step and it worked perfectly:
data &&var&i.;
set tmp end=eof;
name = compress("&&var&i.");
where &&var&i. is missing;
if eof then do;
nobs_missing = _N_;
output;
end;
run;
Thank you both for your answers. I used proc freq which provides one with counts of missing/complete observations but wanted to know the reason for that missing function is not working properly with combination with where statement.
Can you please provide a simple example of just a few data records and queries that demonstrate a case that shows either the MISSING() function or the IS MISSING logical condition not working properly? Eliminate the macro code and macro variables, just show the data step and/or SQL query that is not producing the expected results.
It has been written by other user:
where &&var&i. IS MISSING;
Perhaps, &&var&i cannot be read properly and thus one needs to rewrite the form of macro variable.
@Tom - next simplified code shows that "where var is missing" doesn't work properly in sas sql
data tmp1;
infile datalines dlm=',' dsd truncover;
input i char1 $ char2 $ num1 num2;
datalines;
1,x,,11,.
2,y,a,12,3
; run;
proc sql;
create table tst_num2a as select
compress("num2") as NAME,
count(num2) as NOBS_MISSING
from TMP1 where (num2 IS MISSING);
quit;
proc sql;
create table tst_char2 as select
compress("char2") as NAME,
count(char2) as NOBS_MISSING
from TMP1 where (char2 IS MISSING);
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.