Hi All,
I have a dataset in the SASHELP library, citiday. I want to summarise the number of missing values for each numeric variable in this dataset and store the results, with the percentage missing for each numeric variable, in a dataset called Full. I tried the code below, but can't put things together.
proc contents data=sashelp.citiday noprint out=_temp; run;
proc means data=sashelp.citiday n nmiss;
var _numeric_;
ods output summary=_stat_tem(drop=label: );
run;
proc sql;
create table full as
select a.*,a.NMiss /(a.N+a.NMiss) * 100 as pct_missing, b.label from _stat_tem (rename=(variable=varname)) a
inner join _temp b
on a.varname=b.name;
quit;
Please help/advise.
Thank you
Sk
I have a macro that does that if you're interested:
ods output summary=summary;
proc means data=sashelp.class stackods n nmiss;
var _numeric_;
run;
data want;
set summary;
pct_missing=nmiss/(n+nmiss);
format pct_missing percent6.2;
run;
proc print data=want;
run;
Hello stat@sas,
Thank you for the help. Is there an option in SAS that puts the percentile information in the _FREQ_ column? I mean when I use -
proc means data= SASHelp.Citiday n nmiss mean std min max p1 p5 p10 p25 p50 p75 p90 p95 p99;
var _numeric_;
output out=test (drop=label:);
run;
This doesn't produce percentile information (only N, NMISS, MAX, MIN, MEAN, STD).
But if I use -
proc means data= SASHelp.Citiday n nmiss mean std min max p1 p5 p10 p25 p50 p75 p90 p95 p99;
var _numeric_;
ods output summary=_stat_temppretty(drop=label:)
;
run;
The problem is that even though I do get percentiles, the statistic name gets added as a suffix to the variable name. For example, if I have a variable AMOUNT, I get AMOUNT_N, AMOUNT_NMISS, AMOUNT_P1 and so on, so I can't even use PROC TRANSPOSE here. Basically, my end goal is to have the missing counts, percentiles, min, max and median for each of the original variables.
@ksharp - the autoname does not work here.
Sk
You didn't use the code posted by Stat@sas. Note the STACKODS option used, which is missing from your code.
See the example from here:
proc means data=sashelp.class stackods n nmiss p1 p5 p95 p99;
var _numeric_;
ods output summary=want;
run;
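As a rough sketch of how STACKODS fits the original attempt: with STACKODS the Summary table has one row per analysis variable, with the name in a column called Variable, so the join to the PROC CONTENTS output from the first post becomes possible. The dataset names (_stat_tem, _temp, full) are reused from the question; the UPCASE comparison is only a precaution and may not be needed.

```sas
/* One row per numeric variable: columns Variable, N, NMiss (plus Label if present). */
/* Do not add NOPRINT here: it would suppress the ODS output table as well.          */
proc means data=sashelp.citiday stackods n nmiss;
  var _numeric_;
  ods output summary=_stat_tem;
run;

/* Variable names and labels from the dictionary information */
proc contents data=sashelp.citiday noprint out=_temp(keep=name label);
run;

/* Join on the variable name and compute the percentage of missing values */
proc sql;
  create table full as
  select a.variable as varname,
         a.n,
         a.nmiss,
         a.nmiss / (a.n + a.nmiss) * 100 as pct_missing,
         b.label
  from _stat_tem as a
       inner join _temp as b
       on upcase(a.variable) = upcase(b.name);
quit;
```

Note that _numeric_ also picks up date variables, since SAS stores dates as numbers.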
Thank you Reeza and stat@sas.
My only problem with the macro suggested by Reeza is that it gives just the percentage of missing values for each variable in the dataset. I also want to add summary information for each variable, such as the 1st percentile, 5th percentile, minimum, maximum, etc.
This can be only achieved by proc means. Now, when I use proc means -
proc means data=&libname..&dsetin. mean std min max p1 p5 p10 p25 p50 p75 p90 p95 p99;
var _numeric_;
ods output summary=_stat_&var.;
run;
How do I merge this with the &dsetout output to get the percent missing for all variables in one file? When I use _numeric_ in PROC MEANS, the original variable names are not retained, so I cannot merge the result back with the dataset that holds the percent-missing information.
Kind regards
sk
ods output summary=_stat_&var. /autoname ;
If you need other stats and have only numeric variables, I suggest using stat@sas's solution instead.
Simply add the statistics you need to that code and it will generate the required numbers in fewer steps.
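For illustration, a minimal sketch of that suggestion applied to sashelp.citiday, assuming SAS 9.3 or later (where STACKODS is available). The dataset names _stats and full are placeholders, and the column names N, NMiss, P1, P50 and so on are the ones STACKODS normally produces; check them against your own output.

```sas
/* All requested statistics, one row per numeric variable */
proc means data=sashelp.citiday stackods
           n nmiss mean std min max p1 p5 p10 p25 p50 p75 p90 p95 p99;
  var _numeric_;
  ods output summary=_stats;
run;

/* Add the proportion of missing values; P50 is the median */
data full;
  set _stats;
  /* n + nmiss is the total number of observations for the variable */
  if n + nmiss > 0 then pct_missing = nmiss / (n + nmiss);
  format pct_missing percent8.2;
run;

proc print data=full;
run;
```

Because the variable name stays in the Variable column, no transpose or merge is needed to line the statistics up with the missing-value percentage.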