<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Summarizing missing data in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524758#M142717</link>
    <description>&lt;P&gt;Does anyone have a good way to summarize the % missing for each variable in a dataset? I would also like to be able to split out all the variables with more than, say, 20% missing data in my dataset.&lt;/P&gt;</description>
    <pubDate>Fri, 04 Jan 2019 23:09:25 GMT</pubDate>
    <dc:creator>Melk</dc:creator>
    <dc:date>2019-01-04T23:09:25Z</dc:date>
    <item>
      <title>Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524758#M142717</link>
      <description>&lt;P&gt;Does anyone have a good way to summarize the % missing for each variable in a dataset? I would also like to be able to split out all the variables with more than, say, 20% missing data in my dataset.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Jan 2019 23:09:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524758#M142717</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2019-01-04T23:09:25Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524769#M142719</link>
      <description>&lt;P&gt;proc freq data=yourdata;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp; tables yourvariable(s) / missing;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;is the first thing that comes to mind. This will give a table that includes the count and percent missing for each variable on the tables statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Depending on the variable types and other details of what you want, other options are available.&lt;/P&gt;
&lt;P&gt;Example data and the desired result help get more targeted responses.&lt;/P&gt;
&lt;P&gt;It is not clear what you may want next.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 05 Jan 2019 00:27:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524769#M142719</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-01-05T00:27:17Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524773#M142720</link>
      <description>Is there a way to use the ODS OUTPUT statement to create a table with the variable name and % missing, so that I can then use a data step to select only those variables with more than a certain percentage of missing data to present to the clinician? I have many variables, so producing a separate table for each variable is very inefficient.</description>
      <pubDate>Sat, 05 Jan 2019 00:51:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524773#M142720</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2019-01-05T00:51:59Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524779#M142725</link>
      <description>&lt;P&gt;I have some macros and examples of different approaches here.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek" target="_blank"&gt;https://gist.github.com/statgeek&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;EDIT: I decided to clean it up a bit; this is the one that I think meets your needs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111" target="_blank"&gt;https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;/*This program creates a report with the number and percent of
missing data for each variable in the data set.
The only change should be to the macro variable, INPUT_DSN.

Author: F. Khurshed
Date: 2019-01-04*/
*create sample data to work with;

data class;
    set sashelp.class;

    if age=14 then
        call missing(height, weight, sex);

    if name='Alfred' then
        call missing(sex, age, height);
    label age="Fancy Age Label";
run;

*set input data set name;
%let INPUT_DSN = class;
%let OUTPUT_DSN = want;
*create format for missing;

proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" other="Not Missing";
run;

*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=&amp;amp;INPUT_DSN.;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

ods select all;
*Format output;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

proc sort data=long;
    by variable;
run;

*make it a wide data set for presentation, with values as N (Percent);

proc transpose data=long out=wide_presentation (drop=_name_);
    by variable;
    id variable_value;
    var presentation;
run;

*transpose only N;

proc transpose data=long out=wide_N prefix=N_;
    by variable;
    id variable_value;
    var frequency;
run;

*transpose only percents;

proc transpose data=long out=wide_PCT prefix=PCT_;
    by variable;
    id variable_value;
    var percent;
run;

*final output file;

data &amp;amp;Output_DSN.;
    merge wide_N wide_PCT wide_presentation;
    by variable;
    drop _name_;
    label N_Missing='# Missing' N_Not_Missing='# Not Missing' 
        PCT_Missing='% Missing' PCT_Not_Missing='% Not Missing' Missing='Missing' 
        Not_missing='Not Missing';
run;

title "Missing Report of &amp;amp;INPUT_DSN.";

proc print data=&amp;amp;output_dsn. noobs label;
run;&lt;/PRE&gt;
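&lt;P&gt;To pull out only the variables above a cutoff, a minimal sketch along these lines may work. It assumes the report data set built above, that PCT_Missing exists in it (at least one variable has missing values), and that it holds percentages on a 0 to 100 scale as PROC FREQ reports them. HIGH_MISSING and the cutoff of 20 are illustrative choices, not part of the gist.&lt;/P&gt;
&lt;PRE&gt;*keep only variables with more than 20 percent missing (assumed cutoff);
data high_missing; *hypothetical output name;
    set &amp;amp;OUTPUT_DSN.;
    *rows where PCT_Missing is missing had no missing values and are excluded;
    where PCT_Missing &amp;gt; 20;
run;

title "Variables with more than 20 percent missing in &amp;amp;INPUT_DSN.";

proc print data=high_missing noobs label;
run;&lt;/PRE&gt;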
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 05 Jan 2019 03:52:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524779#M142725</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-01-05T03:52:59Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524793#M142730</link>
      <description>&lt;P&gt;Reeza,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It would be more succinct using SQL.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
set sashelp.heart;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;
data _null_;
set temp end=last;
if _n_=1 then call execute('proc sql;create table want as select ');
call execute(cat('nmiss(',_name_,')/count(*) as ',_name_,' format=percent8.2'));
if last then call execute('from have;quit;');
else call execute(',');
run;
&lt;/CODE&gt;&lt;/PRE&gt;
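&lt;P&gt;To list only the variables above a cutoff, the one-row result can be flipped to one row per variable. This is a minimal sketch, assuming WANT was created by the code above and holds proportions between 0 and 1. WANT_LONG, OVER20 and the 0.20 cutoff are illustrative choices.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* one row per variable: VARIABLE holds the name, PCT_MISSING the proportion */
proc transpose data=want out=want_long(rename=(col1=pct_missing)) name=variable;
var _all_;
run;
/* keep variables with more than 20 percent missing (assumed cutoff) */
data over20;
set want_long;
where pct_missing &amp;gt; 0.20;
format pct_missing percent8.2;
run;
&lt;/CODE&gt;&lt;/PRE&gt;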
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 05 Jan 2019 10:35:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/524793#M142730</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2019-01-05T10:35:08Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/527609#M143868</link>
      <description>Thank you!</description>
      <pubDate>Wed, 16 Jan 2019 03:06:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/527609#M143868</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2019-01-16T03:06:48Z</dc:date>
    </item>
    <item>
      <title>Re: Summarizing missing data</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/527612#M143871</link>
      <description>Ksharp, this is very succinct and I will test it out. Thank you!</description>
      <pubDate>Wed, 16 Jan 2019 03:14:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Summarizing-missing-data/m-p/527612#M143871</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2019-01-16T03:14:58Z</dc:date>
    </item>
  </channel>
</rss>

