BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
gloverb1
Calcite | Level 5

Hi,

 

I have a survey dataset with ~400 variables that I want to be able to find the number of special missing for each variable. I want the number of missing but need to differentiate between responses 777 (I don't know) and 999 (I prefer not to answer). Is there a quick way, similar to proc means nmiss, that will output how many of each special missing are in each variable? 

 

Thanks,

Brian

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data have;
input q1 q2;
datalines;
0 999
1 .
3 0
777 1
0 2
2 3
3 2
777 1
;
run;

proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" 777,999 = [8.] other="Not Missing";
run;



*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=have;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

ods select all;
*Format output;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

proc sort data=long;
    by variable;
run;



*transpose only N;

proc transpose data=long (where=(variable_value in ('777', '999'))) out=wide_N prefix=N_;
    by variable;
    id variable_value;
    var frequency;
run;

Derived from this code here: https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111

 

Gets you pretty close, but no 0's. I'm assuming you can figure out how to add those. 

 

@gloverb1 wrote:

Thanks for the quick response. Given the data below, is there a way to get an output formatted such: 

777 | 999

Q1    2   |   0

Q2    0   |   1

My end goal is to be able to identify any outlying number of 777 or 999 responses for a given variable. 

data have;
input q1 q2;
datalines;
0 999
1 .
3 0
777 1
0 2
2 3
3 2
777 1
;
run;

 


 

View solution in original post

3 REPLIES 3
Reeza
Super User
proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" 777 = "I don't know" 999 = "I Prefer not to answer" other="Not Missing";
run;

*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=&INPUT_DSN.;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

ods select all;
*Format output;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

Something like above should get you started for numeric variables. If you can post sample data that always helps get working, tested code. 

 


@gloverb1 wrote:

Hi,

 

I have a survey dataset with ~400 variables that I want to be able to find the number of special missing for each variable. I want the number of missing but need to differentiate between responses 777 (I don't know) and 999 (I prefer not to answer). Is there a quick way, similar to proc means nmiss, that will output how many of each special missing are in each variable? 

 

Thanks,

Brian


 

gloverb1
Calcite | Level 5

Thanks for the quick response. Given the data below, is there a way to get an output formatted such: 

777 | 999

Q1    2   |   0

Q2    0   |   1

My end goal is to be able to identify any outlying number of 777 or 999 responses for a given variable. 

data have;
input q1 q2;
datalines;
0 999
1 .
3 0
777 1
0 2
2 3
3 2
777 1
;
run;

 

Reeza
Super User
data have;
input q1 q2;
datalines;
0 999
1 .
3 0
777 1
0 2
2 3
3 2
777 1
;
run;

proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" 777,999 = [8.] other="Not Missing";
run;



*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=have;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

ods select all;
*Format output;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

proc sort data=long;
    by variable;
run;



*transpose only N;

proc transpose data=long (where=(variable_value in ('777', '999'))) out=wide_N prefix=N_;
    by variable;
    id variable_value;
    var frequency;
run;

Derived from this code here: https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111

 

Gets you pretty close, but no 0's. I'm assuming you can figure out how to add those. 

 

@gloverb1 wrote:

Thanks for the quick response. Given the data below, is there a way to get an output formatted such: 

777 | 999

Q1    2   |   0

Q2    0   |   1

My end goal is to be able to identify any outlying number of 777 or 999 responses for a given variable. 

data have;
input q1 q2;
datalines;
0 999
1 .
3 0
777 1
0 2
2 3
3 2
777 1
;
run;

 


 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 967 views
  • 1 like
  • 2 in conversation