I have a numerically coded variable, the frequency table is:
| value | frequency |
| 1 | 33 |
| 2 | 22 |
| 9 | 11 |
I want to format it so that 1='YES',2='No', 9 is missing
do I applied a proc format:
proc format;
value myformat 1='Yes'
2='No'
other=' ';
run;After I applied this format to my data and run the proc freq again, the frequency table becomes:
| value | frequency |
| Yes | 33 |
| No | 22 |
| 11 |
it counts the ' ' as an independent category, not missing value
However, what I want is :
| value | frequency |
| yes | 33 |
| no | 22 |
| frequency missing:11 |
Is there a way to do that by only using format, instead of editing original data? Thanks!!!!
Works for me. What are you doing differently?
data have;
do value=1,2,9,.;
output;
end;
stop;
run;
proc format;
value myformat
1='Yes'
2='No'
other=' ';
run;
proc freq data=have ;
table value;
format value myformat.;
run;
@Patrick wrote:
Works for me. What are you doing differently?
Since the OP is not showing all of his code I suspect he is doing something more like:
data have;
input value frequency;
datalines;
1 33
2 22
9 11
;
proc format;
value myformat
1='Yes'
2='No'
other=' ';
run;
proc print data=have ;
format value myformat.;
run;
Which yields:
Obs value frequency
1 Yes 33
2 No 22
3 11
So for @clover , you may have to go back a step earlier and use the format when creating the frequencies not after. However you may end up having to "edit the data", often referred to as recode values into new variables with the 1,2 and missing values for other operations for more consistent results.
Or show more of your code.
Hi ballardw, thanks for your reply.
Yes, I formatted the data before I run frequency, and get the same frequency table as yours.
However, what I want is the table Patrick got, indicating 9 as missing, not ' ' in the frequency table.
If there is no way doing that in format, I will just re-code the data.
Hi Patrick,
I am doing the same thing as you.
I tried different examples and find out that:
if the original data has a missing value, as the example you gave(1,2,9, and .), then there would be no problem, '9' will be given to missing.
if the original data doesn't have missing value, as the example I gave (1,2, and 9), then '9' will be given to ' ' in the frequency table.
@clover wrote:
Hi Patrick,
I am doing the same thing as you.
I tried different examples and find out that:
if the original data has a missing value, as the example you gave(1,2,9, and .), then there would be no problem, '9' will be given to missing.
if the original data doesn't have missing value, as the example I gave (1,2, and 9), then '9' will be given to ' ' in the frequency table.
Looks like Proc Freq is first still looking at the internal values. It's - sort of - documented behaviour here.
I guess you will have to create a new variable - newvar=put(var,myformat.); - for use in Proc Format to get what you want.
When you use PROC FORMAT to create a user-written format that combines missing and nonmissing values into one category, PROC FREQ treats the entire category of formatted values as missing. For example, a questionnaire codes 1 as yes, 2 as no, and 8 as a no answer. The following PROC FORMAT statements create a user-written format:
proc format; value Questfmt 1 ='Yes' 2 ='No' 8,. ='Missing'; run;When you use a FORMAT statement to assign
Questfmt.to a variable, the variable’s frequency table no longer includes a frequency count for the response of 8. You must use the MISSING or MISSPRINT option in the TABLES statement to list the frequency for no answer. The frequency count for this level includes observations with either a value of 8 or a missing value (.).
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.