BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
turcay
Lapis Lazuli | Level 10

Hello everyone,

 

I have a sample dataset. I want to see count of “Column1” Variable values. When I use PROC FREQ it creates what I want but I also want to see missing values as zero(0). I added my desired output. I’m not sure maybe I need to create PROC FORMAT or another procedure.

 

data have;
length Column1 8;
infile datalines missover dlm=",";
input Column1;
datalines;
2
4
5
2
4
5
2
4
5
;
Run;
Proc freq data=have;
table Column1 / out=want;
run;

Want.png

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

I think you are looking for an equivalent of PRELOADFMT for PROC FREQ. But to my knowledge this does not exist, in spite of numerous discussions, papers and SASware ballot entries about this blatantly missing feature.

 

In many cases you will have a format which contains all possible categories. (The format labels are irrelevant.) So, you can use (part of) the CNTLOUT dataset of that format for a left join with the PROC FREQ output dataset, as shown in the following example, based on your dataset WANT:

proc format;
value myfmt
1='1'
2='2'
3='3'
4='4'
5='5'
;
run;

proc format cntlout=fmt(keep=start);
select myfmt;
run;

proc sql;
create table want1 as
select input(start, 16.)     as column1,
       coalesce(a.count,0)   as count,
       coalesce(a.percent,0) as percent
from fmt left join want a
on a.column1=input(start, 16.);
quit;

As an alternative to the PROC SQL step, you can use one or two data steps: one for converting the character values of variable START to numeric values (this step could be omitted if the "COLUMN1" variable was character and the values of the renamed START variable matched COLUMN1 values) and one MERGE step as in the following example:

data numfmt;
set fmt;
column1=input(start, 16.);
drop start;
run;

data want1;
merge want(in=a)
      numfmt;
by column1;
if ~a then do;
  count=0;
  percent=0;
end;
run;

The advantage of the data step solution is that you work with the original variables of WANT. In particular, COUNT and PERCENT keep their original labels.

View solution in original post

7 REPLIES 7
FreelanceReinh
Jade | Level 19

I think you are looking for an equivalent of PRELOADFMT for PROC FREQ. But to my knowledge this does not exist, in spite of numerous discussions, papers and SASware ballot entries about this blatantly missing feature.

 

In many cases you will have a format which contains all possible categories. (The format labels are irrelevant.) So, you can use (part of) the CNTLOUT dataset of that format for a left join with the PROC FREQ output dataset, as shown in the following example, based on your dataset WANT:

proc format;
value myfmt
1='1'
2='2'
3='3'
4='4'
5='5'
;
run;

proc format cntlout=fmt(keep=start);
select myfmt;
run;

proc sql;
create table want1 as
select input(start, 16.)     as column1,
       coalesce(a.count,0)   as count,
       coalesce(a.percent,0) as percent
from fmt left join want a
on a.column1=input(start, 16.);
quit;

As an alternative to the PROC SQL step, you can use one or two data steps: one for converting the character values of variable START to numeric values (this step could be omitted if the "COLUMN1" variable was character and the values of the renamed START variable matched COLUMN1 values) and one MERGE step as in the following example:

data numfmt;
set fmt;
column1=input(start, 16.);
drop start;
run;

data want1;
merge want(in=a)
      numfmt;
by column1;
if ~a then do;
  count=0;
  percent=0;
end;
run;

The advantage of the data step solution is that you work with the original variables of WANT. In particular, COUNT and PERCENT keep their original labels.

Reeza
Super User

There are several ways and they're well documented in this paper.

http://www.lexjansen.com/nesug/nesug11/cc/cc29.pdf

 

An additional note - if the value is listed in any of the levels you can use the SPARSE option in proc freq. But if it's not available anywhere then you'll have to use a different method.

FreelanceReinh
Jade | Level 19

@Reeza: Sure, the SPARSE option is great, if you have a cross tabulation and there are no rows or columns with only zero frequencies. Given @turcay's HAVE dataset, it would not help, even if one cross-tabulated Column1 with a dummy variable.

turcay
Lapis Lazuli | Level 10

Hello everyone,

 

First of all , thank you for your helps. I checked all your advises and @FreelanceReinh's sample codes helped me and I created what I want. It is always good to see sample code to understand best. I'm not familiar with COALESCE function I checked this link https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002518172.htm and as far as I understand It returns numeric(specified) value instead of missing values. Your two method was beneficial for me but I used your first method. 

 

Thanks eveyrbody.

Reeza
Super User
Coalesce will check a list of variables, it can be more than two and assigns the first non-missing value it encounters.

ie coalesce(var1, var2, var3)
if var1 & var2 are missing then it returns var3
if var1 is missing then it returns var2
turcay
Lapis Lazuli | Level 10

Thank you @Reeza

 

Firstly, I couldn't understand well after I examined detailed I thought like your expression now I became clear.

 

Thanks again.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2558 views
  • 2 likes
  • 4 in conversation