Solved: Re: proc Freq 2 way tables

pessi · Posted 08-06-2020 03:22 AM

Hi,

when I write the below code.

proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;
run;

i am getting below output.But I want NE which is in column(var-res) also to be in row(study_res) with all zero values.

Table of Var_Res by Study_Res
Var_Res(Var_Res)	Study_Res(Study_Res)
Frequency	PR	PD	SD	UN	Total
NE	0	0	3	0	3
PR	11	0	3	0	14
PD	0	10	9	0	19
SD	8	10	81	0	99
UN	0	0	0	35	35
Total	19	20	96	35	170

Expected output should be like this:

Table of Var_Res by Study_Res
Var_Res(Var_Res)	Study_Result(Study_Res)
Frequency	NE	PR	PD	SD	UN	Total
NE	0	0	0	3	0	3
PR	0	11	0	3	0	14
PD	0	0	10	9	0	19
SD	0	8	10	81	0	99
UN	0	0	0	0	35	35
Total	0	19	20	96	35	170

2)I should get all of these 5 NE,PR,PD,SD,UN. when there is no data for them, it should show as zeros.

proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;

reader;
run;

PaigeMiller · Posted 08-06-2020 08:32 AM

@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.

If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.

--
Paige Miller

View solution in original post

PaigeMiller · Posted 08-06-2020 07:06 AM

Does your raw data have NE in variable STUDY_RES?

--
Paige Miller

pessi · Posted 08-06-2020 07:11 AM

No..
NE isn't there STUDY_RES..
NE is present in only VAR_RES.

PaigeMiller · Posted 08-06-2020 07:28 AM

That's why you don't get NE in STUDY_RES in the output table.

You can trick PROC FREQ by adding a row to your raw data set that has NE in STUDY_RES. For this to work, you would have to assign a weight variable with a value of 0 to this row, and a value of 1 to all other rows. Then you would add a WEIGHT statement that uses this weight variable to PROC FREQ.

--
Paige Miller

pessi · Posted 08-06-2020 07:41 AM

If it is for only one time then I can do..But, I need to get proc freq by reader. Then some readers missing not just NE but also others like PR,PD.. example below for Reader John.
he is missing NE and PR. in place of them it should be 0. Is there any way to do that..?
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency PD SD UN Total
PD 6 4 0 10
SD 4 11 0 15
UN 0 0 8 8
Total 10 15 8 33

PaigeMiller · Posted 08-06-2020 08:08 AM

Are you saying (and do I understand this properly) that each time you have to run this analysis, there could be different individuals, and sometimes some of the individuals appear only in VAR_RES and sometimes some of them appear only in STUDY_RES?

--
Paige Miller

pessi · Posted 08-06-2020 08:19 AM

I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.
is it possible even through any other procedure..?
What I am getting:
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency PD SD UN Total
PD 6 4 0 10
SD 4 11 0 15
UN 0 0 8 8
Total 10 15 8 33

Expected:
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency NE PR PD SD UN Total
NE 0 0 0 0 0 0
PR 0 0 0 0 0 0
PD 0 0 6 4 0 10
SD 0 0 4 11 0 15
UN 0 0 0 0 8 8
Total 0 0 10 15 8 33

PaigeMiller · Posted 08-06-2020 08:32 AM

@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.

If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.

--
Paige Miller

pessi · Posted 08-06-2020 08:52 AM

Every time, I will get these 5 only. So do I need to add for each reader what ever they are missing(either NE or PD or PD) and weight as 0? Let me try that.

Watts · Posted 08-06-2020 09:15 AM

PROC FREQ includes observations that have 0 weights if you specify the ZEROS option in the WEIGHT statement. Otherwise, PROC FREQ ignores observations that have 0 weights.

pessi · Posted 08-07-2020 06:04 AM

Yes..It worked. Thanks @Paige Miller.

As said by LenoxPaces_SAS, I have kept ZEROS option as well.

ballardw · Posted 08-06-2020 12:09 PM

You may want to consider another procedure that has options to handle some of this.

Proc Tabulate for instance can use a PRELOADFMT option with class variables to include a category defined by a variable that does not occur in the actual data and the ability to display a zero that isn't actually there.

The following code creates a small example data set with two categorical variables that will have the same format applied with more categories that actually appear in the data, a format to display them and an example with Proc Tabulate to display the counts. Note: this may not do what you likely want if you want percentages.

data have;
   length cat1 cat2 $ 3;
   do Cat1 = 'A', 'B','C';
     do cat2 = 'A','B';
        do i= 1 to (rand('integer',4));
         output;
       end;
     end;
   end;
run;

proc format;
value $cat
'A' = 'A'
'B' = 'B'
'C' = 'C'
'D' = 'D'
;
run;

proc tabulate data=have;
   class cat1 cat2/preloadfmt missing;
   format cat1 cat2 $cat.;
   table cat1,
         cat2*n=' '
         /printmiss misstext='0'
   ;
run;

To use this you need a format that displays all of the values you want, use the preloadfmt option and one or both of the order=data or Printmiss option. The Misstext option is text to display when there is no value to calculate the statistic. The different variables could have there own formats or apply the Preloadfmt to only one using separate Class statements.

Catch up on SAS Innovate 2026