BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pessi
Obsidian | Level 7

Hi,

when I write the below code.

proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;
run;

i am getting below output.But I want NE which is in column(var-res) also to be in row(study_res) with all zero values. 

Table of Var_Res by Study_Res

Var_Res(Var_Res)

Study_Res(Study_Res)

Frequency

 PR

PD

    SD

UN

Total

NE

0

0

3

0

3

PR

11

0

3

0

14

PD

0

10

9

0

19

SD

8

10

81

0

99

UN

0

0

0

35

35

Total

19

20

96

35

170

 

Expected output should be like this:

Table of Var_Res by Study_Res
Var_Res(Var_Res)Study_Result(Study_Res)
FrequencyNE PRPD    SDUNTotal
NE000303
PR01103014
PD00109019
SD081081099
UN00003535
Total019209635170

 

2)I should get all of these 5 NE,PR,PD,SD,UN. when there is no data for them, it should show as zeros.

proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;

reader;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.


If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.

--
Paige Miller

View solution in original post

11 REPLIES 11
PaigeMiller
Diamond | Level 26

Does your raw data have NE in variable STUDY_RES?

--
Paige Miller
pessi
Obsidian | Level 7
No..
NE isn't there STUDY_RES..
NE is present in only VAR_RES.
PaigeMiller
Diamond | Level 26

That's why you don't get NE in STUDY_RES in the output table.

 

You can trick PROC FREQ by adding a row to your raw data set that has NE in STUDY_RES. For this to work, you would have to assign a weight variable with a value of 0 to this row, and a value of 1 to all other rows. Then you would add a WEIGHT statement that uses this weight variable to PROC FREQ.

--
Paige Miller
pessi
Obsidian | Level 7
If it is for only one time then I can do..But, I need to get proc freq by reader. Then some readers missing not just NE but also others like PR,PD.. example below for Reader John.
he is missing NE and PR. in place of them it should be 0. Is there any way to do that..?
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency PD SD UN Total
PD 6 4 0 10
SD 4 11 0 15
UN 0 0 8 8
Total 10 15 8 33
PaigeMiller
Diamond | Level 26

Are you saying (and do I understand this properly) that each time you have to run this analysis, there could be different individuals, and sometimes some of the individuals appear only in VAR_RES and sometimes some of them appear only in STUDY_RES?

--
Paige Miller
pessi
Obsidian | Level 7
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.
is it possible even through any other procedure..?
What I am getting:
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency PD SD UN Total
PD 6 4 0 10
SD 4 11 0 15
UN 0 0 8 8
Total 10 15 8 33

Expected:
Reader=John
Table of Var_Res by Study_Res
Var_Res Study_Res(Study_Res)
Frequency NE PR PD SD UN Total
NE 0 0 0 0 0 0
PR 0 0 0 0 0 0
PD 0 0 6 4 0 10
SD 0 0 4 11 0 15
UN 0 0 0 0 8 8
Total 0 0 10 15 8 33
PaigeMiller
Diamond | Level 26

@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.


If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.

--
Paige Miller
pessi
Obsidian | Level 7
Every time, I will get these 5 only. So do I need to add for each reader what ever they are missing(either NE or PD or PD) and weight as 0? Let me try that.
Watts
SAS Employee

PROC FREQ includes observations that have 0 weights if you specify the ZEROS option in the WEIGHT statement. Otherwise, PROC FREQ ignores observations that have 0 weights.

 

pessi
Obsidian | Level 7
Yes..It worked. Thanks @Paige Miller.

As said by LenoxPaces_SAS, I have kept ZEROS option as well.
ballardw
Super User

You may want to consider another procedure that has options to handle some of this.

Proc Tabulate for instance can use a PRELOADFMT option with class variables to include a category defined by a variable that does not occur in the actual data and the ability to display a zero that isn't actually there.

 

The following code creates a small example data set with two categorical variables that will have the same format applied with more categories that actually appear in the data, a format to display them and an example with Proc Tabulate to display the counts. Note: this may not do what you likely want if you want percentages.

 

data have;
   length cat1 cat2 $ 3;
   do Cat1 = 'A', 'B','C';
     do cat2 = 'A','B';
        do i= 1 to (rand('integer',4));
         output;
       end;
     end;
   end;
run;

proc format;
value $cat
'A' = 'A'
'B' = 'B'
'C' = 'C'
'D' = 'D'
;
run;

proc tabulate data=have;
   class cat1 cat2/preloadfmt missing;
   format cat1 cat2 $cat.;
   table cat1,
         cat2*n=' '
         /printmiss misstext='0'
   ;
run;

To use this you need a format that displays all of the values you want, use the preloadfmt option and one or both of the order=data or Printmiss option. The Misstext option is text to display when there is no value to calculate the statistic. The different variables could have there own formats or apply the Preloadfmt to only one using separate Class statements.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 1373 views
  • 1 like
  • 4 in conversation