Hi,
when I write the below code.
proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;
run;
i am getting below output.But I want NE which is in column(var-res) also to be in row(study_res) with all zero values.
Table of Var_Res by Study_Res | |||||
Var_Res(Var_Res) | Study_Res(Study_Res) | ||||
Frequency | PR | PD | SD | UN | Total |
NE | 0 | 0 | 3 | 0 | 3 |
PR | 11 | 0 | 3 | 0 | 14 |
PD | 0 | 10 | 9 | 0 | 19 |
SD | 8 | 10 | 81 | 0 | 99 |
UN | 0 | 0 | 0 | 35 | 35 |
Total | 19 | 20 | 96 | 35 | 170 |
Expected output should be like this:
Table of Var_Res by Study_Res | ||||||
Var_Res(Var_Res) | Study_Result(Study_Res) | |||||
Frequency | NE | PR | PD | SD | UN | Total |
NE | 0 | 0 | 0 | 3 | 0 | 3 |
PR | 0 | 11 | 0 | 3 | 0 | 14 |
PD | 0 | 0 | 10 | 9 | 0 | 19 |
SD | 0 | 8 | 10 | 81 | 0 | 99 |
UN | 0 | 0 | 0 | 0 | 35 | 35 |
Total | 0 | 19 | 20 | 96 | 35 | 170 |
2)I should get all of these 5 NE,PR,PD,SD,UN. when there is no data for them, it should show as zeros.
proc freq data=inp_ds ;
tables var_res*study_res / nocol norow nopercent ;
reader;
run;
@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.
If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.
Does your raw data have NE in variable STUDY_RES?
That's why you don't get NE in STUDY_RES in the output table.
You can trick PROC FREQ by adding a row to your raw data set that has NE in STUDY_RES. For this to work, you would have to assign a weight variable with a value of 0 to this row, and a value of 1 to all other rows. Then you would add a WEIGHT statement that uses this weight variable to PROC FREQ.
Are you saying (and do I understand this properly) that each time you have to run this analysis, there could be different individuals, and sometimes some of the individuals appear only in VAR_RES and sometimes some of them appear only in STUDY_RES?
@pessi wrote:
I need to run analysis many times with different readers and different data.
For each reader VAR_RES and STUDY_RES will be there. But, they may not have all the 5 NE,PR,PD,SD,UN. When they miss 1 or 2, in the report it should be zeros. here i am giving you what output I got for John.and the expectation as well.
If it is always these exact 5: NE,PR,PD,SD,UN and the problem is that sometimes they are not there, then you need to add a record for NE and another record for PR and another record for PD and so on, these artificial records should have weight of 0, and all the real records should have a weight of 1. Then PROC FREQ with a WEIGHT statement will produce the desired results.
PROC FREQ includes observations that have 0 weights if you specify the ZEROS option in the WEIGHT statement. Otherwise, PROC FREQ ignores observations that have 0 weights.
You may want to consider another procedure that has options to handle some of this.
Proc Tabulate for instance can use a PRELOADFMT option with class variables to include a category defined by a variable that does not occur in the actual data and the ability to display a zero that isn't actually there.
The following code creates a small example data set with two categorical variables that will have the same format applied with more categories that actually appear in the data, a format to display them and an example with Proc Tabulate to display the counts. Note: this may not do what you likely want if you want percentages.
data have; length cat1 cat2 $ 3; do Cat1 = 'A', 'B','C'; do cat2 = 'A','B'; do i= 1 to (rand('integer',4)); output; end; end; end; run; proc format; value $cat 'A' = 'A' 'B' = 'B' 'C' = 'C' 'D' = 'D' ; run; proc tabulate data=have; class cat1 cat2/preloadfmt missing; format cat1 cat2 $cat.; table cat1, cat2*n=' ' /printmiss misstext='0' ; run;
To use this you need a format that displays all of the values you want, use the preloadfmt option and one or both of the order=data or Printmiss option. The Misstext option is text to display when there is no value to calculate the statistic. The different variables could have there own formats or apply the Preloadfmt to only one using separate Class statements.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.