Hi,
I ran the following proc and got the data below.
proc freq data=ds1;
table var1;
by time group;
run;
data output;
input time group var1 count percentage;
datalines;
0 1 1 3 42.8
0 1 2 4 57.14
0 2 1 7 58.3
0 2 2 5 41.6
3 1 1 2 28.6
3 1 2 5 71.5
3 2 1 1 50
3 2 2 1 50
6 1 2 3 100
6 2 1 2 100
9 1 2 2 100
9 2 1 2 66.6
9 2 2 2 33.3
;
run;
Var1 is a categorical variable (1=yes, 2=no, 3=unknown). My question is if there is a way to use proc freq or if a data step is what is needed to get the desired output. These are the points to consider:
1) This is data grouped by window and by group. The percentage for each category of var1 is calculated in the above proc freq by: (count of category/sum of counts with categories with answers)*100.
2) However, we need to calculate the percentages by taking into account that the sample is 12. That is for the first row, the percentage should be calculated by (count of category/12)*100= 25%.Second row the percentage should be 33.33%.
3) A third category should be created to account for the unknown in the sample size of 12 (12-(3+4)=5). That means for the first group, the percentages should be 1=25%, 2=33.3%, 3= 41.6%. But since since this step is just one before other data wrangling, new rows have to be created (all rows were var1=3):
data want;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 7 58.33333333
0 2 2 5 41.66666667
0 2 3 0 0
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
Thank you so much!
Hi, I was presented with another scenario of data.
data output;
input time group var1 count percentage;
datalines;
0 1 1 3 42.8
0 1 2 4 57.14
0 2 1 5 45.45
0 2 2 5 45.45
0 2 3 1 9.09
3 1 1 2 28.6
3 1 2 5 71.5
3 2 1 1 50
3 2 2 1 50
6 1 2 3 100
6 2 1 2 100
9 1 2 2 100
9 2 1 2 66.6
9 2 2 2 33.3
;
run;
The difference is that some datasets will have data for var1=3 (see above). How can you add the counts for the same var1=3? Right now if I run the code as it is I get data want as this:
data out;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 5 41.66666667
0 2 2 5 41.66666667
0 2 3 1 8.333333333
0 2 3 1 8.333333333
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
But what I need is this:
data want;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 5 41.66666667
0 2 2 5 41.66666667
0 2 3 2 16.66666667
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
Can the code provided be modified? Or how can data want be accomplished?
Assuming the data are sorted by TIME/GROUP/VAR1, then if you already have a VAR1=3 frequency, it will be the observation in which last.group=1.
In such cases you don't want to do the extra OUTPUT statement.
So, in the DO group that checks for last.group=1, just change that IF test from
IF last.group then do:
to a test that only works when last.group=1 and var1 is not already equal to 3.
Hi! Sorry for not getting back earlier. I deleted the second output, but the issue is that it only reports the 3's that were originally in the dataset and not the 3's derived from counting the observations from 1 and 2.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.