Hi,
I ran the following proc and got the data below.
proc freq data=ds1;
table var1;
by time group;
run;
data output;
input time group var1 count percentage;
datalines;
0 1 1 3 42.8
0 1 2 4 57.14
0 2 1 7 58.3
0 2 2 5 41.6
3 1 1 2 28.6
3 1 2 5 71.5
3 2 1 1 50
3 2 2 1 50
6 1 2 3 100
6 2 1 2 100
9 1 2 2 100
9 2 1 2 66.6
9 2 2 2 33.3
;
run;
Var1 is a categorical variable (1=yes, 2=no, 3=unknown). My question is if there is a way to use proc freq or if a data step is what is needed to get the desired output. These are the points to consider:
1) This is data grouped by window and by group. The percentage for each category of var1 is calculated in the above proc freq by: (count of category/sum of counts with categories with answers)*100.
2) However, we need to calculate the percentages by taking into account that the sample is 12. That is for the first row, the percentage should be calculated by (count of category/12)*100= 25%.Second row the percentage should be 33.33%.
3) A third category should be created to account for the unknown in the sample size of 12 (12-(3+4)=5). That means for the first group, the percentages should be 1=25%, 2=33.3%, 3= 41.6%. But since since this step is just one before other data wrangling, new rows have to be created (all rows were var1=3):
data want;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 7 58.33333333
0 2 2 5 41.66666667
0 2 3 0 0
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
Thank you so much!
Hi, I was presented with another scenario of data.
data output;
input time group var1 count percentage;
datalines;
0 1 1 3 42.8
0 1 2 4 57.14
0 2 1 5 45.45
0 2 2 5 45.45
0 2 3 1 9.09
3 1 1 2 28.6
3 1 2 5 71.5
3 2 1 1 50
3 2 2 1 50
6 1 2 3 100
6 2 1 2 100
9 1 2 2 100
9 2 1 2 66.6
9 2 2 2 33.3
;
run;
The difference is that some datasets will have data for var1=3 (see above). How can you add the counts for the same var1=3? Right now if I run the code as it is I get data want as this:
data out;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 5 41.66666667
0 2 2 5 41.66666667
0 2 3 1 8.333333333
0 2 3 1 8.333333333
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
But what I need is this:
data want;
input time group var1 count percentage;
datalines;
0 1 1 3 25
0 1 2 4 33.33333333
0 1 3 5 41.66666667
0 2 1 5 41.66666667
0 2 2 5 41.66666667
0 2 3 2 16.66666667
3 1 1 2 16.66666667
3 1 2 5 41.66666667
3 1 3 5 41.66666667
3 2 1 1 8.333333333
3 2 2 1 8.333333333
3 2 3 10 83.33333333
6 1 2 3 25
6 1 3 9 75
6 2 1 2 16.66666667
6 2 3 10 83.33333333
9 1 2 2 16.66666667
9 1 3 10 83.33333333
9 2 1 2 16.66666667
9 2 2 2 16.66666667
9 2 3 8 66.66666667
;
run;
Can the code provided be modified? Or how can data want be accomplished?
Assuming the data are sorted by TIME/GROUP/VAR1, then if you already have a VAR1=3 frequency, it will be the observation in which last.group=1.
In such cases you don't want to do the extra OUTPUT statement.
So, in the DO group that checks for last.group=1, just change that IF test from
IF last.group then do:
to a test that only works when last.group=1 and var1 is not already equal to 3.
Hi! Sorry for not getting back earlier. I deleted the second output, but the issue is that it only reports the 3's that were originally in the dataset and not the 3's derived from counting the observations from 1 and 2.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.