Dear Community,
I'm trying to create a frequency table using the following sample dataset.
For subjects, var1-var3 are variables measured on different days (day variable). I would like to generate a frequency count for how many subjects with value=1 on any given day per study (startified by study).
Any help to achieve the output is greatly appreciated.
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
Any help to achieve the example output is greatly appreciated.
day | day | day | day | day | day | day | |
-4 | 1 | 3 | 8 | 15 | 17 | 27 | |
var1 | |||||||
var2 | |||||||
var3 |
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
PROC MEANS, Example 2 & 10 in the documentation- and since you're using 1 and 0 you can use SUM, MEAN and N statistics for what you want.
SUM = Number of Value = 1
Mean = Percent = 1
N = Number of responses, 0 & 1's.
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
.
Rather than *day try , day in the PROC TABULATE - look at the example again and note where the comma's are placed.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.