Dear Community,
I'm trying to create a frequency table using the following sample dataset.
For subjects, var1-var3 are variables measured on different days (day variable). I would like to generate a frequency count for how many subjects with value=1 on any given day per study (startified by study).
Any help to achieve the output is greatly appreciated.
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
Any help to achieve the example output is greatly appreciated.
day | day | day | day | day | day | day | |
-4 | 1 | 3 | 8 | 15 | 17 | 27 | |
var1 | |||||||
var2 | |||||||
var3 |
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
PROC MEANS, Example 2 & 10 in the documentation- and since you're using 1 and 0 you can use SUM, MEAN and N statistics for what you want.
SUM = Number of Value = 1
Mean = Percent = 1
N = Number of responses, 0 & 1's.
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
.
Rather than *day try , day in the PROC TABULATE - look at the example again and note where the comma's are placed.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.