Dear Community,
I'm trying to create a frequency table using the following sample dataset.
For subjects, var1-var3 are variables measured on different days (day variable). I would like to generate a frequency count for how many subjects with value=1 on any given day per study (startified by study).
Any help to achieve the output is greatly appreciated.
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
Any help to achieve the example output is greatly appreciated.
day | day | day | day | day | day | day | |
-4 | 1 | 3 | 8 | 15 | 17 | 27 | |
var1 | |||||||
var2 | |||||||
var3 |
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
PROC MEANS, Example 2 & 10 in the documentation- and since you're using 1 and 0 you can use SUM, MEAN and N statistics for what you want.
SUM = Number of Value = 1
Mean = Percent = 1
N = Number of responses, 0 & 1's.
Proc tabulate allows nesting and grouping in different manners.
See:
data test; input id var1 var2 var3 day stud; datalines; 100101 1 1 1 -4 1 100101 1 0 0 1 1 100101 1 0 0 3 1 100101 1 1 1 8 1 100101 1 0 0 15 1 100101 1 0 0 17 1 100101 0 1 1 27 1 100103 1 1 1 -5 1 100103 1 0 0 1 1 100103 1 0 0 3 1 100103 0 1 1 7 1 100106 1 1 1 -4 1 100106 1 0 0 1 1 100106 1 0 0 3 1 100106 1 1 1 8 1 100106 1 0 0 15 1 100106 1 0 0 17 1 100106 1 1 1 26 1 1000101 1 1 1 -13 2 1000101 1 0 0 -8 2 1000101 1 0 0 0 2 1000101 1 0 0 1 2 1000101 1 0 0 1 2 1000101 0 1 1 7 2 1000101 1 0 0 8 2 1000101 1 1 1 15 2 1000101 1 0 0 111 2 1000103 1 1 1 -18 2 1000103 1 0 0 -9 2 1000103 1 0 0 0 2 1000103 0 1 1 6 2 1000103 1 0 0 7 2 1000103 1 0 0 17 2 1000103 0 1 1 19 2 1000103 1 0 0 24 2 4900201 1 0 0 -9 2 4900201 0 1 1 -9 2 4900201 1 0 0 -8 2 4900201 1 0 0 0 2 4900201 1 0 0 1 2 4900201 1 0 0 2 2 4900201 0 1 1 6 2 4900201 1 0 0 7 2 4900201 1 0 0 15 2 ; run; proc tabulate data=test; class stud id day; var var1 var2 var3; /* this does separate table for each level of study*/ table stud='Study:', id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; /* nests id within levels of stud*/ table stud='Study' * id, day*(var1 var2 var3)*(sum=""*f=best5.) /misstext=' ' ; run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
.
Rather than *day try , day in the PROC TABULATE - look at the example again and note where the comma's are placed.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.