Dear Community,
I'm trying to create a frequency table using the following sample dataset.
For subjects, var1-var3 are variables measured on different days (day variable). I would like to generate a frequency count for how many subjects with value=1 on any given day per study (startified by study).
Any help to achieve the output is greatly appreciated.
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
Any help to achieve the example output is greatly appreciated.
| day | day | day | day | day | day | day | |
| -4 | 1 | 3 | 8 | 15 | 17 | 27 | |
| var1 | |||||||
| var2 | |||||||
| var3 |
Proc tabulate allows nesting and grouping in different manners.
See:
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
proc tabulate data=test;
class stud id day;
var var1 var2 var3;
/* this does separate table for each level of study*/
table stud='Study:',
id,
day*(var1 var2 var3)*(sum=""*f=best5.)
/misstext=' '
;
/* nests id within levels of stud*/
table stud='Study' * id,
day*(var1 var2 var3)*(sum=""*f=best5.)
/misstext=' '
;
run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
PROC MEANS, Example 2 & 10 in the documentation- and since you're using 1 and 0 you can use SUM, MEAN and N statistics for what you want.
SUM = Number of Value = 1
Mean = Percent = 1
N = Number of responses, 0 & 1's.
Proc tabulate allows nesting and grouping in different manners.
See:
data test;
input id var1 var2 var3 day stud;
datalines;
100101 1 1 1 -4 1
100101 1 0 0 1 1
100101 1 0 0 3 1
100101 1 1 1 8 1
100101 1 0 0 15 1
100101 1 0 0 17 1
100101 0 1 1 27 1
100103 1 1 1 -5 1
100103 1 0 0 1 1
100103 1 0 0 3 1
100103 0 1 1 7 1
100106 1 1 1 -4 1
100106 1 0 0 1 1
100106 1 0 0 3 1
100106 1 1 1 8 1
100106 1 0 0 15 1
100106 1 0 0 17 1
100106 1 1 1 26 1
1000101 1 1 1 -13 2
1000101 1 0 0 -8 2
1000101 1 0 0 0 2
1000101 1 0 0 1 2
1000101 1 0 0 1 2
1000101 0 1 1 7 2
1000101 1 0 0 8 2
1000101 1 1 1 15 2
1000101 1 0 0 111 2
1000103 1 1 1 -18 2
1000103 1 0 0 -9 2
1000103 1 0 0 0 2
1000103 0 1 1 6 2
1000103 1 0 0 7 2
1000103 1 0 0 17 2
1000103 0 1 1 19 2
1000103 1 0 0 24 2
4900201 1 0 0 -9 2
4900201 0 1 1 -9 2
4900201 1 0 0 -8 2
4900201 1 0 0 0 2
4900201 1 0 0 1 2
4900201 1 0 0 2 2
4900201 0 1 1 6 2
4900201 1 0 0 7 2
4900201 1 0 0 15 2
;
run;
proc tabulate data=test;
class stud id day;
var var1 var2 var3;
/* this does separate table for each level of study*/
table stud='Study:',
id,
day*(var1 var2 var3)*(sum=""*f=best5.)
/misstext=' '
;
/* nests id within levels of stud*/
table stud='Study' * id,
day*(var1 var2 var3)*(sum=""*f=best5.)
/misstext=' '
;
run;
As @Reeza sum gets the number of 1 values, there are a number of statistics that could be requested as well. Place them inside the () around sum to group as in these tables.
.
Rather than *day try , day in the PROC TABULATE - look at the example again and note where the comma's are placed.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.