Hello all,
I have initially posted this inappropriately, am sorry. I am creating a better representation of the issue so I can get help on this.
I have a dataset with category (teachers, students) and a list of forms (form2, event46, case32) which they are expected to fill within 60days after the school resume. Event46 has a group variable (grp=A, B, C). That is someone can be in any of the group. Among the teachers and students, some did not fill the form (0), some filled the form within 30 days (1), and some filled the form within 60 days (2).
The 0,1,2 represents the days that the form was filled below:
No form filled=0
Form filled within 0-30days=1
Form filled within 31-60days=2
This is what I want
Table 1:
1. If a teacher or student only filled form2 within 30days and did not fill event46 or case32 they should be in the form2 only column. Same apply for other forms.
If they filled the form after 30 days, they will be in the 31-60 days count by category.
2. If anyone filled more than one(1) forms either within 30 days or within 60 days or both, they should be in the multiple column.
Table 2:
3. Finally, I want a total of those who did not fill any form in both category(teacher or student).
I have created a sample output and a demo dataset. Thanks so much for your help.
data have;
input category $ form2 case32 event46 grp $;
datalines;
Teacher 1 0 0 .
Student 1 0 0 .
Teacher 0 1 0 .
Teacher 1 0 1 A
Teacher 0 0 1 C
Teacher 0 0 1 B
Teacher 1 0 0 .
Student 0 1 1 C
Student 0 0 1 C
Student 0 0 1 C
Student 1 0 0 .
Student 0 1 0 .
Student 0 2 0 .
Teacher 1 1 1 B
Teacher 0 0 2 A
Teacher 0 0 2 B
Teacher 2 0 0 .
Student 0 1 1 C
Teacher 0 0 0 .
Student 0 0 0 .
Student 0 0 1 B
Teacher 0 0 1 B
Teacher 0 2 0 .
Teacher 2 2 0 .
Student 0 1 0 .
Teacher 2 0 1 A
Student 0 0 1 A
Teacher 0 0 0 .
Student 2 0 0 .
Teacher 0 0 2 C
Student 0 0 2 C
Student 0 0 2 B
Student 0 0 2 A
;
run;
The expected output should look like this:
| Form2 only | Case32 only | Event46 A only | Event46 B only | Event46 C Only | Multiple | Total |
Teacher fill form within 0-30 days |
|
|
|
|
|
|
|
Teacher fill form within 31-60 days |
|
|
|
|
|
|
|
Student fill form within 0-30days |
|
|
|
|
|
|
|
Student fill form within 31-60days |
|
|
|
|
|
|
|
Total |
|
|
|
|
|
|
|
Total | |
Teacher who did not complete any form within 0-60 days |
|
Student who did not complete any form within 0-60 days |
|
Total |
|
To just transform the data from wide to long (=1 row per form whether filled out or not) without special treatment for cases with no form provided, you could also consider below approach. The main difference to the previous version is the addition of a form_ind variable (0 or 1) that then gets used in Proc Tabulate for SUM instead of N (row count).
data long;
set have;
array vars{*} form2 case32 event46;
length form_name $32;
length report_form_name $32;
do _i=1 to dim(vars);
form_cnt=sum(0,form_cnt, vars(_i)>0);
max_day_grp=max(max_day_grp,vars(_i));
end;
if form_cnt=0 then max_day_grp=99;
do form_id=1 to dim(vars);
form_ind =vars(form_id)>0;
form_name =propcase(vname(vars(form_id)));
if form_cnt=0 then report_form_name='No form';
else if form_cnt>1 then report_form_name='Multiple Forms';
else if form_name='Event46' then report_form_name=catx('_',form_name,grp);
else report_form_name=form_name;
day_grp=vars(form_id);
output;
end;
drop form2 case32 event46 _i;
run;
proc format;
value form_type
1 ='Form2'
2 ='Case32'
3 ='Event46'
99='Multiple'
;
value day_grp
0 ='No form'
1 ='0-30 days'
2 ='31-60 days'
99='Did not complete any form within 0-60 days'
;
value form_cnt
1 ='Single'
2-high='Multi-Form'
;
run;
%let sv_missing=%sysfunc(getoption(missing,keyword));
options missing=' ';
title 'Table 1';
proc tabulate data=long(where=(form_ind=1));
class category report_form_name form_cnt max_day_grp;
var form_ind;
format form_cnt form_cnt. max_day_grp day_grp.;
table
category=' '*max_day_grp=' ' all='Total',
(report_form_name=' ' all='Total')*form_ind=' '*sum=' '*f=best32.0
;
run;
title;
title 'Table 2';
proc tabulate data=long(where=(form_cnt=0 and form_id=1));
class category max_day_grp;
format max_day_grp day_grp.;
table
category=' '*max_day_grp=' ' all='Total',
all='Total'*n=' '
;
run;
title;
options &sv_missing;
One way to go is to reshape your data so it's easy to use with Proc's like Tabulate or Report.
Below code is fully working with the sample data you shared.
data prep;
set have;
array vars{*} form2 case32 event46;
length form_name $32;
length report_form_name $32;
do _i=1 to dim(vars);
form_cnt=sum(form_cnt, vars(_i)>0);
max_day_grp=max(max_day_grp,vars(_i));
end;
if form_cnt=0 then
do;
max_day_grp=99;
output;
end;
else
if form_cnt>0 then
do;
do form_id=1 to dim(vars);
if vars(form_id)>0 then
do;
form_name=propcase(vname(vars(form_id)));
day_grp=vars(form_id);
if form_cnt=1 then report_form_id=form_id;
else report_form_id=99;
if form_cnt=1 then
do;
report_form_name=catx('_',form_name,grp);
report_grp=grp;
end;
else
do;
report_form_name='Multiple';
report_grp=' ';
end;
output;
end;
end;
end;
drop form2 case32 event46 _i;
run;
proc format;
value form_type
1 ='Form2'
2 ='Case32'
3 ='Event46'
99='Multiple'
;
value day_grp
1 ='0-30 days'
2 ='31-60 days'
99='Did not complete any form within 0-60 days'
;
value form_cnt
1 ='Single'
2-high='Multi-Form'
;
run;
%let sv_missing=%sysfunc(getoption(missing,keyword));
options missing=' ';
title 'Table 1: Option 1';
proc tabulate data=prep(where=(form_cnt>0)) missing;
class category report_grp report_form_id form_cnt max_day_grp;
format max_day_grp day_grp. report_form_id form_type. form_cnt form_cnt.;
table
category=' '*max_day_grp=' ' all='Total',
(form_cnt=' '*report_grp=' '*report_form_id=' ' all='Total')*n=' '
;
run;
title;
title 'Table 1: Option 2';
proc tabulate data=prep(where=(form_cnt>0)) missing;
class category report_form_name form_cnt max_day_grp;
format max_day_grp day_grp. report_form_id form_type. form_cnt form_cnt.;
table
category=' '*max_day_grp=' ' all='Total',
(form_cnt=' '*report_form_name=' ' all='Total')*n=' '
;
run;
title;
title 'Table 2';
proc tabulate data=prep(where=(form_cnt=0)) missing;
class category max_day_grp;
format max_day_grp day_grp.;
table
category=' '*max_day_grp=' ' all='Total',
all='Total'*n=' '
;
run;
title;
options &sv_missing;
@Patrick Why has the observation increases from 33 to 40?
To just transform the data from wide to long (=1 row per form whether filled out or not) without special treatment for cases with no form provided, you could also consider below approach. The main difference to the previous version is the addition of a form_ind variable (0 or 1) that then gets used in Proc Tabulate for SUM instead of N (row count).
data long;
set have;
array vars{*} form2 case32 event46;
length form_name $32;
length report_form_name $32;
do _i=1 to dim(vars);
form_cnt=sum(0,form_cnt, vars(_i)>0);
max_day_grp=max(max_day_grp,vars(_i));
end;
if form_cnt=0 then max_day_grp=99;
do form_id=1 to dim(vars);
form_ind =vars(form_id)>0;
form_name =propcase(vname(vars(form_id)));
if form_cnt=0 then report_form_name='No form';
else if form_cnt>1 then report_form_name='Multiple Forms';
else if form_name='Event46' then report_form_name=catx('_',form_name,grp);
else report_form_name=form_name;
day_grp=vars(form_id);
output;
end;
drop form2 case32 event46 _i;
run;
proc format;
value form_type
1 ='Form2'
2 ='Case32'
3 ='Event46'
99='Multiple'
;
value day_grp
0 ='No form'
1 ='0-30 days'
2 ='31-60 days'
99='Did not complete any form within 0-60 days'
;
value form_cnt
1 ='Single'
2-high='Multi-Form'
;
run;
%let sv_missing=%sysfunc(getoption(missing,keyword));
options missing=' ';
title 'Table 1';
proc tabulate data=long(where=(form_ind=1));
class category report_form_name form_cnt max_day_grp;
var form_ind;
format form_cnt form_cnt. max_day_grp day_grp.;
table
category=' '*max_day_grp=' ' all='Total',
(report_form_name=' ' all='Total')*form_ind=' '*sum=' '*f=best32.0
;
run;
title;
title 'Table 2';
proc tabulate data=long(where=(form_cnt=0 and form_id=1));
class category max_day_grp;
format max_day_grp day_grp.;
table
category=' '*max_day_grp=' ' all='Total',
all='Total'*n=' '
;
run;
title;
options &sv_missing;
@Patrick Can does blank spaces in table 1 be converted to 0. So I can as well have the count of missing in table 1 although I will actually be counting it in table 2. Thanks
To display missings as zero just use:
options missing='0';
@Patrick How can anyone with a value of 1 be counted as a single case. For example row 33 in our data. When form2=2,case32=0 and event46=1, the report_form_name should be Event46A instead of multiple.
Anyone with 2 double digits e.g., 110, 220 should remain as multiple.
@CathyVI wrote:
@Patrick How can anyone with a value of 1 be counted as a single case. For example row 33 in our data. When form2=2,case32=0 and event46=1, the report_form_name should be Event46A instead of multiple.
Anyone with 2 double digits e.g., 110, 220 should remain as multiple.
So now it's about counting cases and no more forms which totally changes the story!
How would you want to report cases 112, 121, 211, 122, 212, 221? Please be specific and define all your cases. Provide both the variable they should appear under AND the time slot if there are multiple.
It would help if you could provide below table fully populated. With this data one can create formats which then should make the rest of the coding rather simple.
Multiple patterns can of course belong to the same case (case_no). It would then also help if you provide sample data for all the patterns.
Obs |
pattern | case_no | time_slot | description |
1 | 000 | |||
2 | 001 | |||
3 | 002 | |||
4 | 010 | |||
5 | 011 | |||
6 | 012 | |||
7 | 020 | |||
8 | 021 | |||
9 | 022 | |||
10 | 100 | |||
11 | 101 | |||
12 | 102 | |||
13 | 110 | |||
14 | 111 | |||
15 | 112 | |||
16 | 120 | |||
17 | 121 | |||
18 | 122 | |||
19 | 200 | |||
20 | 201 | |||
21 | 202 | |||
22 | 210 | |||
23 | 211 | |||
24 | 212 | |||
25 | 220 | |||
26 | 221 | |||
27 | 222 |
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.