BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tinghlin
Fluorite | Level 6

I have the following data:

 

TEACHER_IDCLASS_IDX
T1C14
T1C13
T1C23
T1C23
T1C24
T1C32
T1C33
T2C44
T2C44
T2C43
T2C42
T2C52
T2C52


There are two teachers: T1, T2.

T1 teaches three classes (C1, C2, and C3).  In C1, there are 2 students, in C2, there are 3 students, and n C3, there are 2 students, 

T2 teaches two classes (C4, and C5).  In C4, there are 4 students, in C5, there are 2 students,

X is the score. 

Now, how do I write a program to calculate number of  classes and number of students taught by each teacher?

and create a new data set as follows:

 

TEACHER_ID    Nclass   Nstudents 

T1                        3             7

T2                        2             6

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Easy in a DATA step:

data have;
input TEACHER_ID $ CLASS_ID $ X;
datalines;
T1  C1  4
T1  C1  3
T1  C2  3
T1  C2  3
T1  C2  4
T1  C3  2
T1  C3  3
T2  C4  4
T2  C4  4
T2  C4  3
T2  C4  2
T2  C5  2
T2  C5  2
;

data want;
set have;
by teacher_id class_id;
if first.teacher_id
then do;
  nclass = 0;
  nstudents = 0;
end;
if first.class_id then nclass + 1;
nstudents + 1;
if last.teacher_id;
keep teacher_id nclass nstudents;
run;

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Easy in a DATA step:

data have;
input TEACHER_ID $ CLASS_ID $ X;
datalines;
T1  C1  4
T1  C1  3
T1  C2  3
T1  C2  3
T1  C2  4
T1  C3  2
T1  C3  3
T2  C4  4
T2  C4  4
T2  C4  3
T2  C4  2
T2  C5  2
T2  C5  2
;

data want;
set have;
by teacher_id class_id;
if first.teacher_id
then do;
  nclass = 0;
  nstudents = 0;
end;
if first.class_id then nclass + 1;
nstudents + 1;
if last.teacher_id;
keep teacher_id nclass nstudents;
run;
tinghlin
Fluorite | Level 6

thanks for  your reply.   I found some extra problem 


SMTT TEACHER_ID CLASS_ID X
1 T1 C1 4
1 T1 C1 3
1 T1 C2 3
2 T1 C2 3
2 T1 C2 4
2 T1 C3 2
2 T1 C3 3
1 T2 C4 4
1 T2 C4 4
1 T2 C4 3
1 T2 C4 2
2 T2 C5 2
2 T2 C5 2

SMTR is semester 1 and 2.

There are two teachers: T1, T2.

In semester 1, T1 teaches 2 classes (C1, C2). In C1, there are 2 students, in C2, 1 students.

In semester 2, T1 teaches 2 classes (C2, C3). In C2, there are 2 students, in C3, 2 student


lass C2 should be counted as two distinct classes since they are offered by two different semesters.

X is the score.


Now, how do I write a program to calculate distinct number of classes and number of students taught by each teacher?

and create a new data set as follows:

TEACHER_ID Nclass Nstudents

T1 4 7

T2 2 6

RichardDeVen
Barite | Level 11

SQL can count distinct combinations by concatenating the combination parts with a delimiter.  The delimiter ensures distinct combinations.

 

Example:

proc sql;
  create table want as
  select 
    teacher_id
  , count (distinct catx('~',smtt,teacher_id,class_id)) as Nclass
  , count (x) as Nstudents
  from have
  group by teacher_id
  ;

You can also do serial processing in DATA Step using nested DOW loops

proc sort data=have;
  by teacher_id smtt class_id;
run;

data want;
  length teacher_id $8;
  do Nclass = sum(Nclass,1) by 1 until (last.teacher_id);
    do Nstudents = sum(Nstudents,1) by 1 until (last.class_id);
      set have;
      by teacher_id smtt class_id;
    end;
  end;
  keep teacher_id Nclass Nstudents;
run;

 

novinosrin
Tourmaline | Level 20
data have;
input (TEACHER_ID	CLASS_ID) ($)	X;
cards;
T1	C1	4
T1	C1	3
T1	C2	3
T1	C2	3
T1	C2	4
T1	C3	2
T1	C3	3
T2	C4	4
T2	C4	4
T2	C4	3
T2	C4	2
T2	C5	2
T2	C5	2
;

proc sql;
create table want as
select TEACHER_ID,count(distinct class_id) as Nclass, count(class_id) as Nstudents
from have
group by TEACHER_ID;
quit;
RichardDeVen
Barite | Level 11

The statistical procedures such as FREQ MEANS SUMMARY UNIVARIATE do not count distinct values (as you want for CLASS_ID).

 

You can use SQL to directly compute distinct counts

data have;
input teacher_id $ class_id $ x;
datalines;
T1	C1	4
T1	C1	3
T1	C2	3
T1	C2	3
T1	C2	4
T1	C3	2
T1	C3	3
T2	C4	4
T2	C4	4
T2	C4	3
T2	C4	2
T2	C5	2
T2	C5	2
;

proc sql;
  create table want as
  select teacher_id, count(distinct class_id) as Nclass, count(X) as Nstudents
  from have
  group by teacher_id
  ;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2425 views
  • 3 likes
  • 4 in conversation