Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Re: counting number of observations

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 09-29-2020 06:18 AM
(1250 views)

I have the following data:

TEACHER_ID | CLASS_ID | X |

T1 | C1 | 4 |

T1 | C1 | 3 |

T1 | C2 | 3 |

T1 | C2 | 3 |

T1 | C2 | 4 |

T1 | C3 | 2 |

T1 | C3 | 3 |

T2 | C4 | 4 |

T2 | C4 | 4 |

T2 | C4 | 3 |

T2 | C4 | 2 |

T2 | C5 | 2 |

T2 | C5 | 2 |

There are two teachers: T1, T2.

T1 teaches three classes (C1, C2, and C3). In C1, there are 2 students, in C2, there are 3 students, and n C3, there are 2 students,

T2 teaches two classes (C4, and C5). In C4, there are 4 students, in C5, there are 2 students,

X is the score.

Now, how do I write a program to calculate number of classes and number of students taught by each teacher?

and create a new data set as follows:

TEACHER_ID Nclass Nstudents

T1 3 7

T2 2 6

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Easy in a DATA step:

```
data have;
input TEACHER_ID $ CLASS_ID $ X;
datalines;
T1 C1 4
T1 C1 3
T1 C2 3
T1 C2 3
T1 C2 4
T1 C3 2
T1 C3 3
T2 C4 4
T2 C4 4
T2 C4 3
T2 C4 2
T2 C5 2
T2 C5 2
;
data want;
set have;
by teacher_id class_id;
if first.teacher_id
then do;
nclass = 0;
nstudents = 0;
end;
if first.class_id then nclass + 1;
nstudents + 1;
if last.teacher_id;
keep teacher_id nclass nstudents;
run;
```

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Easy in a DATA step:

```
data have;
input TEACHER_ID $ CLASS_ID $ X;
datalines;
T1 C1 4
T1 C1 3
T1 C2 3
T1 C2 3
T1 C2 4
T1 C3 2
T1 C3 3
T2 C4 4
T2 C4 4
T2 C4 3
T2 C4 2
T2 C5 2
T2 C5 2
;
data want;
set have;
by teacher_id class_id;
if first.teacher_id
then do;
nclass = 0;
nstudents = 0;
end;
if first.class_id then nclass + 1;
nstudents + 1;
if last.teacher_id;
keep teacher_id nclass nstudents;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

thanks for your reply. I found some extra problem

SMTT TEACHER_ID CLASS_ID X

1 T1 C1 4

1 T1 C1 3

1 T1 C2 3

2 T1 C2 3

2 T1 C2 4

2 T1 C3 2

2 T1 C3 3

1 T2 C4 4

1 T2 C4 4

1 T2 C4 3

1 T2 C4 2

2 T2 C5 2

2 T2 C5 2

SMTR is semester 1 and 2.

There are two teachers: T1, T2.

In semester 1, T1 teaches 2 classes (C1, C2). In C1, there are 2 students, in C2, 1 students.

In semester 2, T1 teaches 2 classes (C2, C3). In C2, there are 2 students, in C3, 2 student

lass C2 should be counted as two distinct classes since they are offered by two different semesters.

X is the score.

Now, how do I write a program to calculate distinct number of classes and number of students taught by each teacher?

and create a new data set as follows:

TEACHER_ID Nclass Nstudents

T1 4 7

T2 2 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

SQL can count distinct combinations by concatenating the combination parts with a delimiter. The delimiter ensures distinct combinations.

Example:

proc sql; create table want as select teacher_id , count (distinct catx('~',smtt,teacher_id,class_id)) as Nclass , count (x) as Nstudents from have group by teacher_id ;

You can also do serial processing in DATA Step using *nested *DOW loops

proc sort data=have; by teacher_id smtt class_id; run; data want; length teacher_id $8; do Nclass = sum(Nclass,1) by 1 until (last.teacher_id); do Nstudents = sum(Nstudents,1) by 1 until (last.class_id); set have; by teacher_id smtt class_id; end; end; keep teacher_id Nclass Nstudents; run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data have;
input (TEACHER_ID CLASS_ID) ($) X;
cards;
T1 C1 4
T1 C1 3
T1 C2 3
T1 C2 3
T1 C2 4
T1 C3 2
T1 C3 3
T2 C4 4
T2 C4 4
T2 C4 3
T2 C4 2
T2 C5 2
T2 C5 2
;
proc sql;
create table want as
select TEACHER_ID,count(distinct class_id) as Nclass, count(class_id) as Nstudents
from have
group by TEACHER_ID;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The statistical procedures such as FREQ MEANS SUMMARY UNIVARIATE do not count distinct values (as you want for CLASS_ID).

You can use SQL to directly compute distinct counts

data have; input teacher_id $ class_id $ x; datalines; T1 C1 4 T1 C1 3 T1 C2 3 T1 C2 3 T1 C2 4 T1 C3 2 T1 C3 3 T2 C4 4 T2 C4 4 T2 C4 3 T2 C4 2 T2 C5 2 T2 C5 2 ; proc sql; create table want as select teacher_id, count(distinct class_id) as Nclass, count(X) as Nstudents from have group by teacher_id ;

Are you ready for the spotlight? We're accepting content ideas for **SAS Innovate 2025** to be held May 6-9 in Orlando, FL. The call is **open **until September 25. Read more here about **why** you should contribute and **what is in it** for you!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.