How to Count a variable

Solved
Frequent Contributor
Posts: 79

So I have a question I don’t really know how to phrase.

I have a variable called course_id that identifies the training each participant attended. There are multiple training sites, and therefore multiple course_ids, one for each site. I would like to run a chi-square analysis on the number of training sites, but when I use the course_id variable it does not collapse the rows for the same site. Instead it gives me the number of participants at each site.

I was wondering if there is a way to use the course_id variable to count the number of trainings instead of the total number of participants.

For example, if I have the following data set:

course_id    location_of_site
00211        rural
00211        rural
00211        rural
33455        urban
33455        urban
33455        urban
66778        rural
66778        rural

I would like to end up with:

number_trainings_rural    number_trainings_urban
2                         1

I hope I gave enough information and made this clear enough to understand my goal.

Donald S.

Accepted Solutions
Solution
‎09-16-2014 11:56 AM
Super User
Posts: 6,756

Re: How to Count a variable

If you reduce your data down to a single dimension, I don't think there is any way to compute a chi-square from that.  Perhaps this alternative is what you are looking for.  Reduce the data down to two dimensions, eliminating the counting of participants.  Then try for a chi-square on the remaining data.  For example:

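/* Step 1: one output row per course_id / location_of_site combination */
proc freq data=have;
   tables course_id * location_of_site / noprint out=counts;
run;

/* Step 2: chi-square on the reduced data (no WEIGHT, so each row counts once) */
proc freq data=counts;
   tables course_id * location_of_site / chisq;
run;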

The first PROC FREQ generates COUNTS, holding a single record for each combination of COURSE_ID / LOCATION_OF_SITE.  The second PROC FREQ uses that as input to compute chi square statistics.
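To make the two steps concrete, here is a minimal sketch built from the eight rows in the question. The DATA step, and reading course_id as character to keep its leading zeros, are assumptions for illustration; `have` and `counts` are the names used above. The final step is a variation rather than the code posted above: it tabulates location_of_site on its own, on the assumption that the goal is to count each training once per location.

```sas
/* Toy data copied from the question; course_id is read as character
   (an assumption) so that leading zeros are preserved. */
data have;
   input course_id $ location_of_site $;
   datalines;
00211 rural
00211 rural
00211 rural
33455 urban
33455 urban
33455 urban
66778 rural
66778 rural
;
run;

/* Collapse to one row per course_id / location_of_site combination.
   COUNTS also carries COUNT and PERCENT columns holding the
   participant totals; they are simply not used below. */
proc freq data=have;
   tables course_id * location_of_site / noprint out=counts;
run;

proc print data=counts;
run;

/* With no WEIGHT statement, every row of COUNTS contributes 1,
   so this table counts trainings rather than participants. */
proc freq data=counts;
   tables location_of_site / chisq;
run;
```

With only three courses the chi-square test itself is not meaningful, and SAS is likely to warn about small expected counts. The point of the sketch is only to show what is being counted at each step.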

All Replies
Super User
Posts: 5,876

Re: How to Count a variable

Something like this, perhaps?

proc sql;
   select location_of_site, count(distinct course_id) as no
   from have
   group by location_of_site;
quit;

Data never sleeps
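If the one-row layout from the question is wanted, the same distinct count can be written with one column per location. This is a sketch under assumptions: the output table name `trainings` is hypothetical, the values 'rural' and 'urban' are taken from the example data, and the DATA step only recreates that example so the code runs on its own.

```sas
/* Toy data copied from the question (course_id read as character). */
data have;
   input course_id $ location_of_site $;
   datalines;
00211 rural
00211 rural
00211 rural
33455 urban
33455 urban
33455 urban
66778 rural
66778 rural
;
run;

proc sql;
   /* "trainings" is a hypothetical table name.
      Each CASE returns course_id for the matching location and a
      missing value otherwise; COUNT(DISTINCT ...) ignores missing
      values, so each course is counted once per location. */
   create table trainings as
   select count(distinct case when location_of_site = 'rural'
                              then course_id end) as number_trainings_rural,
          count(distinct case when location_of_site = 'urban'
                              then course_id end) as number_trainings_urban
   from have;
quit;

proc print data=trainings;
run;
```

For the example data this gives number_trainings_rural = 2 and number_trainings_urban = 1. SAS may write a note to the log about the CASE expressions having no ELSE clause; that is expected here.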