How to Count a variable

09-16-2014 11:35 AM

So I have a question I don’t really know how to phrase.

I have a variable called course_id that identifies the training each participant attended. There are multiple training sites and therefore multiple course_ids, one for each site. I would like to run a chi-square analysis on the number of training sites, but when I use the course_id variable it does not collapse the repeated rows for a site; instead it gives me the number of participants at each site.

I was wondering if there is a way to use the course_id variable to count the number of trainings instead of the total number of participants.

For example, if I have the following data set:

course_id   location_of_site
00211       rural
00211       rural
00211       rural
33455       urban
33455       urban
33455       urban
66778       rural
66778       rural

I would like to get:

number_trainings_rural   number_trainings_urban
2                        1

I hope I gave enough information and made this clear enough to understand my goal.

Thank you for your time,

Donald S.

Accepted Solutions

Solution

09-16-2014 11:56 AM

If you reduce your data down to a single dimension, I don't think there is any way to compute a chi-square from that. Perhaps this alternative is what you are looking for. Reduce the data down to two dimensions, eliminating the counting of participants. Then try for a chi-square on the remaining data. For example:

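/* Step 1: collapse to one record per COURSE_ID / LOCATION_OF_SITE combination */
proc freq data=have;
   tables course_id * location_of_site / noprint out=counts;
run;

/* Step 2: no WEIGHT statement, so each combination counts once */
proc freq data=counts;
   tables course_id * location_of_site / chisq;
run;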

The first PROC FREQ generates COUNTS, holding a single record for each combination of COURSE_ID / LOCATION_OF_SITE. The second PROC FREQ uses that as input to compute chi square statistics.
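As a minimal sketch of how this plays out on the sample data from the question, the steps below build HAVE from the eight rows posted above, collapse it to COUNTS, and then tabulate LOCATION_OF_SITE on its own. The DATA step and the one-way table are illustrative additions, not part of the original answer, and reading course_id as a character variable is an assumption made so that leading zeros are kept.

```sas
/* Sample data from the question; course_id is read as character
   (an assumption) so that leading zeros such as 00211 are kept */
data have;
   input course_id $ location_of_site $;
   datalines;
00211 rural
00211 rural
00211 rural
33455 urban
33455 urban
33455 urban
66778 rural
66778 rural
;
run;

/* One record per course_id / location_of_site combination */
proc freq data=have;
   tables course_id * location_of_site / noprint out=counts;
run;

/* Each record in COUNTS is one training, so this one-way table
   counts trainings rather than participants */
proc freq data=counts;
   tables location_of_site;
run;
```

With these eight rows, COUNTS should hold three records, and the one-way table should report two rural trainings and one urban training, which matches the result asked for in the question.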

All Replies

09-16-2014 11:47 AM

proc sql;
   select location_of_site, count(distinct course_id) as no
   from have
   group by location_of_site;
quit;

?
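If the counts are wanted side by side, in the layout shown in the question, one possible variation is to count distinct course_ids conditionally. This is only a sketch: it assumes course_id is a character variable and that location_of_site holds exactly the lowercase values 'rural' and 'urban', and the output column names are taken from the question's example.

```sas
proc sql;
   /* A blank (missing) value is returned for rows of the other
      location type; COUNT ignores missing values, so each column
      counts only the distinct course_ids of its own type */
   select count(distinct case when location_of_site = 'rural'
                              then course_id else ' ' end)
             as number_trainings_rural,
          count(distinct case when location_of_site = 'urban'
                              then course_id else ' ' end)
             as number_trainings_urban
   from have;
quit;
```

Because there is no GROUP BY, the query returns a single row with one column per location type.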

09-16-2014 12:08 PM

Yes, this is exactly what I am looking for! Wonderful, thank you Astounding.