turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Frequencies not adding up

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 12:10 PM

I'm trying to find the frequencies for only unique ID numbers. I tried PROC FREQ, but couldn't figure out how to do whatever the SAS equivilant of SELECT DISTINCT is. I ran the following code and got numebrs that don't add up.

Code:

PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n;

Result:

20599 |

Code:

PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '1a (obs): Demonstrating knowledge of content and pedagogy'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '1a (p&p): Demonstrating knowledge of content and pedagogy'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '1e (obs): Designing coherent instruction'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '1e (p&p): Designing coherent instruction'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '2a: Creating an environment of respect and rapport'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '2d: Managing student behavior'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '3b: Using questioning and discussion techniques'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '3c: Engaging students in learning'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '3d: Using assessment in instruction'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '4e (obs): Growing and developing professionally'; PROC SQL; SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n WHERE MOTPComponentDescription = '4e (p&p): Growing and developing professionally';

Result:

1a (obs): Demonstrating knowledge of content and pedagogy: 700

1a (p&p): Demonstrating knowledge of content and pedagogy: 606

1e (obs): Designing coherent instruction: 15622

1e (p&p): Designing coherent instruction: 1135

2a: Creating an environment of respect and rapport: 2466

2d: Managing student behavior: 1005

3b: Using questioning and discussion techniques: 808

3c: Engaging students in learning: 2516

3d: Using assessment in instruction: 3058

4e (obs): Growing and developing professionally: 5245

4e (p&p): Growing and developing professionally: 588

SUM = 33746

33746 != 20599

My data set looks like this (had to x out PII):

MOTPID | MOTPComponentDescription | SchoolDBN | EmployeeID | FirstName | LastName | EvaluatorUserID | EvaluatorName | Comments | complev_flag_ct | sentence_count | percent_similar_sentences | Rating | MOTPDate | Y16_Overall_Rating |

1051875 | 1e (obs): Designing coherent instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 3 | 0.00% | N/A | 9/13/2016 0:00 | Ineffective | |

1051875 | 1e (obs): Designing coherent instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 3 | 0.00% | N/A | 9/13/2016 0:00 | Ineffective | |

1051875 | 1e (obs): Designing coherent instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 3 | 0.00% | N/A | 9/13/2016 0:00 | Ineffective | |

1051876 | 3d: Using assessment in instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 4 | 0.00% | N/A | 9/13/2016 0:00 | Developing | |

1051876 | 3d: Using assessment in instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 4 | 0.00% | N/A | 9/13/2016 0:00 | Developing | |

1051876 | 3d: Using assessment in instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 4 | 0.00% | N/A | 9/13/2016 0:00 | Developing | |

1051876 | 3d: Using assessment in instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 4 | 0.00% | N/A | 9/13/2016 0:00 | Developing | |

1051876 | 2a: Creating an environment of respect and rapport | 13K527 | xxx | xxx | xxx | xxx | xxx | 1 | 5 | 20.00% | N/A | 9/13/2016 0:00 | Developing | |

1051878 | 1e (obs): Designing coherent instruction | 27Q400 | xxx | xxx | xxx | xxx | xxx | 0 | 6 | 0.00% | N/A | 9/13/2016 0:00 | Effective | |

1051878 | 1e (obs): Designing coherent instruction | 27Q400 | xxx | xxx | xxx | xxx | xxx | 0 | 6 | 0.00% | N/A | 9/13/2016 0:00 | Effective | |

1051878 | 1e (obs): Designing coherent instruction | 27Q400 | xxx | xxx | xxx | xxx | xxx | 0 | 6 | 0.00% | N/A | 9/13/2016 0:00 | Effective | |

1051886 | 1e (obs): Designing coherent instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 6 | 0.00% | N/A | 9/14/2016 0:00 | Developing | |

1051886 | 1e (obs): Designing coherent instruction | 13K527 | xxx | xxx | xxx | xxx | xxx | 0 | 6 | 0.00% | N/A | 9/14/2016 0:00 | Developing | |

**Looking for any ideas on what went wrong or if there's a better way to get my desired result (the count of unique MOTPID's by MOTPCopmponentDescription. Thanks so much in advance!**

Accepted Solutions

Solution

01-13-2017
02:10 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 02:08 PM

Thanks so much. Maybe my post was unclear. I'm not interested in how many MOTPID's had 1, 2, or 3 observations, but rather for a given MOTPComponentDescription how many unique MOTPIDs are there.

Either way, I found a solution:

```
data comment_analysis;
set WORK.'0__1_MOTP_COMMENTS_0001'n;
run;
proc sort data=comment_analysis nodupkey out=comment_analysis_clean dupout=dups;
by motpid;
run;
proc freq data=comment_analysis_clean;
table MOTPComponentDescription;
run;
```

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 12:45 PM

what does

proc freq data= WORK.'0__1_MOTP_COMMENTS_0000'n;

tables MOTPID*MOTPComponentDescription /list missing;

run;

give you?

I suspect the issue you see comes from the way you are using distinct.

What about

PROC SQL;

SELECT MOTPID, MOTPComponentDescription , count(*) as count

FROM WORK.'0__1_MOTP_COMMENTS_0000'n

group byMOTPID, MOTPComponentDescription;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 01:09 PM

Here's a set of two PROC FREQs that (together) give what you asked plus more. I'm not sure if you're interested in the "plus more" part but you might be.

proc freq data=have noprint;

tables MOTPComponentDescription * MOTPID / out=summarized_counts;

run;

proc freq data=summarized_counts;

tables MOTPComponentDescription * Count / missing list;

run;

This describes your original data set as follows. For each MOTPComponentDescription, how many MOTPIDs had a single observation? (That's the part you asked for.) How many have 2 observations? How many have 3 observations? It gives you that summary for whatever counts actually appear in the data set.

Solution

01-13-2017
02:10 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 02:08 PM

Thanks so much. Maybe my post was unclear. I'm not interested in how many MOTPID's had 1, 2, or 3 observations, but rather for a given MOTPComponentDescription how many unique MOTPIDs are there.

Either way, I found a solution:

```
data comment_analysis;
set WORK.'0__1_MOTP_COMMENTS_0001'n;
run;
proc sort data=comment_analysis nodupkey out=comment_analysis_clean dupout=dups;
by motpid;
run;
proc freq data=comment_analysis_clean;
table MOTPComponentDescription;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 01:59 PM - edited 01-13-2017 02:01 PM

You can also simplify your SQL query.

```
PROC SQL;
SELECT MOTPComponentDescription, COUNT (DISTINCT MOTPID) as count
FROM WORK.'0__1_MOTP_COMMENTS_0000'n
group by MOTPComponentDescription;
quit;
```

EDIT: unless you have an explicit exclusion criteria, ie a MOTPID can only be per one MOTPComponentDescription then the total doesn't have to add up to the distinct count you found earlier. You can have a single MOTPID across different MOTPComponentDescription, which means the sum can be greater than the overall unique count.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-13-2017 02:09 PM

Thanks so much for all of your responses. I managed to figure out the following solution, which got the exact result I was looking for:

```
data comment_analysis;
set WORK.'0__1_MOTP_COMMENTS_0001'n;
run;
proc sort data=comment_analysis nodupkey out=comment_analysis_clean dupout=dups;
by motpid;
run;
proc freq data=comment_analysis_clean;
table MOTPComponentDescription;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-14-2017 02:15 AM

@LizGagne Do note that the SQL solution is the same answer, without multiple steps.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-14-2017 06:42 AM

Many ways to skin the cat here. My two-step PROC FREQ also gets those numbers if you simplify the second PROC FREQ:

proc freq data=summarized_counts;

tables MOTPComponentDescription;

run;