turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- PROC MEANS DISTINCT VALUES

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-18-2017 03:36 PM

Hello,

I have the following data set and need to summarize in the manner of the following code. However, instead of outputting to total for the variable PS, I need the distinct total only.

**proc** **means** data=WORK.TEST noprint;

by R TC RN SS SH;

var PS;

output out= TEST1 sum (E)=E_F n(PS)=NB_PS;

**run**;

RN | E | TC | R | SS | SH | PS |

1 | 707 | 5003 | A02 | SS | 2028106601 | |

1 | 698 | 5003 | A02 | SS | 2028106601 | |

1 | 1119 | 5003 | A02 | SS | 2028106601 | |

1 | 2099 | 5003 | A02 | 2028612941 | ||

1 | 1370 | 5003 | A02 | 2024667991 | ||

1 | 865 | 5003 | A02 | 2028221751 | ||

1 | 3026 | 5003 | A02 | 2028221751 | ||

1 | 1008 | 5003 | A02 | 2025744721 | ||

1 | 604 | 6003 | A09 | SH | 2027282411 | |

1 | 1285 | 6003 | A09 | SH | 2027282411 | |

1 | 2107 | 6003 | A09 | 2025877621 | ||

1 | 854 | 6003 | A09 | 2025250411 | ||

1 | 955 | 6003 | A09 | 2024224891 | ||

1 | 502 | 6003 | A09 | 2022288221 |

So instead of getting the output:

R | TC | RN | SS | SH | E_F | NB_PS |

A02 | 5003 | 1 | SS | 8368 | 3 | |

A02 | 5003 | 1 | 2524 | 5 | ||

A09 | 6003 | 1 | SH | 1889 | 2 | |

A09 | 6003 | 1 | 4418 | 4 |

I need the output:

R | TC | RN | SS | SH | E_F | NB_PS |

A02 | 5003 | 1 | SS | 8368 | 1 | |

A02 | 5003 | 1 | 2524 | 4 | ||

A09 | 6003 | 1 | SH | 1889 | 1 | |

A09 | 6003 | 1 | 4418 | 4 |

Thank you for any help!

Accepted Solutions

Solution

04-18-2017
04:50 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jacobfitz

04-18-2017 04:20 PM

Here's the structure for a PROC SQL solution.

Proc sql;

create table want as

select R, TC, RN, SS, SH, /*grouping variables*/

/*aggregate functions*/

sum(e) as e_f,

count(distinct PS) as N_PS

from test

group by R, TC, RN, SS, SH

quit;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jacobfitz

04-18-2017 03:41 PM

Explain your logic please?

Where does RN come from?

Do you need the SUM for all records or only the distinct records and how do you identify the distinct records if this is the case?

Proc means cannot do a distinct count, you can try SQL or a double PROC FREQ instead.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

04-18-2017 03:46 PM - edited 04-18-2017 03:49 PM

RN is a number assigned to each PS. It can be 1,51,55 etc. In this specific example they are all 1.

I need E_F to be the sum of all E but NB_PS needs to be the count of distinct PS. I am not sure how to get the output I need using SQL or PROC FREQ.

Solution

04-18-2017
04:50 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jacobfitz

04-18-2017 04:20 PM

Here's the structure for a PROC SQL solution.

Proc sql;

create table want as

select R, TC, RN, SS, SH, /*grouping variables*/

/*aggregate functions*/

sum(e) as e_f,

count(distinct PS) as N_PS

from test

group by R, TC, RN, SS, SH

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

04-18-2017 04:51 PM

Thank you! This is exactly what I needed.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jacobfitz

04-18-2017 03:43 PM

This is a job for PROC FREQ, with a BY statement.

The first run of PROC FREQ will tell you, for example, that 2028106601 appears 3 times in it's BY group. The second run of PROC FREQ will tell you that there is only one value of NB_PS in that BY group.