Proc SQL: sum of distinct values

02-08-2011 05:59 AM

Hello,

With PROC SQL it is possible to get the SUM of the distinct values of a specific variable. I need an option in this procedure that sums the values of one variable across the distinct values of another variable. Another way could be PROC SORT with NODUPKEY followed by PROC FREQ.

Thank you

Simone


Posted in reply to HDSimo

02-08-2011 06:19 AM

When selecting the set of distinct values, or when using the nodupkeys option of proc sort, how do you decide which row to keep and which rows to drop?

I would be concerned that neither method defines which of "the top equal salesmen" (for example) would be chosen to be included on the input to your final summary.


Posted in reply to Peter_C

02-08-2011 06:34 AM

Hi Peter,

Each row that I drop has the same values as the row that I keep. Let me show you a situation similar to my database:

variable ID identifies a person

variable X specifies a weight

The database contains a lot of duplicated records, and it is not important which of them I delete. I need to calculate the sum of X, but I can't use DISTINCT on X because identical values of X are linked to different IDs.
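A small made-up example of the problem (the dataset name myData and all values are hypothetical, chosen only to illustrate): person 1 appears twice, and persons 1 and 2 happen to share the same weight.

```sas
/* Hypothetical sample data: ID 1 is duplicated; IDs 1 and 2 share X = 10 */
data myData;
    input id x;
    datalines;
1 10
1 10
2 10
3 5
;
run;

proc sql;
    /* Plain SUM counts the duplicated record for ID 1 twice */
    select sum(x) as sum_all from myData;

    /* SUM(DISTINCT x) collapses the equal weights of IDs 1 and 2 into one */
    select sum(distinct x) as sum_distinct_x from myData;
quit;
```

With these values the plain sum is 35 and the sum of distinct X is 15, while the total I am after (one weight per person) would be 25.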

Thank you

Simone


Posted in reply to HDSimo

02-08-2011 06:43 AM

Hi Simone.

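Maybe a subquery would help:

[pre]PROC SQL ;
  /* Outer query: sum x over the de-duplicated rows */
  SELECT SUM(x) FROM (
    /* Inner query: keep one row per id/x combination */
    SELECT DISTINCT id, x FROM myData
  );
QUIT ;[/pre]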

This is very similar to executing a PROC SORT with the NODUPRECS option prior to summing.
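For comparison, the sort-based route could be sketched as follows. This is only an illustration under some assumptions: it uses NODUPKEY with BY id x instead of NODUPRECS, because NODUPRECS compares all variables in the record and only adjacent observations, so it matches the DISTINCT id, x query exactly only when the dataset holds no other variables. The output dataset name myData_nodup is hypothetical, and PROC MEANS is just one of several ways to get the sum.

```sas
/* Keep one record per id/x combination; the input dataset is left untouched */
proc sort data=myData out=myData_nodup nodupkey;
    by id x;
run;

/* Sum the weights of the de-duplicated records */
proc means data=myData_nodup sum;
    var x;
run;
```

The subquery does the same thing in one step and does not create an intermediate dataset.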

Hope this helps.

Olivier


Posted in reply to Olivier

02-08-2011 06:47 AM

mmm... simple!

Thank you very much Olivier!

Simone
