Solved
Contributor
Posts: 67

# Domain function mean(var3) over (patrition by var1,var2)

Hi,

Could anyone please tell me if it is possible to use such domain functions in SAS as in sql

for instance:

'mean(var3) over (patrition by var1,var2)'

Thank you.

Accepted Solutions
Solution
‎08-07-2017 02:07 PM
Posts: 5,519

## Re: Domain function mean(var3) over (patrition by var1,var2)

Looks like you want something like:

proc sql;
select
Cst_name,
coalesce(mean(case when data_1 = "NA" then . else skrz end), 0) as avg_data_1_cst,
coalesce(mean(case when data_2 = "NA" then . else skrz end), 0) as avg_data_2_cst,
coalesce(mean(case when data_3 = "NA" then . else skrz end), 0) as avg_data_3_cst,
coalesce(max(case when data_1 = "NA" then . else skrz end), 0) as max_data_1_cst
from test
group by Cst_name;
quit;

PG

All Replies
Super User
Posts: 10,209

## Re: Domain function mean(var3) over (patrition by var1,var2)

Do you mean something like

proc sql;
create table want as
select
var1,
var2,
avg(var3)
from have
group by
var1,
var2
;
quit;

?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 67

## Re: Domain function mean(var3) over (patrition by var1,var2)

Posted in reply to KurtBremser

Not really, rather then grouping them I have to get statistics over multiple variables for eg.

case when data_2 <> 'non' then round(avg(num) over (partition by wyn.id_cst, wyn.data_2),2)
else 0 end as sred_skzr_6m,
case when data_3 <> 'non' then round(avg(num) over (partition by wyn.id_cst, wyn.data_3),2)
else 0 end as sred_skzr_9m,
case when data_4 = 'sth' then round(avg(num) over (partition by wyn.id_cst, wyn.data_4),2)
else 0 end as sred_skzr_12m,

Super User
Posts: 5,876

## Re: Domain function mean(var3) over (patrition by var1,var2)

Since most of us are SAS specialists, example syntax from other SW is of little help.
Please exemplify with sample data and desired output, paired with a more verbal description of what you want to accomplish.
Data never sleeps
Contributor
Posts: 67

## Re: Domain function mean(var3) over (patrition by var1,var2)

ok, here you can see the example of the data:

 Cst_Name data_1 data_2 data_3 skrz John 3m 6m 9m 100 John NA 6m 9m 87 Alfred NA 6m NA 95 Alfred NA 6m NA 57 Alfred NA 6m NA 95 Alice 3m 6m NA 46 Barbara 3m NA 9m 78 Barbara 3m NA 9m 45 William NA NA 9m 58 Frnak 3m NA 9m 88 Frank NA NA 9m 58 Rose NA 6m 9m 79 Rose NA 6m NA 128 Sarah NA 6m 9m 78 Sarah 3m NA 9m 89

and expected results:

 Cst_Name avg_data_1_cst avg_data_2_cst avg_data_3_cst max_data_1_cst std_data_2_cst John 100 93,5 93,5 100 6,5 Alfred 0 82,33333333 0 0 17,91337179 Alice 46 46 0 46 0 Barbara 61,5 0 61,5 78 0 William 0 0 58 0 0 Frnak 88 0 73 88 0 Rose 0 103,5 79 0 24,5 Sarah 89 78 83,5 89 0

in Oracle SQL you can easily do this with over (partition by ) command.

What I would like to do is compute all statistics like (std, max, mean) over customer id and any other variable.

Super User
Posts: 13,498

## Re: Domain function mean(var3) over (patrition by var1,var2)

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

I include those instructions as the data you show only has one variable that could be considered numeric by SAS unless your "3m" is supposed to be 3 million or such and is appearing as a FORMATTED value and the NA is appearing because you have a custom format to show missing as NA. Otherwise 3m would be character and mean or other statistical function would yeild a missing value.

Or possibly you want Proc summary (or means) as it will create statistics for groups of class variables and autoname them if that option is used. Here's a small example using a SAS supplied dataset you should have access to:

proc summary data=sashelp.class;
class age sex;
var height weight;
output out=classsummary  max= min= mean= std= /autoname;
run;

The output has summaries at all of the combinations of the grouping variables (the ones you want to partition over maybe?) The Var statement lists the numeric variables to summarize, output says to create an output data set and provide the name of the set (out=), the requested statistics (max= min= etc) and the /autoname tells SAS to create names by appending the statistic abbreviation to the variable name.

You would have to go into a lot detail to explain wher any of the statistics for your data_1, data_2 and data_3 come from given that example data and "results".

Contributor
Posts: 67

## Re: Domain function mean(var3) over (patrition by var1,var2)

Hi,

data_1, data_2 etc. are not num variables but characters 3m means 3 months, the thing is when data_1 is NA like in example:

 Frnak 3m NA 9m 88 Frank NA NA 9m 58

avg_data_1 is 88,

when both data_2 are NA the avg_data_2=0

when data_3 avg_data_3 = 73.

Solution
‎08-07-2017 02:07 PM
Posts: 5,519

## Re: Domain function mean(var3) over (patrition by var1,var2)

Looks like you want something like:

proc sql;
select
Cst_name,
coalesce(mean(case when data_1 = "NA" then . else skrz end), 0) as avg_data_1_cst,
coalesce(mean(case when data_2 = "NA" then . else skrz end), 0) as avg_data_2_cst,
coalesce(mean(case when data_3 = "NA" then . else skrz end), 0) as avg_data_3_cst,
coalesce(max(case when data_1 = "NA" then . else skrz end), 0) as max_data_1_cst
from test
group by Cst_name;
quit;

PG
Contributor
Posts: 67

## Re: Domain function mean(var3) over (patrition by var1,var2)

Many thanks, that is exactly what I wanted to get.
Contributor
Posts: 67

## Re: Domain function mean(var3) over (patrition by var1,var2)

I post it here because I  am struggling with the similar problem. Could you please tell me an efficient  way to count stats like std, mean, max by a group and add it to the original dataset. I found this http://support.sas.com/resources/papers/proceedings11/019-2011.pdf but their only show how to count or sum, the other problem is efficiency, I have to create over 500 variables like mentioned one, working with more than 20 large tables(over 10 mln obs in each) so sorting and merging is time-consuming(SQL package process everything in about 15h, but there is need to rewrite everything into 4GL).

In the table below, you will find desirable results.

 Name Sex Age Weight Height mean_age_by_sex std_weight_by_age Philip M 16 72 150 13,4 0 Janet F 15 62,5 112,5 13,22 1,81573 Mary F 15 66,5 112 13,22 1,81573 Ronald M 15 67 133 13,4 1,81573 William M 15 66,5 112 13,4 1,81573 Carol F 14 62,8 102,5 13,22 2,425902 Judy F 14 64,3 90 13,22 2,425902 Alfred M 14 69 112,5 13,4 2,425902 Henry M 14 63,5 102,5 13,4 2,425902 Alice F 13 56,5 84 13,22 3,670907 Barbara F 13 65,3 98 13,22 3,670907 Jeffrey M 13 62,5 84 13,4 3,670907 Jane F 12 59,8 84,5 13,22 2,949305 Louise F 12 56,3 77 13,22 2,949305 James M 12 57,3 83 13,4 2,949305 John M 12 59 99,5 13,4 2,949305 Robert M 12 64,8 128 13,4 2,949305 Joyce F 11 51,3 50,5 13,22 3,1 Thomas M 11 57,5 85 13,4 3,1
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
• 9 replies
• 194 views
• 1 like
• 5 in conversation