data compustat;
input firm sales year;
datalines;
A 2400 2001
A 2500 2002
A 2600 2003
A 2700 2004
A 2800 2005
B 1500 2001
B 1600 2002
B 1700 2003
C 3100 2001
C 3200 2002
C 3300 2003
C 3400 2004
;
I want a column with average for each group of firm for every sales. can someone please help with that ? here as an example of the output.
firm | sales | year | Average sales |
A | 2400 | 2001 | 2600 |
A | 2500 | 2002 | 2600 |
A | 2600 | 2003 | 2600 |
A | 2700 | 2004 | 2600 |
A | 2800 | 2005 | 2600 |
B | 1500 | 2001 | 1600 |
B | 1600 | 2002 | 1600 |
B | 1700 | 2003 | 1600 |
C | 3100 | 2001 | 3250 |
C | 3200 | 2002 | 3250 |
C | 3300 | 2003 | 3250 |
C | 3400 | 2004 | 3250 |
data compustat;
input firm $ sales year;
datalines;
A 2400 2001
A 2500 2002
A 2600 2003
A 2700 2004
A 2800 2005
B 1500 2001
B 1600 2002
B 1700 2003
C 3100 2001
C 3200 2002
C 3300 2003
C 3400 2004
;
proc sql;
create table want as
select *, avg(sales) as average_sales
from compustat
group by firm
order by firm,year;
quit;
data compustat;
input firm $ sales year;
datalines;
A 2400 2001
A 2500 2002
A 2600 2003
A 2700 2004
A 2800 2005
B 1500 2001
B 1600 2002
B 1700 2003
C 3100 2001
C 3200 2002
C 3300 2003
C 3400 2004
;
proc sql;
create table want as
select *, avg(sales) as average_sales
from compustat
group by firm
order by firm,year;
quit;
One follow-up question...how would the code change if you only wanted to group by Firm and only show the average for each firm grouping once? See the end result is below.
Firm average_sales
A 2600
B 1600
C 3250
@neveragain wrote:
One follow-up question...how would the code change if you only wanted to group by Firm and only show the average for each firm grouping once? See the end result is below.
Firm average_sales
A 2600
B 1600
C 3250
Then remove implicit or explicit references to year and/or sales in the result table by changing
select *, avg(sales) as average_sales
to
select firm, avg(sales) as average_sales
and drop the
order by firm,year
clause.
Excellent! Thank you for the help. I just thought of another question...what if I wanted to derive the standard deviation or another calculation? End result below:
Firm average_sales standard deviation
A 2600 158
B 1600 100
C 3250 129
"One followup question"
...
"I just thought of another question"
This is not a good habit for a person with the userid @neveragain .
I think it's time to start researching the documentation on functions, which you can apply in a similar way to @novinosrin 's use of the AVG function.
proc summary data=compustat nway;
class firm;
var sales;
output out=_stats_ mean=average stddev=std;
run;
data want;
merge compustat _stats_;
by firm;
run;
The data set _stats_ that is created is the one you asked for with one record per firm.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.