The means you could get using PROC MEANS or SQL. But when you also want last minus first for another variable, you may as well do the whole thing in a datastep, e.g.: Data want;
Sum_var1=0; Count_var1=0;
do until(last.customer);
set have;
by customer; if not missing(Variable1) then do;
Sum_var1=Sum_var1+Variable1; Count_var1=Count_var1+1; end; if first.customer then First_var2=Variable2;
end;
Diff_var2=Variable2-First_var2; /* last value minus first value */ if Count_var1 then Mean_var1=Sum_var1/Count_var1; keep customer Mean_var1 Diff_var2;
run; The data should be sorted by customer and month, of course. It can also be done in SQL, though: proc sql;
create table want as select
a.*,
mean(a.Variable1) as mean_var1,
Last.Variable2-First.variable2 as Diff_var2
from
have a,
have first,
have last
where first.customer=a.customer
and last.customer=a.customer
group by a.customer
having first.month=min(a.month)
and last.month=max(a.month) order by a.customer, a.month
;
quit;
Here, you can easily get all the original rows without remerging with the original data afterwards. If you only want the two summary rows, you should add a DISTINCT after SELECT, replace a.* with a.customer, and drop a.month from the ORDER BY clause.
... View more