I can't find a way to summarize the same variable using different weights.
I try to explain it with an example (of 3 records):
data pippo;
a=10;
wgt1=0.5;
wgt2=1;
wgt3=0;
output;
a=3;
wgt1=0;
wgt2=0;
wgt3=1;
output;
a=8.9;
wgt1=1.2;
wgt2=0.3;
wgt3=0.1;
output;
run;
I tried the following:
proc summary data=pippo missing nway;
var a /weight=wgt1;
var a /weight=wgt2;
var a /weight=wgt3;
output out=pluto (drop=_freq_ _type_) sum()=;
run;
Obviously it gives me a warning because I used the same variable "a" (I can't rename it!).
I can do it but just for one variable:
proc summary data=pippo missing nway;
var wgt1-wgt3 /weight=a;
output out=pluto (drop=_freq_ _type_) sum(wgt1-wgt3)=a1-a3;
run;
But if I introduce a second variable I can't use that trick
proc summary data=pippo missing nway;
var wgt1-wgt3 /weight=a;
var wgt1-wgt3 /weight=b; <---- WRONG!!!
output out=pluto (drop=_freq_ _type_)
sum(wgt1-wgt3)=a1-a3
sum(wgt1-wgt3)=b1-b3;<---SAME NAME, WRONG!!!
run;
I've to save a huge amount of data, I've not so much physical space and I should construct like 120 field (a0-a6,b0-b6 etc) that are the same variables just with fixed weight (wgt0-wgt5).
I want to store a dataset with 20 columns (a,b,c..) and 6 weight (wgt0-wgt5) and, on demand, processing a "summary" without an intermediate datastep that oblige me to create 120 fields.
Due to the huge amount of data I'd like also not to use proc sql statement (even because I've to write it manually for each variable):
proc sql;
create table pluto
as select sum(db.a * wgt1) as a1, sum(db.a * wgt2) as a2 , etc.
quit;
Is There a "Super proc summary" that can summarize the same field with different weights?
Thanks in advance,
Paolo
One suspects that you have a poorly designed process to start with. So go with the fact that you will not get a nice short bit of code to deal with such.
There might be a way with Proc IML but out of my experience and would require a much more complete description of your data. Since your provide an example data set then start adding other requirements (the variable B) as a counter example then a really representative data set is needed. Likely reshape (transpose) and summarize but you haven't really shown the expected results.
As output I need a dataset summarized by one ore more variable.
Writing another example
Input dataset:
options nolabel;
data pippo(drop=r);
set sashelp.cars(keep=make model MSRP invoice enginesize cylinders horsepower length wheelbase
rename=(make=class1 model=class2 MSRP=a invoice=b enginesize=c
cylinders=wgt1 horsepower=wgt2 length=wgt3 wheelbase=wgt4)
);
array wgt_(4) wgt1-wgt4;
do r=1 to 4;
wgt_(r)=sum(wgt_(r),0)/1000;*avoid missing with sum;
end;
run;
So you have a starting dataset with 2 class variables (class1 and class2), 3 variables (a b c) that need to be weighted and summarized and then 4 weights.
Output dataset:
I can achieve the output result trought a datastep (the part that I would like to remove):
data pippo(drop=f a b c wgt1-wgt4);
set pippo;
array var(16) wgt1-wgt4 a1-a4 b1-b4 c1-c4;
do f=1 to 4;
var(f+4)=a*var(f);
var(f+8)=b*var(f);
var(f+12)=c*var(f);
end;
run;
proc summary data=pippo missing nway;
class class1;
var a1-a4 b1-b4 c1-c4;
output out=final (drop=_freq_ _type_) sum=;
run;
What I need is the dataset that is called "final".
Thanks,
Paolo
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.