Hi,
I will like to create the averages of all possible combinations using the dataset which can be created from the SAS code below please.
data input_data;
subject = 1; diff = 4.0; output;
subject = 2; diff = 10.4; output;
subject = 3; diff = 11.8; output;
subject = 4; diff = 11.9; output;
subject = 5; diff = 17.4; output;
subject = 6; diff = 20.4; output;
subject = 7; diff = 21.1; output;
subject = 8; diff = 25.0; output;
subject = 9; diff = 25.6; output;
subject = 10; diff = 28.8; output;
subject = 11; diff = 29.8; output;
run;
I am trying to get my output to be something similar to the image below please. I only want to calculate the averages between every combination once, like in the image below. I have tried doing this by using the cartesian join, however this creates every single average, and I only need 66 of them. I think I may have to use the allcombi function or Proc IML?
Many thanks in advanced for your help.
Kriss
Here is one way:
data input_data; subject = 1; diff = 4.0; output; subject = 2; diff = 10.4; output; subject = 3; diff = 11.8; output; subject = 4; diff = 11.9; output; subject = 5; diff = 17.4; output; subject = 6; diff = 20.4; output; subject = 7; diff = 21.1; output; subject = 8; diff = 25.0; output; subject = 9; diff = 25.6; output; subject = 10; diff = 28.8; output; subject = 11; diff = 29.8; output; run; proc sql; create table want as select a.subject,b.subject as subject2, mean(a.diff,b.diff) as average from input_data a, input_data b where a.subject le b.subject order by subject,subject2 ; quit; proc transpose data=want out=want (drop=_:) prefix=Subject; var average; by subject; id subject2; run; proc print data=want; run;
Art, CEO, AnalystFinder.com
Here is one way:
data input_data; subject = 1; diff = 4.0; output; subject = 2; diff = 10.4; output; subject = 3; diff = 11.8; output; subject = 4; diff = 11.9; output; subject = 5; diff = 17.4; output; subject = 6; diff = 20.4; output; subject = 7; diff = 21.1; output; subject = 8; diff = 25.0; output; subject = 9; diff = 25.6; output; subject = 10; diff = 28.8; output; subject = 11; diff = 29.8; output; run; proc sql; create table want as select a.subject,b.subject as subject2, mean(a.diff,b.diff) as average from input_data a, input_data b where a.subject le b.subject order by subject,subject2 ; quit; proc transpose data=want out=want (drop=_:) prefix=Subject; var average; by subject; id subject2; run; proc print data=want; run;
Art, CEO, AnalystFinder.com
Hi @art297 , Brilliant!, If you don't mind and have the time, Could you please offer a line or two explanation to help me understand the proc sql solution picks all combinations. I so want to understand Please!
@MarkWik: It's called a Cartesian product. The SAS note (http://support.sas.com/kb/25/270.html) describes the result as "
When joining multiple tables, the default behavior of PROC SQL is to build all possible combinations between the tables. This is also known as the Cartesian Product.
The following note will be written to the SAS log when a Cartesian Product is created:
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
"
As such, the code I suggested does such a join (including computing the averages) then, according to @djrisks's specs, eliminated the lower half of the diagonal, and ordered the rows and columns by subject number.
Art, CEO, AnalystFinder.com
Thank you @art297! This is so helpful! @Ksharp I like the IML solution too, and thank you for helping me with it! @Astounding thank you for the array method!
🙂
There are definitely ways to get ALLCOMBI to do this, but I have always programmed it by brute force. This will get you the data, but getting the report is up to you.
data want;
array diffs {11} diff01-diff11;
array inc {11} include01-include11;
do until (done);
set have end=done;
diffs{subject}=diff;
end;
length subjlist $ 11;
do include01=0, 1;
do include02=0, 1;
do include03=0, 1;
do include04=0, 1;
do include05=0, 1;
do include06=0, 1;
do include07=0, 1;
do include08=0, 1;
do include09=0, 1;
do include10=0, 1;
do include11=0, 1;
total=0;
subjlist=' ';
do i=1 to 11;
total + inc{i} * diffs{i};
substr(subjlist, i, 1) = put(inc{i}, 1.);
end;
mean = total / sum(of inc01-inc11);
output;
end; end; end; end; end; end; end; end; end; end; end;
keep diff01-diff11 subjlist mean;
run;
It's untested code, so might need some tweaking. But it should generate all the information you're seeking (plus an extra record ... the empty set with a mean of 0).
With a correction above.
Since @art297 has already given you sql solution.
I want present IML solution.
data input_data;
subject = 1; diff = 4.0; output;
subject = 2; diff = 10.4; output;
subject = 3; diff = 11.8; output;
subject = 4; diff = 11.9; output;
subject = 5; diff = 17.4; output;
subject = 6; diff = 20.4; output;
subject = 7; diff = 21.1; output;
subject = 8; diff = 25.0; output;
subject = 9; diff = 25.6; output;
subject = 10; diff = 28.8; output;
subject = 11; diff = 29.8; output;
run;
options missing=' ';
proc iml;
use input_data nobs nobs;
read all var {diff};
close;
t=t(1:nobs);
w=expandgrid(diff,diff)[,:]||expandgrid(t,t);
want=full(w);
idx=loc(row(want)>col(want));
want[idx]=.;
char_diff=char(diff);
mattrib want r=char_diff c=char_diff l='';
print want;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.