BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
djrisks
Barite | Level 11

Hi,

 

I will like to create the averages of all possible combinations using the dataset which can be created from the SAS code below please.

 

data input_data;
  subject = 1; diff = 4.0; output;
  subject = 2; diff = 10.4; output;
  subject = 3; diff = 11.8; output;
  subject = 4; diff = 11.9; output;
  subject = 5; diff = 17.4; output;
  subject = 6; diff = 20.4; output;
  subject = 7; diff = 21.1; output;
  subject = 8; diff = 25.0; output;
  subject = 9; diff = 25.6; output;
  subject = 10; diff = 28.8; output;
  subject = 11; diff = 29.8; output;
run;

I am trying to get my output to be something similar to the image below please. I only want to calculate the averages between every combination once, like in the image below. I have tried doing this by using the cartesian join, however this creates every single average, and I only need 66 of them. I think I may have to use the allcombi function or Proc IML?

 

picture2.PNG

 

Many thanks in advanced for your help.

 

Kriss

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Here is one way:

data input_data;
  subject = 1; diff = 4.0; output;
  subject = 2; diff = 10.4; output;
  subject = 3; diff = 11.8; output;
  subject = 4; diff = 11.9; output;
  subject = 5; diff = 17.4; output;
  subject = 6; diff = 20.4; output;
  subject = 7; diff = 21.1; output;
  subject = 8; diff = 25.0; output;
  subject = 9; diff = 25.6; output;
  subject = 10; diff = 28.8; output;
  subject = 11; diff = 29.8; output;
run;

proc sql;
  create table want as
   select a.subject,b.subject as subject2,
          mean(a.diff,b.diff) as average
     from input_data a, input_data b
       where a.subject le b.subject
         order by subject,subject2
  ;
quit;

proc transpose data=want 
               out=want (drop=_:) prefix=Subject;
  var average;
  by subject;
  id subject2;
run;

proc print data=want;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

6 REPLIES 6
art297
Opal | Level 21

Here is one way:

data input_data;
  subject = 1; diff = 4.0; output;
  subject = 2; diff = 10.4; output;
  subject = 3; diff = 11.8; output;
  subject = 4; diff = 11.9; output;
  subject = 5; diff = 17.4; output;
  subject = 6; diff = 20.4; output;
  subject = 7; diff = 21.1; output;
  subject = 8; diff = 25.0; output;
  subject = 9; diff = 25.6; output;
  subject = 10; diff = 28.8; output;
  subject = 11; diff = 29.8; output;
run;

proc sql;
  create table want as
   select a.subject,b.subject as subject2,
          mean(a.diff,b.diff) as average
     from input_data a, input_data b
       where a.subject le b.subject
         order by subject,subject2
  ;
quit;

proc transpose data=want 
               out=want (drop=_:) prefix=Subject;
  var average;
  by subject;
  id subject2;
run;

proc print data=want;
run;

Art, CEO, AnalystFinder.com

 

MarkWik
Quartz | Level 8

Hi @art297 , Brilliant!, If you don't mind and have the time, Could you please offer a line or two explanation to help me understand the proc sql solution picks all combinations. I so want to understand Please!

art297
Opal | Level 21

@MarkWik: It's called a Cartesian product. The SAS note (http://support.sas.com/kb/25/270.html) describes the result as "

When joining multiple tables, the default behavior of PROC SQL is to build all possible combinations between the tables. This is also known as the Cartesian Product.

The following note will be written to the SAS log when a Cartesian Product is created:

  
NOTE: The execution of this query involves performing one or
      more Cartesian product joins that can not be optimized. 

"

As such, the code I suggested does such a join (including computing the averages) then, according to @djrisks's specs, eliminated the lower half of the diagonal, and ordered the rows and columns by subject number.

 

Art, CEO, AnalystFinder.com

djrisks
Barite | Level 11

Thank you @art297! This is so helpful! @Ksharp I like the IML solution too, and thank you for helping me with it! @Astounding thank you for the array method!

 

🙂

Astounding
PROC Star

There are definitely ways to get ALLCOMBI to do this, but I have always programmed it by brute force.  This will get you the data, but getting the report is up to you.

 

data want;

array diffs {11} diff01-diff11;

array inc {11} include01-include11;

do until (done);

   set have end=done;

   diffs{subject}=diff;

end;

length subjlist $ 11;

do include01=0, 1;

do include02=0, 1;

do include03=0, 1;

do include04=0, 1;

do include05=0, 1;

do include06=0, 1;

do include07=0, 1;

do include08=0, 1;

do include09=0, 1;

do include10=0, 1;

do include11=0, 1;

   total=0;

   subjlist=' ';

   do i=1 to 11;

       total + inc{i} * diffs{i};

       substr(subjlist, i, 1) = put(inc{i}, 1.);

   end;

   mean = total / sum(of inc01-inc11);

   output;

end; end; end; end; end; end; end; end; end; end; end;

keep diff01-diff11 subjlist mean;

run;

 

It's untested code, so might need some tweaking.  But it should generate all the information you're seeking (plus an extra record ... the empty set with a mean of 0).

 

With a correction above.

Ksharp
Super User

Since @art297 has already given you sql solution. 

I want present IML solution.

 

data input_data;
  subject = 1; diff = 4.0; output;
  subject = 2; diff = 10.4; output;
  subject = 3; diff = 11.8; output;
  subject = 4; diff = 11.9; output;
  subject = 5; diff = 17.4; output;
  subject = 6; diff = 20.4; output;
  subject = 7; diff = 21.1; output;
  subject = 8; diff = 25.0; output;
  subject = 9; diff = 25.6; output;
  subject = 10; diff = 28.8; output;
  subject = 11; diff = 29.8; output;
run;
options missing=' ';
proc iml;
use input_data nobs nobs;
read all var {diff};
close;
t=t(1:nobs);
w=expandgrid(diff,diff)[,:]||expandgrid(t,t);
want=full(w);
idx=loc(row(want)>col(want));
want[idx]=.;

char_diff=char(diff);
mattrib want r=char_diff c=char_diff l='';
print want;
quit;

x.png 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1998 views
  • 8 likes
  • 5 in conversation