<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Calculating Product Overlap in a Basket in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631542#M187135</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/209336"&gt;@Marc_y&lt;/a&gt;,&amp;nbsp;that is a great, precise answer by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&amp;nbsp;and a nicely detailed one by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/292097"&gt;@ed_sas_member&lt;/a&gt;. To my mind, one way to understand the problem, building on your initial thought of using arrays, is to break it into pieces and devise an algorithm in &lt;EM&gt;steps like&lt;/EM&gt;:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;1. Look at pairs&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;2. Check whether both items of the pair are in the basket (both flags equal 1)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;3. Roll up&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;4. Transpose to wide to meet the stated requirement&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In SAS syntax, one possibility is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;

data have;
  input basket g1 g2 g3 g4;
datalines;
1 1 0 1 0
2 0 0 1 1
3 1 1 1 0
4 1 1 0 1
5 0 1 1 0
6 0 0 0 1
;
run;
/*Get/compare pairs*/
data temp;
 set have;
 array t g1-g4;
 length item $5;
 do g=1 to dim(t);
  do j=1 to dim(t);
   item=cats('g',g);
   p1=t(g);
   p2=t(j);
   if sum(p1,p2)=2 then output;
  end;
 end;
 keep item j p:;
run;
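/*Roll up eligible pairs: every kept row has p1=1, so sum(p1) counts the baskets containing both items*/
proc sql;
create table temp1 as
select item,j,sum(p1) as s
from temp
group by item,j
order by item,j; /*PROC TRANSPOSE with BY item needs sorted input*/
quit;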
/*To get the wide structure*/
proc transpose data=temp1 out=want(drop=_:) prefix=g;
by item;
var s;
id j;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Of course, the above may require slight adjustments to get the &lt;STRONG&gt;appropriate variable names&lt;/STRONG&gt; in place, as in your original dataset. The idea is to validate the thought process/approach &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For someone who mentions "&lt;SPAN&gt;&amp;nbsp;&lt;EM&gt;but my knowledge in SAS is too limited to work out how exactly this is done&lt;/EM&gt;", I really do appreciate your thought process. It is very noteworthy.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 12 Mar 2020 13:06:29 GMT</pubDate>
    <dc:creator>novinosrin</dc:creator>
    <dc:date>2020-03-12T13:06:29Z</dc:date>
    <item>
      <title>Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631515#M187122</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to calculate what the overlap is between a set of products in various baskets of goods. In total, I have 110 products across 150 million baskets. I want to end up with a table that shows the count of baskets where there is an overlap between any combination of products (including with themselves, so a 110-column x 110-row table).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think I should be using arrays to achieve this, but my knowledge in SAS is too limited to work out how exactly this is done. I have tried several different approaches, but none seems to give me the desired output - I seem to get stuck on reducing the baskets down to one cell. I tried using a sumproduct, as the values that can exist in the basket data are 0 and 1 - thus the sumproduct would give me the number of baskets where there is overlap.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I have butchered together so far is this (which does not tie to the example data below, which only looks at 4 variables):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;DATA test;
  SET SAMPLE (DROP=BASKET_ID CUSTOMER_NO);

  ARRAY BASE BU54 -- SG1043;
  ARRAY COMPARE BU54 -- SG1043;

  ARRAY OUTPUTS BU54 -- SG1043;

  DO i=1 TO 110;
    DO p=1 TO 110;

      OUTPUTS{p} = SUM(BASE{i} + COMPARE{p});
	  IF p = 110 THEN OUTPUT;

	END;
  END;

RUN;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is some example data:&lt;/P&gt;&lt;PRE&gt;data have;
  input basket g1 g2 g3 g4;
datalines;
1 1 0 1 0
2 0 0 1 1
3 1 1 1 0
4 1 1 0 1
5 0 1 1 0
6 0 0 0 1
;
run;&lt;/PRE&gt;&lt;P&gt;Essentially, what I want to get to with the above example data would be something that looks like this:&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data want;
  input item $ g1 g2 g3 g4;
datalines;
g1 3 2 2 1
g2 2 3 2 1
g3 2 2 4 1
g4 1 1 1 3
;
run;&lt;/PRE&gt;&lt;P&gt;I am sorry if I have not included data in the right format - this is my first time using this community. Please let me know if there is anything I can add to make this more clear.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be greatly appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2020 11:19:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631515#M187122</guid>
      <dc:creator>Marc_y</dc:creator>
      <dc:date>2020-03-12T11:19:00Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631529#M187126</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/209336"&gt;@Marc_y&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an attempt to achieve this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input basket g1 g2 g3 g4;
datalines;
1 1 0 1 0
2 0 0 1 1
3 1 1 1 0
4 1 1 0 1
5 0 1 1 0
6 0 0 0 1
;
run;

/* Retrieve the list of variables (except basket): g1 g2 g3 g4 ... in macrovariable &amp;amp;list_letters */

proc contents data=have out=have_contents noprint;
run;

proc sql noprint;
	select distinct name
	into: list_letters separated by " "
	from have_contents
	where name ne "basket";
quit;

/* Identify the frequency of each couple (e.g. g1-g2, ...) */

data have_tr;
	set have;
	
	array _letter(*) &amp;amp;list_letters;

	total = sum(of _letter(*));

	do i=1 to dim(_letter);

		do j=i to dim(_letter);

			if _letter(i)&amp;gt;0 and _letter(j)&amp;gt;0 then
				do;
					couple_1=vname(_letter(i));
					couple_2=vname(_letter(j));
					if total &amp;gt; 0 then do;
						couple=compress(catx("_", couple_1, couple_2));
						output;
					end;
				end;
		end;
	end;
	keep basket couple;
run;

proc freq data=have_tr;
	table couple / noprint out=have_freq (keep=couple count);
run;

/* Retrieve the list of distinct couples (g1_g1, g1_g2, ...) in macrovariable &amp;amp;list_couple */

proc sql noprint;
	select distinct couple into: list_couple separated by " " from have_freq;
quit;

/* Create the matrix table */

proc transpose data=have_freq out=have_tr2 (drop=_name_ _label_);
	var count;
	ID couple;
run;

proc transpose data=have(drop=basket) out=structure (keep=_name_ rename=(_name_=V1));
	var _numeric_;
run;

data want;
	set structure;
	if _n_ = 1 then set have_tr2;
	array _matrix(*) &amp;amp;list_letters; /* g1 g2 g3 g4 */
	array _ref(*) &amp;amp;list_couple; 	/* g1_g1 g1_g2 g1_g3 g1_g4 g2_g2 g2_g3 g2_g4 g3_g3 g3_g4 g4_g4 */
	do i=1 to dim(_matrix);
		do j=1 to dim(_ref);
			if (scan(vname(_ref(j)),1,"_") = V1 
			   and scan(vname(_ref(j)),2,"_") = vname(_matrix(i)))
			  or
			   (scan(vname(_ref(j)),2,"_") = V1 
			   and scan(vname(_ref(j)),1,"_") = vname(_matrix(i)))
			   then _matrix(i) = _ref(j);
			if _matrix(i) = . then _matrix(i) = 0;
		end;
	end;
	keep V1 &amp;amp;list_letters;
run;

proc print data=want;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture d’écran 2020-03-12 à 13.18.02.png" style="width: 126px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/36793i466676864D94C6DC/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture d’écran 2020-03-12 à 13.18.02.png" alt="Capture d’écran 2020-03-12 à 13.18.02.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best,&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2020 12:48:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631529#M187126</guid>
      <dc:creator>ed_sas_member</dc:creator>
      <dc:date>2020-03-12T12:48:38Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631530#M187127</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input basket g1 g2 g3 g4;
datalines;
1 1 0 1 0
2 0 0 1 1
3 1 1 1 0
4 1 1 0 1
5 0 1 1 0
6 0 0 0 1
;
run;
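/* With 0/1 flags, the SSCP (sums of squares and crossproducts) matrix counts the baskets containing both products */
proc corr data=have out=want(where=(_type_='SSCP')) sscp noprint;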
var g1-g4;
run;
proc print data=want;run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 12 Mar 2020 12:19:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631530#M187127</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-03-12T12:19:08Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631534#M187129</link>
      <description>&lt;P&gt;Maxim 7 at work.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2020 12:32:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631534#M187129</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-03-12T12:32:03Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631535#M187130</link>
      <description>Thanks a lot for this detailed answer. I tried Ksharp's answer and it works with a lot less code - so will go ahead with that.&lt;BR /&gt;&lt;BR /&gt;Really appreciate the help!</description>
      <pubDate>Thu, 12 Mar 2020 12:38:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631535#M187130</guid>
      <dc:creator>Marc_y</dc:creator>
      <dc:date>2020-03-12T12:38:29Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631536#M187131</link>
      <description>This is perfect - thank you! You have helped me a ton!</description>
      <pubDate>Thu, 12 Mar 2020 12:38:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631536#M187131</guid>
      <dc:creator>Marc_y</dc:creator>
      <dc:date>2020-03-12T12:38:49Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating Product Overlap in a Basket</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631542#M187135</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/209336"&gt;@Marc_y&lt;/a&gt;,&amp;nbsp;that is a great, precise answer by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&amp;nbsp;and a nicely detailed one by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/292097"&gt;@ed_sas_member&lt;/a&gt;. To my mind, one way to understand the problem, building on your initial thought of using arrays, is to break it into pieces and devise an algorithm in &lt;EM&gt;steps like&lt;/EM&gt;:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;1. Look at pairs&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;2. Check whether both items of the pair are in the basket (both flags equal 1)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;3. Roll up&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;4. Transpose to wide to meet the stated requirement&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In SAS syntax, one possibility is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;

data have;
  input basket g1 g2 g3 g4;
datalines;
1 1 0 1 0
2 0 0 1 1
3 1 1 1 0
4 1 1 0 1
5 0 1 1 0
6 0 0 0 1
;
run;
/*Get/compare pairs*/
data temp;
 set have;
 array t g1-g4;
 length item $5;
 do g=1 to dim(t);
  do j=1 to dim(t);
   item=cats('g',g);
   p1=t(g);
   p2=t(j);
   if sum(p1,p2)=2 then output;
  end;
 end;
 keep item j p:;
run;
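/*Roll up eligible pairs: every kept row has p1=1, so sum(p1) counts the baskets containing both items*/
proc sql;
create table temp1 as
select item,j,sum(p1) as s
from temp
group by item,j
order by item,j; /*PROC TRANSPOSE with BY item needs sorted input*/
quit;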
/*To get the wide structure*/
proc transpose data=temp1 out=want(drop=_:) prefix=g;
by item;
var s;
id j;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Of course, the above may require slight adjustments to get the &lt;STRONG&gt;appropriate variable names&lt;/STRONG&gt; in place, as in your original dataset. The idea is to validate the thought process/approach &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;&lt;/P&gt;
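&lt;P&gt;As a rough sketch of that variable-name adjustment (not a tested production solution), the same pairwise idea can carry the real variable names with VNAME() instead of building them with cats('g',g). The dataset names pairs, counts and want_named, and the variables other and i, are made up for illustration. It assumes the sample dataset have from above and that every product flag is 0 or 1.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: emit one row per pair of products present in the same basket */
data pairs;
  set have;
  array t g1-g4;              /* assumption: swap in your own product list here */
  length item other $32;      /* 32 = maximum length of a SAS variable name */
  do i = 1 to dim(t);
    if t(i) = 1 then do j = 1 to dim(t);
      if t(j) = 1 then do;
        item  = vname(t(i));  /* row label: actual variable name */
        other = vname(t(j));  /* column label: actual variable name */
        output;
      end;
    end;
  end;
  keep item other;
run;

/* Count baskets per pair; sort so that BY item works in PROC TRANSPOSE */
proc sql;
  create table counts as
  select item, other, count(*) as s
  from pairs
  group by item, other
  order by item, other;
quit;

/* Wide layout: one row per item, one column per product */
proc transpose data=counts out=want_named(drop=_name_);
  by item;
  var s;
  id other;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Pairs that never occur together come out as missing rather than 0, and with 150 million baskets the long intermediate table can get very large, so the PROC CORR approach is probably the better fit for the full data.&lt;/P&gt;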
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For someone who mentions "&lt;SPAN&gt;&amp;nbsp;&lt;EM&gt;but my knowledge in SAS is too limited to work out how exactly this is done&lt;/EM&gt;", I really do appreciate your thought process. It is very noteworthy.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2020 13:06:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Calculating-Product-Overlap-in-a-Basket/m-p/631542#M187135</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2020-03-12T13:06:29Z</dc:date>
    </item>
  </channel>
</rss>

