<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to count match between two variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505433#M135348</link>
    <description>&lt;P&gt;SQL set operations provide a straightforward way to do this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b
;

proc sql;
select * 
from

(select name1, name2, count(*) as overlap
from
(select name1, name2, x1 as x from have1
 intersect
 select name1, name2, x2 as x from have1)
group by name1, name2) 

natural join

(select name1, name2, count(*) as all
from
(select name1, name2, x1 as x from have1
 union
 select name1, name2, x2 as x from have1)
group by name1, name2);

quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Notes: UNION and INTERSECT operations provide DISTINCT observations (unless the ALL keyword is specified). NATURAL JOIN performs a join on all same-named columns.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Oct 2018 05:02:57 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2018-10-18T05:02:57Z</dc:date>
    <item>
      <title>How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505359#M135322</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Seems my last post is complicated. I am trying to simplify my case.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have dataset like this&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b

;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. I would like to compute the number of OVERLAP between distinct X1 and X2. Group by NAME1, NAME2.&lt;/P&gt;&lt;P&gt;2. Take the result of the 1st step, and divide it by the number of total distinct value in X1 and X2. Group by NAME1, NAME2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the example of what I want.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA WANT;
    INPUT (NAME1 NAME2 OVERLAP OVERLAPALL) (:$8.);
    CARDS;
Z J 2 2/3     
X C 2 2/3
;&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* In first row, OVERLAP is 2 because X1 and X2 only have 'a' 'b' in common between Z and J. There is only 2 overlap. 2/3 means OVERLAP divided by the total # of distinct value of both X1 and X2 between Z and J*/&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;So it is like this:&lt;/P&gt;&lt;P&gt;OVERLAP : count(distinct X1 = distinct X2).&lt;/P&gt;&lt;P&gt;OVERLAPALL: count(distinct X1 = distinct X2) / count(distinct X1, X2).&lt;/P&gt;&lt;P&gt;All the computation above is based on GROUP BY NAME1, NAME2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Would you please help me with this? Hopefully I make myself clear.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you all.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:37:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505359#M135322</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-17T21:37:21Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505362#M135325</link>
      <description>&lt;P&gt;Can you format the data have to make it readable plz&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:29:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505362#M135325</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-10-17T21:29:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505363#M135326</link>
      <description>&lt;P&gt;Sorry.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I modify it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this works.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:32:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505363#M135326</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-17T21:32:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505367#M135329</link>
      <description>&lt;P&gt;for the group z j&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Z J a a&lt;BR /&gt;Z J b a&lt;BR /&gt;Z J c b&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;a &amp;amp; b seems to be found? doesn't that make it 2/3?&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:42:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505367#M135329</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-10-17T21:42:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505369#M135331</link>
      <description>&lt;P&gt;My mistake.&lt;/P&gt;&lt;P&gt;I have changed it.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:43:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505369#M135331</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-17T21:43:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505370#M135332</link>
      <description>&lt;P&gt;And you are right.&lt;/P&gt;&lt;P&gt;It is supposed to be 2/3. and 2 OVERLAP.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 21:43:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505370#M135332</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-17T21:43:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505378#M135335</link>
      <description>&lt;P&gt;Hi for the just OVERLAP&lt;/P&gt;
&lt;P&gt;it's easy&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b

;
run;
proc sql;
create table want as
select a.name1,a.name2,count( distinct a.x1=b.x2 )as overlap
from have1 a, have1(keep=name: x2)b
where a.name1=b.name1 and a.name2=b.name2
group by a.name1,a.name2;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For OVERLAP ALL,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;i. To get the distinct count of all values x1,x2 combined which is a,b,c is 3 will require some logic.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I do have a solution in my mind, but if it's not urgent, shall we wait for somebody else to take a shot first ?Coz I am expecting a better solution and robust solution from somebody else. Let's give it sometime&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 22:09:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505378#M135335</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-10-17T22:09:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505397#M135338</link>
      <description>&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;At first I though this is can be done in a few steps. But it seems harder than I thought. So I have to come here and ask for some help.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Oct 2018 23:35:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505397#M135338</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-17T23:35:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505433#M135348</link>
      <description>&lt;P&gt;SQL set operations provide a straightforward way to do this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b
;

proc sql;
select * 
from

(select name1, name2, count(*) as overlap
from
(select name1, name2, x1 as x from have1
 intersect
 select name1, name2, x2 as x from have1)
group by name1, name2) 

natural join

(select name1, name2, count(*) as all
from
(select name1, name2, x1 as x from have1
 union
 select name1, name2, x2 as x from have1)
group by name1, name2);

quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Notes: UNION and INTERSECT operations provide DISTINCT observations (unless the ALL keyword is specified). NATURAL JOIN performs a join on all same-named columns.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Oct 2018 05:02:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505433#M135348</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-10-18T05:02:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505626#M135429</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b
;

proc transpose data=have1 out=_have;
by name: x: notsorted;
var x:;
run;

proc sort data=_have out=__have;
by name: col1;
run;

data want;
set __have;
by name: col1;
if first.name2 then do; overlap=0;overlap_all=0;end;
if not first.col1 and lag(_name_) ne _name_ then overlap+1;
if first.col1 then overlap_all+1;
if last.name2;
want=overlap/overlap_all;
drop x: _name_ col1;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 18 Oct 2018 15:57:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505626#M135429</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-10-18T15:57:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505702#M135460</link>
      <description>&lt;P&gt;Fantastic.&lt;/P&gt;&lt;P&gt;First time know this command. Thank you very much.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then overlap/all will be my goal, which is kinda easy to do.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But what if I would like to also compute the following?&lt;/P&gt;&lt;P&gt;1. COUNT(DISTINCT X1) Group by NAME1.&amp;nbsp;&amp;nbsp; So that I could know the ratio of OVERLAP compared with X1.&lt;/P&gt;&lt;P&gt;2. COUNT(DISTINCT X2) Group by NAME1, NAME2. So that I could know the ratio of OVERLAP compared with X2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How to put those things in the same SQL?&lt;/P&gt;</description>
      <pubDate>Thu, 18 Oct 2018 19:01:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505702#M135460</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-18T19:01:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505728#M135468</link>
      <description>&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;It seems it is a loop way to do this.&lt;/P&gt;&lt;P&gt;Also great.&lt;/P&gt;&lt;P&gt;&amp;nbsp;Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Oct 2018 19:38:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505728#M135468</guid>
      <dc:creator>yanshuai</dc:creator>
      <dc:date>2018-10-18T19:38:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505771#M135483</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/188461"&gt;@yanshuai&lt;/a&gt;&amp;nbsp;Thank you acknowledging the solutions. Can you please mark the genie&amp;nbsp;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;solution as answered and close the thread unless you want to wait for more responses. Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Oct 2018 21:07:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505771#M135483</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-10-18T21:07:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to count match between two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505773#M135484</link>
      <description>&lt;P&gt;Here's how:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE1;
    INPUT (NAME1 NAME2 X1 X2) (:$8.);
    CARDS;
Z J a a
Z J b a
Z J c b
X C b a
X C b c
X C c b
;

proc sql;
select a.name1, a.name2, a.overlap, b.all, c.nbX1, c.nbX2
from

(select name1, name2, count(*) as overlap
from
(select name1, name2, x1 as x from have1
 intersect
 select name1, name2, x2 as x from have1)
group by name1, name2) as a

inner join

(select name1, name2, count(*) as all
from
(select name1, name2, x1 as x from have1
 union
 select name1, name2, x2 as x from have1)
group by name1, name2) as b 
    on a.name1=b.name1 and a.name2=b.name2

inner join

(select name1, name2, 
    count(distinct X1) as nbX1,
    count(distinct X2) as nbX2
 from have1
 group by name1, name2) as c 
    on a.name1=c.name1 and a.name2=c.name2

;

quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For some nasty reason, my SAS dies when I try to do this with natural joins &lt;img id="smileysad" class="emoticon emoticon-smileysad" src="https://communities.sas.com/i/smilies/16x16_smiley-sad.png" alt="Smiley Sad" title="Smiley Sad" /&gt;. That's why I went back to explicit joins.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Oct 2018 21:17:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-count-match-between-two-variables/m-p/505773#M135484</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-10-18T21:17:33Z</dc:date>
    </item>
  </channel>
</rss>

