There are two table , I want to count the policy no that are common in both table based on categories, I also want to count the policy no that are present in one table and not present in another table, as shown below:
First Table: Second Table:
Pol | Category | Pol | Category | |||
3 | g | 3 | x | |||
6 | g | 6 | g | |||
9 | g | 102 | g | |||
12 | g | 12 | g | |||
15 | g | 15 | g | |||
18 | g | 18 | g | |||
1 | x | 1 | x | |||
4 | x | 4 | x | |||
7 | x | 7 | g | |||
10 | x | 10 | x | |||
13 | x | 13 | x | |||
16 | x | 16 | x | |||
2 | y | 21 | y | |||
5 | y | 5 | y | |||
8 | y | 8 | y | |||
11 | y | 11 | y | |||
14 | y | 14 | y | |||
17 | y | 17 | y |
I want my output/result to be :
| Intersection | Only Table1 | only Table2 | |
g | 4 | 2 | 2 | |
x | 5 | 1 | 1 | |
y | 5 | 1 | 1 |
Logic: Category g has 4 common policy between two table, similarly category g has 2 policy in table1 that are not present in second table2, similarly category g has 2 policy in table2 that are not present in first table1......Same logic for policy x and y
there could be simpler way to do this. This is one way to do this
data have1;
infile datalines dsd DLM='09'x;
input Pol Category $1.;
datalines;
3 g
6 g
9 g
12 g
15 g
18 g
1 x
4 x
7 x
10 x
13 x
16 x
2 y
5 y
8 y
11 y
14 y
17 y
;
data have2;
infile datalines dsd DLM='09'x;
input Pol Category $1.;
datalines;
3 x
6 g
102 g
12 g
15 g
18 g
1 x
4 x
7 g
10 x
13 x
16 x
21 y
5 y
8 y
11 y
14 y
17 y
;
proc sql;
create table want as
select coalesce(x.category, y.category, z.category) as category
, intersection, only_table1, only_table2 from
(select a.category, count(a.category) as intersection
from have1 as a
inner join
have2 as b
on a.pol = b.pol
and a.category = b.category
group by a.category)x
full join
(select a.category, count(a.category) as only_table1
from have1 as a
left join
have2 as b
on a.pol = b.pol
and a.category = b.category
where b.pol = .
group by a.category)y
on x.category =y.category
full join
(select b.category, count(b.category) as only_table2
from have1 as a
right join
have2 as b
on a.pol = b.pol
and a.category = b.category
where a.pol = .
group by b.category)z
on x.category =z.category;
/*creating one and two datasets*/
data one two;
do _n_=1 to 2;
input Pol Category $ @@;
if _n_=1 then output one;
else output two;
end;
datalines;
3 g 3 x
6 g 6 g
9 g 102 g
12 g 12 g
15 g 15 g
18 g 18 g
1 x 1 x
4 x 4 x
7 x 7 g
10 x 10 x
13 x 13 x
16 x 16 x
2 y 21 y
5 y 5 y
8 y 8 y
11 y 11 y
14 y 14 y
17 y 17 y
;
proc sort data=one;
by category pol;
run;
proc sort data=two;
by category pol;
run;
data want;
merge one(in=a) two(in=b);
by category pol;
if first.category then call missing(intersection,only_table1,only_table2);
if a and not b then only_table1+1;
else if a and b then intersection+1;
else if not a and b then only_table2+1;
if last.category;
drop pol ;
run;
Proc sql:
data one two;
do _n_=1 to 2;
input Pol Category $ @@;
if _n_=1 then output one;
else output two;
end;
datalines;
3 g 3 x
6 g 6 g
9 g 102 g
12 g 12 g
15 g 15 g
18 g 18 g
1 x 1 x
4 x 4 x
7 x 7 g
10 x 10 x
13 x 13 x
16 x 16 x
2 y 21 y
5 y 5 y
8 y 8 y
11 y 11 y
14 y 14 y
17 y 17 y
;
proc sql;
create table want as
select A.*,only_table1,only_table2
from
(select category,count(category) as intersection
from (select * from one intersect select * from two)
group by category) a
full join
(select category,count(category) as only_table1
from (select * from one except select * from two)
group by category) b
on a.category=b.category
full join
(select category,count(category) as only_table2
from (select * from two except select * from one)
group by category) c
on b.category=c.category;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.