There are two table , I want to count the policy no that are common in both table based on categories, I also want to count the policy no that are present in one table and not present in another table, as shown below:
First Table: Second Table:
Pol | Category | Pol | Category | |||
3 | g | 3 | x | |||
6 | g | 6 | g | |||
9 | g | 102 | g | |||
12 | g | 12 | g | |||
15 | g | 15 | g | |||
18 | g | 18 | g | |||
1 | x | 1 | x | |||
4 | x | 4 | x | |||
7 | x | 7 | g | |||
10 | x | 10 | x | |||
13 | x | 13 | x | |||
16 | x | 16 | x | |||
2 | y | 21 | y | |||
5 | y | 5 | y | |||
8 | y | 8 | y | |||
11 | y | 11 | y | |||
14 | y | 14 | y | |||
17 | y | 17 | y |
I want my output/result to be :
| Intersection | Only Table1 | only Table2 | |
g | 4 | 2 | 2 | |
x | 5 | 1 | 1 | |
y | 5 | 1 | 1 |
Logic: Category g has 4 common policy between two table, similarly category g has 2 policy in table1 that are not present in second table2, similarly category g has 2 policy in table2 that are not present in first table1......Same logic for policy x and y
there could be simpler way to do this. This is one way to do this
data have1;
infile datalines dsd DLM='09'x;
input Pol Category $1.;
datalines;
3 g
6 g
9 g
12 g
15 g
18 g
1 x
4 x
7 x
10 x
13 x
16 x
2 y
5 y
8 y
11 y
14 y
17 y
;
data have2;
infile datalines dsd DLM='09'x;
input Pol Category $1.;
datalines;
3 x
6 g
102 g
12 g
15 g
18 g
1 x
4 x
7 g
10 x
13 x
16 x
21 y
5 y
8 y
11 y
14 y
17 y
;
proc sql;
create table want as
select coalesce(x.category, y.category, z.category) as category
, intersection, only_table1, only_table2 from
(select a.category, count(a.category) as intersection
from have1 as a
inner join
have2 as b
on a.pol = b.pol
and a.category = b.category
group by a.category)x
full join
(select a.category, count(a.category) as only_table1
from have1 as a
left join
have2 as b
on a.pol = b.pol
and a.category = b.category
where b.pol = .
group by a.category)y
on x.category =y.category
full join
(select b.category, count(b.category) as only_table2
from have1 as a
right join
have2 as b
on a.pol = b.pol
and a.category = b.category
where a.pol = .
group by b.category)z
on x.category =z.category;
/*creating one and two datasets*/
data one two;
do _n_=1 to 2;
input Pol Category $ @@;
if _n_=1 then output one;
else output two;
end;
datalines;
3 g 3 x
6 g 6 g
9 g 102 g
12 g 12 g
15 g 15 g
18 g 18 g
1 x 1 x
4 x 4 x
7 x 7 g
10 x 10 x
13 x 13 x
16 x 16 x
2 y 21 y
5 y 5 y
8 y 8 y
11 y 11 y
14 y 14 y
17 y 17 y
;
proc sort data=one;
by category pol;
run;
proc sort data=two;
by category pol;
run;
data want;
merge one(in=a) two(in=b);
by category pol;
if first.category then call missing(intersection,only_table1,only_table2);
if a and not b then only_table1+1;
else if a and b then intersection+1;
else if not a and b then only_table2+1;
if last.category;
drop pol ;
run;
Proc sql:
data one two;
do _n_=1 to 2;
input Pol Category $ @@;
if _n_=1 then output one;
else output two;
end;
datalines;
3 g 3 x
6 g 6 g
9 g 102 g
12 g 12 g
15 g 15 g
18 g 18 g
1 x 1 x
4 x 4 x
7 x 7 g
10 x 10 x
13 x 13 x
16 x 16 x
2 y 21 y
5 y 5 y
8 y 8 y
11 y 11 y
14 y 14 y
17 y 17 y
;
proc sql;
create table want as
select A.*,only_table1,only_table2
from
(select category,count(category) as intersection
from (select * from one intersect select * from two)
group by category) a
full join
(select category,count(category) as only_table1
from (select * from one except select * from two)
group by category) b
on a.category=b.category
full join
(select category,count(category) as only_table2
from (select * from two except select * from one)
group by category) c
on b.category=c.category;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.