Comapare two table on observation wise and also calculate difference f...

subrat1 · Posted 02-03-2018 03:22 PM

I have two table as listed below:

first table:

Pol	Category	premium
3	g	10
6	g	3
9	h	7
12	g	20
15	g	12
18	g	100

Second table:

Pol	Category	Premium
3	x	4
6	g	2
102	g
15	z	12
12	g	15
18	g	200

I want my output to be :

Pol	Category	premium	Pol	Category	Premium	Premium_Difference
3	g	10	3	x	4	6
6	g	3	6	g	2	1
9	h	7	NA	NA	NA	7
NA	NA	NA	102	g	10	10
12	g	20	12	g	15	5
15	g	12	15	z	12	0
18	g	100	18	g	200	-100

Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table

Tom · Posted 02-03-2018 03:34 PM

Did you try PROC COMPARE?

subrat1 · Posted 02-03-2018 04:00 PM

Yes, i tried proc compare but it does not give the output that is
required

Shmuel · Posted 02-03-2018 04:26 PM

If the relation between the tables is 1:1 (no duplicate POL in any of the two tables) then merge

the tables by POL, rename variables of 2nd table and do compare the variables, one by one, in a datastep.

You may need create a foemat to show 'NA' instead a missing value.

Tom · Posted 02-03-2018 05:29 PM

If you don't like the standard PROC COMPARE output then use it to produce a difference dataset and reform that.

data one;
  input Pol Category $ premium @@;
cards;
3 g 10 6 g 3 9 h 7 12 g 20 15 g 12 18 g 100
;

data two ;
  input Pol Category $ premium @@;
cards;
3 x 4 6 g 2 102 g . 15 z 12 12 g 15 18 g 200
;

proc compare data=one compare=two noprint  out=diffs outall;
run;

proc transpose data=diffs out=tall ;
  where _type_ ne 'PERCENT';
  by _obs_ _type_;
  var pol category premium;
run;
proc transpose data=tall out=want(drop=_name_) delim=_;
  by _obs_  ;
  id _name_ _type_ ;
  var col1 ;
run;

kiranv_ · Posted 02-03-2018 06:57 PM

Another way

data have1;
infile datalines dsd  DLM='09'x;
input Pol	Category $	premium;
datalines;	 	 	 
3	g	10
6	g	3
9	h	7
12	g	20
15	g	12
18	g	100
 ;

data have2;
infile datalines dsd  DLM='09'x;
input Pol	Category $	premium;
datalines;
3	x	4
6	g	2
102	g	.
15	z	12
12	g	15
18	g	200
;

proc sql;
select 
case when a.pol is missing then 'NA' 
     else put(a.pol,8.)
end as pol,
case when a.category is missing 
     then 'NA' else a.category
end as category,
case when a.premium is missing 
     then 'NA' else put(a.premium,8.)
end as premium,
case when b.pol is missing 
     then 'NA' else put(b.pol,8.)
end as pol,
case when b.category is missing 
      then 'NA' else b.category
end as category,
case when b.premium is missing 
     then 'NA' else put(b.premium,8.)
end as premium

from have1 as a
full join 
have2 as b
on a.pol = b.pol;

Reeza · Posted 02-03-2018 08:26 PM

1. Sort your data

2. Rename your variables so that when they're merged they don't overwrite each other.

3. Merge

4. Calculate Difference and Assign NA to missing values.

Note that numeric variables can't have character values so you'll need a format or convert everything to characters. I've left that as an exercise to you.

proc sort data=one; 
by pol;

proc sort data=two;
by pol;
run;

data want;
merge one (rename=(Category = Cat1 Premium=Prem1)) /*rename*/
        two (rename=(Category = Cat2 Premium=Prem2));

by Pol;

Diff = Prem1 - Prem2;

run;

@subrat1 wrote:

I have two table as listed below:

first table:

Pol Category premium

3 g 10

6 g 3

9 h 7

12 g 20

15 g 12

18 g 100

Second table:

Pol Category Premium

3 x 4

6 g 2

102 g

15 z 12

12 g 15

18 g 200

I want my output to be :

Pol Category premium Pol Category Premium Premium_Difference

3 g 10 3 x 4 6

6 g 3 6 g 2 1

9 h 7      NA NA           NA 7

     NA NA            NA 102 g 10 10

12 g 20 12 g 15 5

15 g 12 15 z 12 0

18 g 100 18 g 200 -100

Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table

ballardw · Posted 02-05-2018 11:45 AM

@subrat1 wrote:

I have two table as listed below:

first table:

Pol Category premium

3 g 10

6 g 3

9 h 7

12 g 20

15 g 12

18 g 100

Second table:

Pol Category Premium

3 x 4

6 g 2

102 g

15 z 12

12 g 15

18 g 200

I want my output to be :

Pol Category premium Pol Category Premium Premium_Difference

3 g 10 3 x 4 6

6 g 3 6 g 2 1

9 h 7      NA NA           NA 7

     NA NA            NA 102 g 10 10

12 g 20 12 g 15 5

15 g 12 15 z 12 0

18 g 100 18 g 200 -100

Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table

You can only have one variable name POL, Category or Premium. Change the names in your output to clearly identify which data set contributes which value. Also if your variable premium in NUMERIC it cannot have the Value NA, It may have a value of MISSING, which you can display as NA but not the value.

Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

Re: Comapare two table on observation wise and also calculate difference for numeric columns

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away