I have two table as listed below:
first table:
Pol | Category | premium |
3 | g | 10 |
6 | g | 3 |
9 | h | 7 |
12 | g | 20 |
15 | g | 12 |
18 | g | 100 |
Second table:
Pol | Category | Premium |
3 | x | 4 |
6 | g | 2 |
102 | g | |
15 | z | 12 |
12 | g | 15 |
18 | g | 200 |
I want my output to be :
Pol | Category | premium | Pol | Category | Premium | Premium_Difference |
3 | g | 10 | 3 | x | 4 | 6 |
6 | g | 3 | 6 | g | 2 | 1 |
9 | h | 7 | NA | NA | NA | 7 |
NA | NA | NA | 102 | g | 10 | 10 |
12 | g | 20 | 12 | g | 15 | 5 |
15 | g | 12 | 15 | z | 12 | 0 |
18 | g | 100 | 18 | g | 200 | -100 |
Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table
Did you try PROC COMPARE?
If the relation between the tables is 1:1 (no duplicate POL in any of the two tables) then merge
the tables by POL, rename variables of 2nd table and do compare the variables, one by one, in a datastep.
You may need create a foemat to show 'NA' instead a missing value.
If you don't like the standard PROC COMPARE output then use it to produce a difference dataset and reform that.
data one;
input Pol Category $ premium @@;
cards;
3 g 10 6 g 3 9 h 7 12 g 20 15 g 12 18 g 100
;
data two ;
input Pol Category $ premium @@;
cards;
3 x 4 6 g 2 102 g . 15 z 12 12 g 15 18 g 200
;
proc compare data=one compare=two noprint out=diffs outall;
run;
proc transpose data=diffs out=tall ;
where _type_ ne 'PERCENT';
by _obs_ _type_;
var pol category premium;
run;
proc transpose data=tall out=want(drop=_name_) delim=_;
by _obs_ ;
id _name_ _type_ ;
var col1 ;
run;
Another way
data have1;
infile datalines dsd DLM='09'x;
input Pol Category $ premium;
datalines;
3 g 10
6 g 3
9 h 7
12 g 20
15 g 12
18 g 100
;
data have2;
infile datalines dsd DLM='09'x;
input Pol Category $ premium;
datalines;
3 x 4
6 g 2
102 g .
15 z 12
12 g 15
18 g 200
;
proc sql;
select
case when a.pol is missing then 'NA'
else put(a.pol,8.)
end as pol,
case when a.category is missing
then 'NA' else a.category
end as category,
case when a.premium is missing
then 'NA' else put(a.premium,8.)
end as premium,
case when b.pol is missing
then 'NA' else put(b.pol,8.)
end as pol,
case when b.category is missing
then 'NA' else b.category
end as category,
case when b.premium is missing
then 'NA' else put(b.premium,8.)
end as premium
from have1 as a
full join
have2 as b
on a.pol = b.pol;
1. Sort your data
2. Rename your variables so that when they're merged they don't overwrite each other.
3. Merge
4. Calculate Difference and Assign NA to missing values.
Note that numeric variables can't have character values so you'll need a format or convert everything to characters. I've left that as an exercise to you.
proc sort data=one;
by pol;
proc sort data=two;
by pol;
run;
data want;
merge one (rename=(Category = Cat1 Premium=Prem1)) /*rename*/
two (rename=(Category = Cat2 Premium=Prem2));
by Pol;
Diff = Prem1 - Prem2;
run;
@subrat1 wrote:
I have two table as listed below:
first table:
Pol Category premium 3 g 10 6 g 3 9 h 7 12 g 20 15 g 12 18 g 100
Second table:
Pol Category Premium 3 x 4 6 g 2 102 g 15 z 12 12 g 15 18 g 200
I want my output to be :
Pol Category premium Pol Category Premium Premium_Difference 3 g 10 3 x 4 6 6 g 3 6 g 2 1 9 h 7 NA NA NA 7 NA NA NA 102 g 10 10 12 g 20 12 g 15 5 15 g 12 15 z 12 0 18 g 100 18 g 200 -100
Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table
@subrat1 wrote:
I have two table as listed below:
first table:
Pol Category premium 3 g 10 6 g 3 9 h 7 12 g 20 15 g 12 18 g 100
Second table:
Pol Category Premium 3 x 4 6 g 2 102 g 15 z 12 12 g 15 18 g 200
I want my output to be :
Pol Category premium Pol Category Premium Premium_Difference 3 g 10 3 x 4 6 6 g 3 6 g 2 1 9 h 7 NA NA NA 7 NA NA NA 102 g 10 10 12 g 20 12 g 15 5 15 g 12 15 z 12 0 18 g 100 18 g 200 -100
Logic: If same pol is present in both table then put them in same row and calculate the difference, but if it present in one table then put NA for other table
You can only have one variable name POL, Category or Premium. Change the names in your output to clearly identify which data set contributes which value. Also if your variable premium in NUMERIC it cannot have the Value NA, It may have a value of MISSING, which you can display as NA but not the value.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.