BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
alepage
Barite | Level 11

Hello,

 

I have two datasets out1 and out2 which are sorted by LOB MAIN_INSRR_CO_NBR PROVINCE CIEDISTR

Each dataset has 40 observations.

 

Among the aboved-mentioned variables, I have N and FTPREM which are numerical variables.

 

I would like to get a report of the difference and the percentage of N and FTPREM and that for each observations even if the difference is zero but no comparison for the character variables such as LOB MAIN_INSRR_CO_NBR PROVINCE CIEDISTR.

 

proc compare data=OUT1 compare=OUT2 out=report outall ;
id rownumber;
run;

What I would like to get is only;

 

       rownumber                  base    compare   diff %diff

                                                  N           N

1

2

3

...

38

39

40

 

       rownumber                  base                   compare   diff %diff

                                              FTPREM           FTPREM

1

2

3

...

38

39

40

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The structure of the output data set is what it is. Reshaping the data afterwards if desired is another step.

Did you try either of a data step merge or Sql join?

With merge and the same named variables to use both

 

data combined
    merge data1 
               date2 (rename=(var1=var1from2 var2=var2from2)
   ;
   by <by variables>;
  /* compare*/
  diffvar1 = var1-var1from2;
  diffvar2 = var2-var2from2;
run;

Calculation of percentage is left to the interested reader but follows the pattern of diff.

If you have cases where both sets don't contribute the by variables the results are missing. If you only want to compare where the by variables are present then use the IN= data set options to get only the observations where both sets contribute.

View solution in original post

4 REPLIES 4
ballardw
Super User

If you only want to compare some variables place their names on a VAR statement. If you want to compare variables on the VAR statement with specific other variables us a WITH statement to provide the names.

 

If the observations are to be paired use a BY statement to make sure that you are comparing Bob to Bob and not Bob to Mary. These would be the variables sorted by.

 

Did you try the OUTPERCENT or OUTDIF options?

The data set will not look like your example.

 

alepage
Barite | Level 11

I have tried this:

proc compare data=OUT1 compare=OUT2 out=report allobs outdiff outpercent  ;
id rownumber;
run;

But here's what I get in report:

 

_TYPE_ _OBS_ Rownumber MAIN_INSRR_CO_NBR N LOB PROVINCE FTPREM CIEDISTR
DIF 1 1 ..... -20 ........ ..... -27290 .
PERCENT 1 1 ..... -0.021734878 ........ ..... -0.024059115 .
DIF 2 2 ..... 0 ........ ..... -1941 .
PERCENT 2 2 ..... 0 ........ ..... -0.001226494 .
DIF 3 3 ..... 0 ........ ..... 0 .
PERCENT 3 3 ..... 0 ........ ..... 0 .
DIF 4 4 ..... 0 ........ ..... 0 .
PERCENT 4 4 ..... 0 ........ ..... 0 .
DIF 5 5 ..... 0 ........ ..... 0 .
PERCENT 5 5 ..... 0 ........ ..... 0 .
DIF 6 6 ..... 0 ........ ..... 0 .
PERCENT 6 6 ..... 0 ........ ..... 0 .
DIF 7 7 ..... 0 ........ ..... -1099 .
PERCENT 7 7 ..... 0 ........ ..... -0.000992398 .
DIF 8 8 ..... 0 ........ ..... 0 .
PERCENT 8 8 ..... 0 ........ ..... 0 .
DIF 9 9 ..... 0 ........ ..... 0 .
PERCENT 9 9 ..... 0 ........ ..... 0 .
DIF 10 10 ..... 0 ........ ..... 0 .
PERCENT 10 10 ..... 0 ........ ..... 0 .
DIF 11 11 ..... -3 ........ ..... -9179 .
PERCENT 11 11 ..... -0.000619437 ........ ..... -0.00081673 .
DIF 12 12 ..... 0 ........ ..... 0 .
PERCENT 12 12 ..... 0 ........ ..... 0 .
DIF 13 13 ..... 0 ........ ..... 0 .
PERCENT 13 13 ..... 0 ........ ..... 0 .
DIF 14 14 ..... 0 ........ ..... 0 .
PERCENT 14 14 ..... 0 ........ ..... 0 .
DIF 15 15 ..... 0 ........ ..... 0 .
PERCENT 15 15 ..... 0 ........ ..... 0 .
DIF 16 16 ..... -3 ........ ..... -3444 .
PERCENT 16 16 ..... -0.001229075 ........ ..... -0.001145078 .
DIF 17 17 ..... 0 ........ ..... 0 .
PERCENT 17 17 ..... 0 ........ ..... 0 .
DIF 18 18 ..... 0 ........ ..... 0 .
PERCENT 18 18 ..... 0 ........ ..... 0 .
DIF 19 19 ..... 0 ........ ..... 0 .
PERCENT 19 19 ..... 0 ........ ..... 0 .
DIF 20 20 ..... 0 ........ ..... -192 .
PERCENT 20 20 ..... 0 ........ ..... -0.000345494 .
DIF 21 21 ..... -3 ........ ..... -7230 .
PERCENT 21 21 ..... -0.005198406 ........ ..... -0.008828075 .
DIF 22 22 ..... 0 ........ ..... 0 .
PERCENT 22 22 ..... 0 ........ ..... 0 .
DIF 23 23 ..... 0 ........ ..... 0 .
PERCENT 23 23 ..... 0 ........ ..... 0 .
DIF 24 24 ..... -5 ........ ..... 0 .
PERCENT 24 24 ..... -0.018471996 ........ ..... 0 .
DIF 25 25 ..... 0 ........ ..... 0 .
PERCENT 25 25 ..... 0 ........ ..... 0 .
DIF 26 26 ..... 0 ........ ..... 0 .
PERCENT 26 26 ..... 0 ........ ..... 0 .
DIF 27 27 ..... 0 ........ ..... 0 .
PERCENT 27 27 ..... 0 ........ ..... 0 .
DIF 28 28 ..... -1 ........ ..... -289 .
PERCENT 28 28 ..... -0.002324716 ........ ..... -0.000471576 .
DIF 29 29 ..... 0 ........ ..... 0 .
PERCENT 29 29 ..... 0 ........ ..... 0 .
DIF 30 30 ..... 0 ........ ..... 0 .
PERCENT 30 30 ..... 0 ........ ..... 0 .
DIF 31 31 ..... 0 ........ ..... 0 .
PERCENT 31 31 ..... 0 ........ ..... 0 .
DIF 32 32 ..... -29 ........ ..... -8298 .
PERCENT 32 32 ..... -0.010134509 ........ ..... -0.002834086 .
DIF 33 33 ..... 0 ........ ..... 0 .
PERCENT 33 33 ..... 0 ........ ..... 0 .
DIF 34 34 ..... 0 ........ ..... 0 .
PERCENT 34 34 ..... 0 ........ ..... 0 .
DIF 35 35 ..... 0 ........ ..... 0 .
PERCENT 35 35 ..... 0 ........ ..... 0 .
DIF 36 36 ..... 0 ........ ..... 0 .
PERCENT 36 36 ..... 0 ........ ..... 0 .
DIF 37 37 ..... -20 ........ ..... -383 .
PERCENT 37 37 ..... -0.011139207 ........ ..... -0.00021724 .
DIF 38 38 ..... 0 ........ ..... 0 .
PERCENT 38 38 ..... 0 ........ ..... 0 .
DIF 39 39 ..... -16 ........ ..... -583 .
PERCENT 39 39 ..... -0.019266895 ........ ..... -0.000709676 .
DIF 40 40 ..... 0 ........ ..... 0 .
PERCENT 40 40 ..... 0 ........ ..... 0 .

I dont see any value for ciedistr province, lob and Main_insrr_co_Nbr.  Also, the diff and percentage are not on the same line

ballardw
Super User

The structure of the output data set is what it is. Reshaping the data afterwards if desired is another step.

Did you try either of a data step merge or Sql join?

With merge and the same named variables to use both

 

data combined
    merge data1 
               date2 (rename=(var1=var1from2 var2=var2from2)
   ;
   by <by variables>;
  /* compare*/
  diffvar1 = var1-var1from2;
  diffvar2 = var2-var2from2;
run;

Calculation of percentage is left to the interested reader but follows the pattern of diff.

If you have cases where both sets don't contribute the by variables the results are missing. If you only want to compare where the by variables are present then use the IN= data set options to get only the observations where both sets contribute.

alepage
Barite | Level 11
I have decided to go for a SQL script from which I have more control.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 471 views
  • 0 likes
  • 2 in conversation