PROC compare giving a difference even though values are same.

Contributor
Posts: 53

Hi,

My PROC COMPARE reports a difference even though both the base and compare datasets contain "0". I have used STRIP too.

Var2 is defined with length $30.

Base Value    Compare Value
Var2          Var2

0             0
0             0

 

Please advise.

 

Thanks,

Archana



All Replies
Trusted Advisor
Posts: 1,115

Hi Archana,

 

PROC COMPARE truncates long character values for display. So, you may want to let PROC COMPARE write the results to a dataset (which might be helpful in general for this issue).

 

Another possible explanation could be non-standard blank characters (e.g. 'A0'x) in Var2. These would not be affected by the STRIP function.
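
As a rough illustration of why STRIP does not help here, the sketch below builds a value containing 'A0'x, applies STRIP, and then looks for the character with INDEXC. The dataset name DEMO and the variables STRIPPED and POS_A0 are made up for this example.

```sas
/* Sketch only: DEMO, STRIPPED and POS_A0 are hypothetical names */
data demo;
  length Var2 stripped $30;
  Var2     = '0' || 'A0'x;             /* '0' followed by a protected blank   */
  stripped = strip(Var2);              /* removes ordinary blanks ('20'x) only */
  pos_a0   = indexc(stripped, 'A0'x);  /* position of first 'A0'x, 0 if none   */
  put stripped= $hex8. pos_a0=;        /* hex view of the first four bytes     */
run;
```

In the log, the hex display of STRIPPED should still show A0 directly after 30, and POS_A0 should be greater than 0, which means the value is not equal to a plain '0'.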

Trusted Advisor
Posts: 1,115

Here is an example:

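data have1;
length Var2 $30;
Var2='0'; output;
Var2='1'; output;
run;

data have2;
length Var2 $30;
Var2='0';
substr(Var2,16,1)='A0'x; /* protected blank ('A0'x) at pos. 16 */
substr(Var2,30,1)=',';   /* visible difference at pos. 30 */
output;
Var2='1'; output;
run;

/* write BASE, COMPARE and DIF records of unequal observations to CMP */
proc compare data=have1 compare=have2 out=cmp outbase outcomp outdif outnoequal;
run;

/* plain display: the 'A0'x looks like an ordinary blank */
proc print data=cmp noobs;
run;

/* hex display: the A0 byte becomes visible */
proc print data=cmp noobs;
format var2 $hex60.;
run;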

PROC COMPARE output (excerpt):

__________________________________________________________
           ||  Base Value           Compare Value
       Obs ||  Var2                  Var2
 ________  ||  ___________________+  ___________________+
           ||
        1  ||  0                     0               
__________________________________________________________

First PROC PRINT output:

_TYPE_     _OBS_    Var2

BASE         1      0
COMPARE      1      0                            ,
DIF          1      ...............X.............X

Second PROC PRINT output (in the COMPARE record, byte 16 is A0 and byte 30 is 2C; the DIF record marks both positions with 58, i.e. "X"):

_TYPE_     _OBS_    Var2

BASE         1      302020202020202020202020202020202020202020202020202020202020
COMPARE      1      302020202020202020202020202020A0202020202020202020202020202C
DIF          1      2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E582E2E2E2E2E2E2E2E2E2E2E2E2E58
Contributor
Posts: 53

Thanks for the explanation! But is there a function like STRIP that I could use to make both values compare as equal, irrespective of special characters?

 

Thanks,

Archana

Trusted Advisor
Posts: 1,115

There are options in PROC COMPARE to judge numeric values as "equal" if they are sufficiently close, but I'm not aware of a similar option for character values. So, to get a "clean" PROC COMPARE output you'll have to supply clean input datasets (which is a good idea anyway). For data cleaning (e.g. in a data step) you can use any functions, statements, etc. as appropriate.

 

But first of all, you should investigate what exactly the differences between the VAR2 values are (see my code example) and what causes them. Maybe something in your data generation process causes not only these annoying differences, but also more serious issues. And differences in variable values that should actually be equal can already be a serious issue, e.g. in match-merging.
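
As a rough illustration of cleaning the inputs before comparing, the sketch below replaces 'A0'x with an ordinary blank in every character variable of two datasets and then runs PROC COMPARE on the cleaned copies. The dataset names BASE_DS, COMP_DS, BASE_CLEAN, COMP_CLEAN and the macro CLEAN_NBSP are made up for this example, and it assumes that 'A0'x is the only offending character, which should be checked first with a $HEX display.

```sas
/* Hypothetical sample data: COMP_DS carries an 'A0'x after the 0 */
data base_ds;
  length Var2 $30;
  Var2 = '0';
run;

data comp_ds;
  length Var2 $30;
  Var2 = '0' || 'A0'x;
run;

/* Hypothetical helper: copy &in to &out with 'A0'x turned into blanks */
%macro clean_nbsp(in=, out=);
  data &out;
    set &in;
    array chr_vars {*} _character_;   /* all character variables */
    do i_ = 1 to dim(chr_vars);
      chr_vars{i_} = translate(chr_vars{i_}, ' ', 'A0'x);
    end;
    drop i_;
  run;
%mend clean_nbsp;

%clean_nbsp(in=base_ds, out=base_clean)
%clean_nbsp(in=comp_ds, out=comp_clean)

/* compare the cleaned copies instead of the raw datasets */
proc compare base=base_clean compare=comp_clean;
run;
```

Cleaning into new datasets rather than overwriting the originals keeps the raw values available for tracking down where the character came from.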

Contributor
Posts: 53

Thanks for the quick response.

Both var1 and var2 are 0 for my dataset.

 

Where is the special character getting introduced?

Var1                             Var2

5(3.8)                           30A020202020202020202020202020202020202020202020202020202020
5(3.8)                           302020202020202020202020202020202020202020202020202020202020
..............................   2E582E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E

2(2.1)                           30A020202020202020202020202020202020202020202020202020202020
2(2.1)                           302020202020202020202020202020202020202020202020202020202020

 

Thanks,

Archana

Solution
‎04-21-2016 03:09 PM
Trusted Advisor
Posts: 1,115


ArchanaSudhir wrote:

Both var1 and var2 are 0 for my dataset.


Var1 does not look like 0 to me.

 


ArchanaSudhir wrote:

Where is the special character getting introduced?

 


Interesting, so it really is a protected (non-breaking) blank ('A0'x). It has probably not been introduced within SAS. I rather suspect it has been read from raw data.

 

You can easily get rid of such protected blanks by changing them into ordinary blanks:

var2=translate(var2,' ','A0'x);

(in a data step).
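
A minimal sketch of that data step, with a check that the character is really gone afterwards. The dataset names COMP_DS and COMP_FIXED and the counters N_BEFORE and N_AFTER are made up for this example; the sample value mimics the hex dump above ('0' followed by 'A0'x).

```sas
/* Hypothetical input: var2 holds '0' followed by a protected blank */
data comp_ds;
  length var2 $30;
  var2 = '0' || 'A0'x;
run;

data comp_fixed;
  set comp_ds;
  n_before = countc(var2, 'A0'x);          /* 'A0'x bytes before the fix        */
  var2     = translate(var2, ' ', 'A0'x);  /* protected blank -> ordinary blank */
  n_after  = countc(var2, 'A0'x);          /* 'A0'x bytes left after the fix    */
  put n_before= n_after= var2= $hex60.;    /* write the check to the log        */
run;
```

Note the argument order of TRANSLATE: the replacement character comes second and the character to be replaced comes third. COMPRESS(var2, 'A0'x) would delete the character instead of replacing it, which shifts any following text to the left.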

Contributor
Posts: 53

Hi,

I used TRANSLATE, but the A0 is still not being replaced and I am getting the same difference. :(

Trusted Advisor
Posts: 1,115

Please show your code. Then, I'm sure, we'll find the issue quickly.

Contributor
Posts: 53

Truly appreciate your help! It worked. :)

☑ This topic is SOLVED.
