10-22-2014 05:30 PM
First of all, thanks for reading my question.
Here is what I want to do:
I want to know, for each variable, the number of observations that differ between two identically structured datasets.
I want to create an Excel report showing each variable name and its number of differences.
Here is what I’m doing:
I used PROC COMPARE to get the number of differences (Ndif). I created a SAS dataset from its output using ODS and extracted the numbers from that dataset in a DATA step.
Here is the problem:
PROC COMPARE is displaying the number of differences (Ndif) in scientific notation.
How can I make PROC COMPARE display the full number?
NOTE: Both datasets have 8 million observations and 500 variables.
I tried using a MERGE statement, but it wasn't efficient.
Here is some sample code:
ods listing close ;
ods output compsum=outputsum;
proc compare base=work.dsn1 compare=work.dsn2
maxprint=(100,500) novalues listequalvar;
run; /* end the step so PROC COMPARE executes before ODS OUTPUT is closed */
ods output close ;
ods listing ;
length Field_Name $40;
if (index(batch, 'NUM') gt 0 or index(batch, 'CHAR') gt 0) and type ne 'h' then
if countw(batch, ' ') = 3 or countw(batch, ' ') = 4 then &curr_count. = 0;
if scan(batch, 2) eq 'NUM' and countw(batch, ' ') = 7 then &curr_count. = scanq(batch, 5) ;
if scan(batch, 2) eq 'NUM' and countw(batch, ' ') = 6 then &curr_count. = scanq(batch, 4);
if scan(batch, 2) eq 'CHAR' and countw(batch, ' ') = 6 then &curr_count. = scanq(batch, 5);
if scan(batch, 2) eq 'CHAR' and countw(batch, ' ') = 5 then &curr_count. = scanq(batch, 4);
keep Field_Name &curr_count.;
Thanks in advance
10-22-2014 06:51 PM
In the DATA step you can change the format of the variable. NDIF most likely has a format such as BEST8. at the moment. Try:
format ndif f16.0;
However, you may have enough differences that NDIF actually exceeds the precision SAS can represent.
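For what it's worth, here is a minimal sketch of the format effect. It assumes the count is held in a numeric variable; the name ndif and the value are made up purely for illustration. A format narrower than the number of digits can push SAS into scientific notation, while a wider one leaves room for every digit in the log:

```sas
/* Illustrative only: the variable name ndif and its value are assumptions. */
data _null_;
  ndif = 7654321;                   /* a 7-digit count                        */
  put 'BEST6.   : ' ndif best6.;    /* width 6 is too narrow for 7 digits     */
  put 'F16.0    : ' ndif f16.0;     /* 16 columns, no decimal places          */
  put 'COMMA16. : ' ndif comma16.;  /* same width, with thousands separators  */
run;
```

Note that a format only changes how a numeric value is displayed. If the number was already read from text that was written in scientific notation, widening the format cannot bring the dropped digits back.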