Using PROC COMPARE to show only the inconsistencies between two data sets?

Contributor
Posts: 20

Hello All, 

I want to check consistency between two data sets. I receive data sets every month and want to compare the latest one with an earlier one on the following points:

1. Check that the attributes 'gender' and 'DOB' exist for a user and are consistent in both months.

2. When comparing 2013M12 data with 2014M12 data (one year later), check the age field: if a user's age is 52 in 2013M12, the same user should have age 53 in 2014M12.

I usually run PROC COMPARE with the NOVALUES, NOSUMMARY, ALLSTATS and BRIEFSUMMARY options. It prints up to 50 pages of results, but I cannot see whether a particular attribute is inconsistent for a user between periods.

I want only the results for users that have an inconsistency between periods in some attribute. For example, if user '456987' has gender 'M' in period 2013M12 and gender 'F' in period 2014M01, that user should show up in the result.

Thanks in advance

Pra

Contributor
Posts: 37

Re: Proc compare to see the inconsistency only between two data sets ?

Hi,

I would suggest trying DATA step programming rather than PROC COMPARE. With options such as NOVALUES and BRIEFSUMMARY, PROC COMPARE gives you only a summary of the differences. If you need a detailed report comparing each variable row by row, write a DATA step that merges the two data sets. You can use arrays if you have many fields to compare.
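
A minimal sketch of the merge approach described above. All names here are assumptions for illustration: the data sets prev and curr, the key user_id, and the variables gender, dob and age are hypothetical, and each user is assumed to appear at most once per data set. It writes one row per inconsistent attribute and ignores users found in only one period.

```sas
/* Hypothetical names: prev, curr, user_id, gender, dob, age */
proc sort data=prev out=prev_s; by user_id; run;
proc sort data=curr out=curr_s; by user_id; run;

data mismatches;
    /* Rename the compared variables so both periods sit side by side */
    merge prev_s (in=in_prev keep=user_id gender dob age
                  rename=(gender=gender_prev dob=dob_prev age=age_prev))
          curr_s (in=in_curr keep=user_id gender dob age
                  rename=(gender=gender_curr dob=dob_curr age=age_curr));
    by user_id;

    /* Keep only users present in both periods */
    if in_prev and in_curr;

    length issue $8;
    if gender_prev ne gender_curr then do; issue = 'gender'; output; end;
    if dob_prev    ne dob_curr    then do; issue = 'dob';    output; end;

    /* Assumes the two periods are exactly one year apart */
    if age_curr ne age_prev + 1 then do; issue = 'age'; output; end;
run;
```

With many fields of the same type, the individual checks could be replaced by a loop over two parallel arrays (one for the previous values, one for the current values).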

Thanks

Karthik

Frequent Contributor
Posts: 115

Re: Proc compare to see the inconsistency only between two data sets ?

Here is a data set comparison using PROC SQL, if you want to try it:

  

*COMPARE TWO DATA SETS. KEEP ONLY OBSERVATIONS THAT ARE NOT
 IN BOTH DATA SETS;
proc sql noprint;
   create table datasetnew as
   /* INTERSECT is evaluated first, then UNION and EXCEPT left to right, */
   /* so this is (1 union 2) except (2 intersect 1)                      */
   select * from dataset_1 union select * from dataset_2
   except
   select * from dataset_2 intersect select * from dataset_1;
quit;
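
Note that the set-operator query compares whole rows, so it also returns every row that exists in only one of the data sets. A possible variation, if only users whose gender or DOB changed are wanted, is to join the two data sets on a key. This is a sketch under assumed names: prev, curr, user_id, gender and dob are hypothetical.

```sas
/* Hypothetical names: prev, curr, user_id, gender, dob */
proc sql;
   create table gender_dob_diff as
   select p.user_id,
          p.gender as gender_prev,
          c.gender as gender_curr,
          p.dob    as dob_prev,
          c.dob    as dob_curr
   from prev as p
        inner join curr as c
        on p.user_id = c.user_id      /* users present in both periods */
   where p.gender ne c.gender
      or p.dob    ne c.dob;           /* keep only the inconsistent ones */
quit;
```

The inner join drops users that appear in only one period, so extra observations in the later data set do not show up as differences.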

Regards,

Naveen Srinivasan

L&T Infotech

Contributor
Posts: 20

Re: Proc compare to see the inconsistency only between two data sets ?

Posted in reply to naveen_srini

31-03-2015 13-36-28.png

Hello Naveen,

When I tried this I got the error shown above. I understand that there is a mismatch in numeric format; however, I was able to combine the two data sets into a single one and didn't face any format issues.
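
One possible explanation, offered as an assumption since the screenshot is not reproduced here: PROC SQL set operators such as UNION line up columns by position unless CORRESPONDING (CORR) is specified, while a DATA step SET statement lines up variables by name. A sketch for listing same-named columns whose type or position differs, assuming both data sets are in the WORK library under the names used in the query above:

```sas
/* Assumes WORK.DATASET_1 and WORK.DATASET_2; adjust libname/memname as needed */
proc sql;
   select a.name,
          a.varnum as position_in_1,
          b.varnum as position_in_2,
          a.type   as type_in_1,
          b.type   as type_in_2
   from dictionary.columns as a
        inner join dictionary.columns as b
        on upcase(a.name) = upcase(b.name)
   where a.libname = 'WORK' and a.memname = 'DATASET_1'
     and b.libname = 'WORK' and b.memname = 'DATASET_2'
     and (a.type ne b.type or a.varnum ne b.varnum);
quit;
```

If only the positions differ, writing "union corr" and "intersect corr" in the query may be enough; if the types differ, the variable has to be converted in one of the data sets first.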

Also, the number of observations in the two data sets is not the same; the second (latest) data set will have more observations.

When I tried with another pair of data sets, I was able to create a new data set:

31-03-2015 14-05-00.png

For this one, I can see there is an inconsistency. But since my data sets run into thousands of records, how do I identify only the records with inconsistencies?

thanks
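
For reference, one way PROC COMPARE itself could be limited to the inconsistent records is to match on an ID variable and write only the unequal observations to an output data set. This is a sketch under assumed names: prev, curr, user_id, gender and dob are hypothetical.

```sas
/* Hypothetical names: prev, curr, user_id, gender, dob */
proc sort data=prev out=prev_s; by user_id; run;
proc sort data=curr out=curr_s; by user_id; run;

proc compare base=prev_s compare=curr_s
             out=diff_rows   /* data set that receives the differing rows  */
             outnoequal      /* skip observations whose values are all equal */
             outbase outcomp /* write both the base and the compare row     */
             noprint;        /* no printed report                           */
   id user_id;               /* match observations by user, not by position */
   var gender dob;           /* compare only these attributes               */
run;

proc print data=diff_rows;
run;
```

In diff_rows, the _TYPE_ variable marks whether a row comes from the base or the comparison data set, so each inconsistent user appears as a BASE/COMPARE pair. Users that exist only in the later data set have no match on user_id and are not written.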
