Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Proc Compare

Reply
Contributor
Posts: 42

Proc Compare

Can someone please give me some advice?

I have two datasets that contain unique records based on a unique key (record key). Data from one may not be in the other or the unique key may be there but only some value may have changed. I have three values to compare (reckey, caseload, and casedesc). I need to find all observations from the base dataset that no longer exist in the compare dataset as well as those new in the compare dataset that may not exist in the base dataset. I also need to find all observations where the observation exists in both datasets but some combination of the three values has changed.

I have experimented with proc compare but can't get it to give me what I want. I'd like a simple list of all observations that are either different or non-matched in either dataset. I don't need the summaries, just the actual observations.

I am hoping to use this as a learning experience with proc compare and I'm hoping someone can point me in the right direction.

Thank you.

Super User
Super User
Posts: 7,042

Re: Proc Compare

Proc compare is not the best tool for indentifying miss matches.  It is useful for looking for changes in observations that are in both datasets.

If you just have a couple of variables to compare then just do it yourself in a data step.

data check;

  merge one (in=in1 rename=(caseload=caseload1 casedesc=casedesc1))

        two (in=in2 rename=(caseload=caseload2 casedesc=casedesc2))

  ;

  by reckey ;

  length match $50 ;

  if not in1 then match='IN ONE ONLY';

  else if not in2 then match='IN TWO ONLY';

  else do;

    if caseload1 ne caseload2 then match='CASELOAD MISMATCH';

    if casedesc1 ne casedesc2 then match=catx(', ',match,'CASEDESC MISMATCH));

  end;

run;

Ask a Question
Discussion stats
  • 1 reply
  • 217 views
  • 0 likes
  • 2 in conversation