Hi,
I'm merging two dataset, previous and current with a key in my by statement. I'd like to know which ones are new and which are changed from the previous dataset. I can get the new records from the IN= option, but how can I find which are changed?
key=compress(usubjid||visit||visdat||vissttim||qstestcd||qsorres);
data incurr(drop=date siteidp usubjidp catp testcdp resultp visitp visdatep visstp)
inprev(drop=key date siteidc usubjidc catc testcdc resultc visitc visdatec visstc)
inboth;
merge prevdata(in=inprev) currdata(in=incurr);
by key;
if incurr and ^ inprev then output incurr;
else if inprev and ^ incurr then output inprev;
if inprev and incurr then output inboth;
run;
It's extremely difficult to do this in the same DATA step that merges. But it's easier to do after the fact.
Extract a subset of PREVDATA, taking just those KEY values that appear in INBOTH. It looks like you know how to use IN= variables to subset, so that step should be relatively easy. (Just make sure that KEY is the only variable that you bring in from INBOTH when you subset.) Then run a PROC COMPARE, to compare that subset with INBOTH.
Hi,
I first did the subset of prevdata with inboth and it gave me the same amount of obs that was in prevdata. Then I did the proc compare and it didn't show me which were changed from prevdata. What am I doing wrong?
data compare;
merge inprev(in=prev) inboth(in=both);
by key;
if prev;
run;
sorts by key...
proc compare base=compare compare=inboth;
run;
Value Comparison Results for Variables
__________________________________________________________
|| Base Value Compare Value
Obs || key key
________ || ___________________+ ___________________+
||
1 || 01001001BASELINE(V3W 01001001BASELINE(V3W
2 || 01001001BASELINE(V3W 01001001BASELINE(V3W
3 || 01001001BASELINE(V3W 01001001BASELINE(V3W
4 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
5 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
6 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
In merging, the values from INBOTH are always being used for all variables. They overwrite any values from INPREV. To overcome that, the MERGE statement should look like this:
merge inprev(in=prev) inboth(in=both keep=key);
*Create an indicator of origin of 'key';
data changes;
merge prevdata(in=inprev) currdata(in=incurr);
by key;
length change_flag $ 24;
if incurr and inprev then change_flag = "Both Files --- No Change";
else if inprev then change_flag = "Previous File Only";
else if incurr then change_flag = "Current File Only";
run;
*Find frequencies of origins of 'key';
proc freq data = changes;
tables change_flag;
run;
Hi,
Are you trying to update your master table with new values from the transaction table. Then UPDATE statement maybe the best way and look my example code :
data one;
infile datalines ;
input val1 $ val2 $;
valx=val2;
datalines;
A AA
B BB
C CC
D DD
E EE
;
RUN;
data two;
infile datalines ;
input val1 $ val2 $;
datalines;
A AA1
B BB
C CC
D DD1
E EE
;
RUN;
proc sort data=one;
by val1;
proc sort data=two;
by val1;
data compare_one_two;
update one two;
by val1;
IF val2 ^=valx then status="Updated";
run;
Hope this helps
Thanks,
Surya
The poster was asking about which key (merging) variables had changed. In your example, no key variables have changed, only the values associated with them.
I wrote a paper at the end of the previous century
with a macro to show the differences.
SUGI 23: %COMPARWS: Compare with Summary: A Macro Using Proc Compare
www2.sas.com/proceedings/sugi23/Posters/p170.pdf
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.