Hi,
I'm merging two dataset, previous and current with a key in my by statement. I'd like to know which ones are new and which are changed from the previous dataset. I can get the new records from the IN= option, but how can I find which are changed?
key=compress(usubjid||visit||visdat||vissttim||qstestcd||qsorres);
data incurr(drop=date siteidp usubjidp catp testcdp resultp visitp visdatep visstp)
inprev(drop=key date siteidc usubjidc catc testcdc resultc visitc visdatec visstc)
inboth;
merge prevdata(in=inprev) currdata(in=incurr);
by key;
if incurr and ^ inprev then output incurr;
else if inprev and ^ incurr then output inprev;
if inprev and incurr then output inboth;
run;
It's extremely difficult to do this in the same DATA step that merges. But it's easier to do after the fact.
Extract a subset of PREVDATA, taking just those KEY values that appear in INBOTH. It looks like you know how to use IN= variables to subset, so that step should be relatively easy. (Just make sure that KEY is the only variable that you bring in from INBOTH when you subset.) Then run a PROC COMPARE, to compare that subset with INBOTH.
Hi,
I first did the subset of prevdata with inboth and it gave me the same amount of obs that was in prevdata. Then I did the proc compare and it didn't show me which were changed from prevdata. What am I doing wrong?
data compare;
merge inprev(in=prev) inboth(in=both);
by key;
if prev;
run;
sorts by key...
proc compare base=compare compare=inboth;
run;
Value Comparison Results for Variables
__________________________________________________________
|| Base Value Compare Value
Obs || key key
________ || ___________________+ ___________________+
||
1 || 01001001BASELINE(V3W 01001001BASELINE(V3W
2 || 01001001BASELINE(V3W 01001001BASELINE(V3W
3 || 01001001BASELINE(V3W 01001001BASELINE(V3W
4 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
5 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
6 || 01001001RUN-IN(V2W-3 01001001BASELINE(V3W
In merging, the values from INBOTH are always being used for all variables. They overwrite any values from INPREV. To overcome that, the MERGE statement should look like this:
merge inprev(in=prev) inboth(in=both keep=key);
*Create an indicator of origin of 'key';
data changes;
merge prevdata(in=inprev) currdata(in=incurr);
by key;
length change_flag $ 24;
if incurr and inprev then change_flag = "Both Files --- No Change";
else if inprev then change_flag = "Previous File Only";
else if incurr then change_flag = "Current File Only";
run;
*Find frequencies of origins of 'key';
proc freq data = changes;
tables change_flag;
run;
Hi,
Are you trying to update your master table with new values from the transaction table. Then UPDATE statement maybe the best way and look my example code :
data one;
infile datalines ;
input val1 $ val2 $;
valx=val2;
datalines;
A AA
B BB
C CC
D DD
E EE
;
RUN;
data two;
infile datalines ;
input val1 $ val2 $;
datalines;
A AA1
B BB
C CC
D DD1
E EE
;
RUN;
proc sort data=one;
by val1;
proc sort data=two;
by val1;
data compare_one_two;
update one two;
by val1;
IF val2 ^=valx then status="Updated";
run;
Hope this helps
Thanks,
Surya
The poster was asking about which key (merging) variables had changed. In your example, no key variables have changed, only the values associated with them.
I wrote a paper at the end of the previous century
with a macro to show the differences.
SUGI 23: %COMPARWS: Compare with Summary: A Macro Using Proc Compare
www2.sas.com/proceedings/sugi23/Posters/p170.pdf
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.