Hi:
I think you are -almost- on the right track, but you should not need to create dataset AA. The merge itself can select the observations you want. However, the construct
[pre]
if A and B ;
[/pre]
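in your MERGE step will not work as you expect unless you also use the IN= data set option in the MERGE statement. Inside an IF statement, A and B are treated as variable names, not data set names, so the test cannot tell which data set contributed to the observation. Try something like the program shown below. Also remember that when you sort by CITY and PROVINCE, your city of Gatineau will sort BEFORE your city of Ottawa.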
cynthia
[pre]
** read in some test data for A;
data A;
infile datalines dlm=' ' dsd;
input userid $ city $ province $ date_registration : anydtdte.;
format date_registration mmddyy10.;
return;
datalines;
lyncgar Ottawa Ont "sept 12, 2008"
chalmyl Ottawa Ont "aug 23, 2008"
charman Ottawa Ont "july 11, 2008"
camejef Ottawa Qc "oct 01, 2008"
falardn Ottawa Qc "july 14, 2008"
renadia Gatineau Qc "sept 25,2008"
lavonat Gatineau Qc "jan 12,2008"
philisa Gatineau Qc "jan 08,2008"
;
run;
** read in some test data for B;
Data B;
infile datalines;
input city $ province $ nunits;
return;
datalines;
Ottawa Ont 2
Ottawa Qc 1
Gatineau Qc 2
;
run;
proc sort data=b;
by city province;
run;
proc sort data=a;
by city province descending date_registration;
run;
ods listing;
proc print data=a;
title 'what is in dataset A before merge';
run;
proc print data=b;
title 'what is in dataset B before merge';
run;
** The goal in the merge is to use an internal CNTR variable. If there is a match;
** between dataset A and dataset B, based on CITY and PROVINCE, then increment the CNTR variable.;
** As long as the CNTR variable is less than or equal to the NUNITS variable, then;
** output an observation. This also requires that the CNTR variable will get reset to 0 for every;
** new FIRST.PROVINCE observation.;
** As long as the A dataset is sorted by CITY, PROVINCE and descending DATE_REGISTRATION, then;
** the most recent observations will be output, based on the value of NUNITS.;
data cntrOK cntr_notOK aonly bonly;
  merge b(in=fromb)
        a(in=froma);
  by city province;
  retain cntr;
  if first.province then cntr = 0;
  if fromb and froma then do;
    cntr + 1;
    if cntr le nunits then output cntrOK;
    else if cntr gt nunits then output cntr_notOK;
  end;
  else if fromb and not froma then output bonly;
  else if froma and not fromb then output aonly;
run;
ods listing;
proc print data=cntrOK;
title 'cntrOK data -- desired output';
run;
proc print data=cntr_notOK;
title 'cntr_notOK data -- observations where internal cntr was gt nunits';
run;
** these may be empty datasets;
proc print data=aonly;
title 'what is in dataset AONLY';
run;
proc print data=bonly;
title 'what is in dataset BONLY';
run;
[/pre]
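P.S. If all you need is the matching rows, here is a minimal sketch of the IN= idea on its own. It assumes A and B are already sorted by CITY and PROVINCE, as in the program above. BOTH, INA and INB are placeholder names chosen only for illustration. IN= creates temporary 0/1 flag variables that are not written to the output data set.
[pre]
** minimal sketch: keep only CITY/PROVINCE combinations found in both data sets;
** BOTH, INA and INB are illustrative names, not part of the original program;
data both;
  merge a(in=ina)
        b(in=inb);
  by city province;
  ** subsetting IF: continue only when both data sets contributed;
  if ina and inb;
run;
[/pre]
Without the IN= options, the statement if A and B; would refer to two new, uninitialized variables named A and B (neither data set here has variables with those names), so the subsetting IF would be false for every row and the output data set would be empty.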