Help using Base SAS procedures

merge datasets

Reply
Occasional Contributor
Posts: 8

merge datasets

Hi guys,

I am pretty much new in programming in SAS. Here is my question...

Is it possible to merge 2 datasets and output only a certain amount of records if the "by" variables are in dataset A and in dataset B?

Example:

In dataset A, I have "UserID", "City" , "Prov" and "Date_Registration".
In dataset B, I have "City", "Prov" and "NUNITS"

I want to output "UserID", "City" and "Prov" from dataset A, but only the number of records last registered requested (NUNITS) in dataset B.

So if I sort dataset A by descending "Date_Registration", I want to output the number of records listed at the top.

Thanks a lot for your help!!
Trusted Advisor
Posts: 2,113

Re: merge datasets

The short answer is yes. However, the "word picture" that you have provided is ambiguous. Please provide some sample data for input and the output that you want.

Doc Muhlbaier
Duke
N/A
Posts: 0

Re: merge datasets

chalmyl,

1) create flag based on last registered date on dataset B.
2) When you merge the datasets, output records that meets step 1).

GL.
Occasional Contributor
Posts: 8

Re: merge datasets

I haven't found a way to attached some data, but here is what I am looking for...

Dataset A:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
charman Ottawa Ont july11, 2008
camejef Ottawa Qc oct01, 2008
falardn Ottawa Qc july14, 2008


Dataset B;

CITY PROVINCE NUNITS
Ottawa Ont 2
Ottawa Qc 1


Output:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
camejef Ottawa Qc oct01, 2008



I tried the following but it only output the 2 first records...

proc sort data = A;
by CITY PROVINCE descending DATE_REGISTRATION;
run;

proc sort data = B;
by CITY PROVINCE;
run;

data test;
merge A B;
by CITY PROVINCE;
retain initial_NUNITS;

if first.CITY then do;
if first.PROVINCE then do;
initial_NUNITS = NUNITS;
i = 0;
end;
end;

if A and B then do;
i + 1;
if i <= initial_NUNITS then output;
end;

run;
Occasional Contributor
Posts: 8

Re: merge datasets

I guess my inputs / output are not clear enough. Sorry... English is not my primary language...


Dataset A:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
charman Ottawa Ont july11, 2008
camejef Ottawa Qc oct01, 2008
falardn Ottawa Qc july14, 2008
renadia Gatineau Qc sept25,2008
lavonat Gatineau Qc jan12,2008
philisa Gatineau Qc jan08,2008


Dataset B;

CITY PROVINCE NUNITS
Ottawa Ont 2
Ottawa Qc 1
Gatineau Qc 2


Output:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
camejef Ottawa Qc oct01, 2008
renadia Gatineau Qc sept25,2008
lavonat Gatineau Qc jan12,2008
N/A
Posts: 0

Re: merge datasets

try this. added data step to keep the latest registration before merge.

proc sort data = A;
by CITY PROVINCE descending DATE_REGISTRATION;
run;

data aa;
set a;
by CITY PROVINCE descending DATE_REGISTRATION;
if first.province;
run;

proc sort data = B;
by CITY PROVINCE;
run;

data test;
merge AA B;
by CITY PROVINCE;
if A and B ;
.
. Message was edited by: sastech
SAS Super FREQ
Posts: 8,743

Re: merge datasets

Hi:
I think you are -almost- on the right track. But, you should not need to make dataset AA... the merge can take care of getting you the observations you want. However, the construct
[pre]
if A and B ;
[/pre]

in your MERGE step will not work as you expect unless you also use the IN= data set option in the MERGE statement. Something like the program shown below. Also remember, that when you sort by CITY and PROVINCE that your city of Gatineau will sort BEFORE your city of Ottawa.

cynthia

[pre]
** read in some test data for A;
data A;
infile datalines dlm=' ' dsd;
input userid $ city $ province $ date_registration : anydtdte.;
format date_registration mmddyy10.;
return;
datalines;
lyncgar Ottawa Ont "sept 12, 2008"
chalmyl Ottawa Ont "aug 23, 2008"
charman Ottawa Ont "july 11, 2008"
camejef Ottawa Qc "oct 01, 2008"
falardn Ottawa Qc "july 14, 2008"
renadia Gatineau Qc "sept 25,2008"
lavonat Gatineau Qc "jan 12,2008"
philisa Gatineau Qc "jan 08,2008"
;
run;

** read in some test data for B;
Data B;
infile datalines;
input city $ province $ nunits;
return;
datalines;
Ottawa Ont 2
Ottawa Qc 1
Gatineau Qc 2
;
run;

proc sort data=b;
by city province;
run;

proc sort data=a;
by city province descending date_registration;
run;

ods listing;
proc print data=a;
title 'what is in dataset A before merge';
run;

proc print data=b;
title 'what is in dataset B before merge';
run;

** The goal in the merge is to use an internal CNTR variable. If there is a match;
** between dataset A and dataset B, based on CITY and PROVINCE, then increment the CNTR variable.;
** As long as the CNTR variable is less than or equal to the NUNITS variable, then;
** output an observation. This also requires that the CNTR variable will get reset to 0 for every;
** new FIRST.PROVINCE observation.;
** As long as the A dataset is sorted by CITY, PROVINCE and descending DATA_REGISTRATION, then;
** the most recent observations will be output, based on the value of NUNITS.;
data cntrOK cntr_notOK aonly bonly;
merge b(in=fromb)
a(in=froma);
by city province;
retain cntr;
if first.province then cntr = 0;
if fromb and froma then do;
cntr + 1;
if cntr le nunits then output cntrOK;
else if cntr gt nunits then output cntr_notOK;
end;
else if fromb and not froma then output bonly;
else if froma and not fromb then output aonly;
run;

ods listing;
proc print data=cntrOK;
title 'cntrOK data -- desired output';
run;

proc print data=cntr_notOK;
title 'cntr_notOK data -- observations where internal cntr was gt nunits';
run;

** these may be empty dataset;
proc print data=aonly;
title 'what is in dataset AONLY';
run;

proc print data=bonly;
title 'what is in dataset BONLY';
run;
[/pre]
Occasional Contributor
Posts: 8

Re: merge datasets

It's working... Thanks!!
Ask a Question
Discussion stats
  • 7 replies
  • 181 views
  • 0 likes
  • 4 in conversation