BookmarkSubscribeRSS Feed
chalmyl
Calcite | Level 5
Hi guys,

I am pretty much new in programming in SAS. Here is my question...

Is it possible to merge 2 datasets and output only a certain amount of records if the "by" variables are in dataset A and in dataset B?

Example:

In dataset A, I have "UserID", "City" , "Prov" and "Date_Registration".
In dataset B, I have "City", "Prov" and "NUNITS"

I want to output "UserID", "City" and "Prov" from dataset A, but only the number of records last registered requested (NUNITS) in dataset B.

So if I sort dataset A by descending "Date_Registration", I want to output the number of records listed at the top.

Thanks a lot for your help!!
7 REPLIES 7
Doc_Duke
Rhodochrosite | Level 12
The short answer is yes. However, the "word picture" that you have provided is ambiguous. Please provide some sample data for input and the output that you want.

Doc Muhlbaier
Duke
deleted_user
Not applicable
chalmyl,

1) create flag based on last registered date on dataset B.
2) When you merge the datasets, output records that meets step 1).

GL.
chalmyl
Calcite | Level 5
I haven't found a way to attached some data, but here is what I am looking for...

Dataset A:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
charman Ottawa Ont july11, 2008
camejef Ottawa Qc oct01, 2008
falardn Ottawa Qc july14, 2008


Dataset B;

CITY PROVINCE NUNITS
Ottawa Ont 2
Ottawa Qc 1


Output:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
camejef Ottawa Qc oct01, 2008



I tried the following but it only output the 2 first records...

proc sort data = A;
by CITY PROVINCE descending DATE_REGISTRATION;
run;

proc sort data = B;
by CITY PROVINCE;
run;

data test;
merge A B;
by CITY PROVINCE;
retain initial_NUNITS;

if first.CITY then do;
if first.PROVINCE then do;
initial_NUNITS = NUNITS;
i = 0;
end;
end;

if A and B then do;
i + 1;
if i <= initial_NUNITS then output;
end;

run;
chalmyl
Calcite | Level 5
I guess my inputs / output are not clear enough. Sorry... English is not my primary language...


Dataset A:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
charman Ottawa Ont july11, 2008
camejef Ottawa Qc oct01, 2008
falardn Ottawa Qc july14, 2008
renadia Gatineau Qc sept25,2008
lavonat Gatineau Qc jan12,2008
philisa Gatineau Qc jan08,2008


Dataset B;

CITY PROVINCE NUNITS
Ottawa Ont 2
Ottawa Qc 1
Gatineau Qc 2


Output:

USERID CITY PROVINCE DATE_REGISTRATION
lyncgar Ottawa Ont sept12, 2008
chalmyl Ottawa Ont aug23, 2008
camejef Ottawa Qc oct01, 2008
renadia Gatineau Qc sept25,2008
lavonat Gatineau Qc jan12,2008
deleted_user
Not applicable
try this. added data step to keep the latest registration before merge.

proc sort data = A;
by CITY PROVINCE descending DATE_REGISTRATION;
run;

data aa;
set a;
by CITY PROVINCE descending DATE_REGISTRATION;
if first.province;
run;

proc sort data = B;
by CITY PROVINCE;
run;

data test;
merge AA B;
by CITY PROVINCE;
if A and B ;
.
. Message was edited by: sastech
Cynthia_sas
SAS Super FREQ
Hi:
I think you are -almost- on the right track. But, you should not need to make dataset AA... the merge can take care of getting you the observations you want. However, the construct
[pre]
if A and B ;
[/pre]

in your MERGE step will not work as you expect unless you also use the IN= data set option in the MERGE statement. Something like the program shown below. Also remember, that when you sort by CITY and PROVINCE that your city of Gatineau will sort BEFORE your city of Ottawa.

cynthia

[pre]
** read in some test data for A;
data A;
infile datalines dlm=' ' dsd;
input userid $ city $ province $ date_registration : anydtdte.;
format date_registration mmddyy10.;
return;
datalines;
lyncgar Ottawa Ont "sept 12, 2008"
chalmyl Ottawa Ont "aug 23, 2008"
charman Ottawa Ont "july 11, 2008"
camejef Ottawa Qc "oct 01, 2008"
falardn Ottawa Qc "july 14, 2008"
renadia Gatineau Qc "sept 25,2008"
lavonat Gatineau Qc "jan 12,2008"
philisa Gatineau Qc "jan 08,2008"
;
run;

** read in some test data for B;
Data B;
infile datalines;
input city $ province $ nunits;
return;
datalines;
Ottawa Ont 2
Ottawa Qc 1
Gatineau Qc 2
;
run;

proc sort data=b;
by city province;
run;

proc sort data=a;
by city province descending date_registration;
run;

ods listing;
proc print data=a;
title 'what is in dataset A before merge';
run;

proc print data=b;
title 'what is in dataset B before merge';
run;

** The goal in the merge is to use an internal CNTR variable. If there is a match;
** between dataset A and dataset B, based on CITY and PROVINCE, then increment the CNTR variable.;
** As long as the CNTR variable is less than or equal to the NUNITS variable, then;
** output an observation. This also requires that the CNTR variable will get reset to 0 for every;
** new FIRST.PROVINCE observation.;
** As long as the A dataset is sorted by CITY, PROVINCE and descending DATA_REGISTRATION, then;
** the most recent observations will be output, based on the value of NUNITS.;
data cntrOK cntr_notOK aonly bonly;
merge b(in=fromb)
a(in=froma);
by city province;
retain cntr;
if first.province then cntr = 0;
if fromb and froma then do;
cntr + 1;
if cntr le nunits then output cntrOK;
else if cntr gt nunits then output cntr_notOK;
end;
else if fromb and not froma then output bonly;
else if froma and not fromb then output aonly;
run;

ods listing;
proc print data=cntrOK;
title 'cntrOK data -- desired output';
run;

proc print data=cntr_notOK;
title 'cntr_notOK data -- observations where internal cntr was gt nunits';
run;

** these may be empty dataset;
proc print data=aonly;
title 'what is in dataset AONLY';
run;

proc print data=bonly;
title 'what is in dataset BONLY';
run;
[/pre]
chalmyl
Calcite | Level 5
It's working... Thanks!!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1100 views
  • 0 likes
  • 4 in conversation