If I'm not mistaken, MERGE will, by default, make the total number of observations in the final data set the sum of the maximum number of observations in each BY group from either data set.
How do I make the total number of observations in the final data set the sum of the MINIMUM number of observations in each BY group from either data set?
I have this:
ID region
1 A
1 A
2 A
2 A
3 B
3 B
4 C
and
ID size
1 5
1 6
1 7
2 3
2 4
3 4
3 5
4 5
and I'd like:
ID region size
1 A 5
1 A 6
2 A 3
2 A 4
3 B 4
3 B 5
4 C 5
Thank you in advance.
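data minimum;
  /* Reset the IN= flags so each is 1 only when its data set supplies a new observation on this iteration */
  in1=0;
  in2=0;
  merge a (in=in1) b (in=in2);
  by id;
  /* Keep the row only when both data sets contributed a new observation */
  if in1 and in2;
run;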
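To try this on the sample data from the question, a sketch along these lines could be used. The data set names a and b match the MERGE statement above; the DATALINES layout and the PROC PRINT step are assumptions added for illustration. With this input, the result should be the seven rows requested in the question.

```sas
/* Sample "region" data from the question (assumed name: a) */
data a;
  input id region $;
  datalines;
1 A
1 A
2 A
2 A
3 B
3 B
4 C
;
run;

/* Sample "size" data from the question (assumed name: b) */
data b;
  input id size;
  datalines;
1 5
1 6
1 7
2 3
2 4
3 4
3 5
4 5
;
run;

/* Same merge as above: resetting the IN= flags before MERGE means
   a flag is 1 only when its data set supplies a new observation */
data minimum;
  in1 = 0;
  in2 = 0;
  merge a (in=in1) b (in=in2);
  by id;
  if in1 and in2;
run;

proc print data=minimum noobs;
run;
```

Both inputs are already in ID order here; real data would need a PROC SORT by id first.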
How do you know that the first row in the "size" data set is the right one?
If you know, just use
if first.ID;
This assumes that there are no duplicates of ID in the "region" data set.
Thank you. I should elaborate. My first data set is actually similar to this:
1 A
1 A
2 A
2 A
3 B
3 B
4 C
4 C
And I'd like to match by the number of observations in this dataset. However, if I use first.id, then it only matches by the very first observation.
Perhaps you should share some real data, because this logic makes little sense.
Merging data without proper keys (unique BY variables) leads to unpredictable results and inconsistent data.
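One way to give both data sets a proper key, if row order within each ID is really what should be matched, is to number the observations within each BY group and merge on both variables. This is only a sketch: the names seq, a_keyed, b_keyed, and minimum_keyed are hypothetical, and it assumes both a and b are already sorted by id.

```sas
/* Add a within-ID sequence number to each data set */
data a_keyed;
  set a;
  by id;
  if first.id then seq = 0;  /* restart the counter for each ID */
  seq + 1;                   /* sum statement: seq is retained across rows */
run;

data b_keyed;
  set b;
  by id;
  if first.id then seq = 0;
  seq + 1;
run;

/* id + seq is unique in both inputs, so this is a one-to-one merge;
   keeping only matched rows leaves the smaller count per ID */
data minimum_keyed;
  merge a_keyed (in=in1) b_keyed (in=in2);
  by id seq;
  if in1 and in2;
  drop seq;
run;
```

Because the key is explicit, the pairing no longer depends on how MERGE handles BY groups of unequal size.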