turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- How to Merge by Minimum Number of Observations in ...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:26 PM - edited 03-16-2016 04:42 PM

If I'm mot mistake, merge will, by default, make the total number of observations in the final data set set as the sum of the maximum number of observations in a BY-group from either data set.

How do I make the total number of observations in the final data set as the sum of the MINIMUM number of observations in the BY group from either data set?

I have this:

ID region

1 A

1 A

2 A

2 A

3 B

3 B

4 C

and

ID size

1 5

1 6

1 7

2 3

2 4

3 4

3 5

4 5

& I'd like:

ID region size

1 A 5

1 A 6

2 A 3

2 A 4

3 B 4

3 B 5

4 C 5

Thank you in advance.

Accepted Solutions

Solution

03-16-2016
05:42 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:56 PM

data minimum;

in1=0;

in2=0;

merge a (in=in1) b (in=in2);

by id;

if in1 and in2;

run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:31 PM

How do you know that the first row in the "size" data set is the right one?

If you know, just use

`if first.ID;`

Assumng that there are no duplicates of ID in the "region" data set.

Data never sleeps

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:41 PM

Thank you. I should elaborate. My first data set is actually similar to this:

1 A

1 A

2 A

2 A

3 B

3 B

4 C

4 C

And I'd like to match by the number of observations in this dataset. However, if I use first.id, then it only matches by the very first observation.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:55 PM

Perhaps you should share some real data, because this logic makes little sense.

Merging data without proper keys (unique by variables) leads to unpredictable results, and inconsistent data.

Data never sleeps

Solution

03-16-2016
05:42 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 04:56 PM

data minimum;

in1=0;

in2=0;

merge a (in=in1) b (in=in2);

by id;

if in1 and in2;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-16-2016 05:42 PM

Perfect. If modified to "if in1" this gives the exact results I needed. Thank you!