DATA Step, Macro, Functions and more

Many-to-Many merge situation with "multiple" key variables

Reply
N/A
Posts: 0

Many-to-Many merge situation with "multiple" key variables

I have another question, if I want to count the number of records in the group but the records have been defined with more than one key (by) variables (e.g. by id x instead of by id only), is there a way to adjust the following example code for a group of key variables to obtain that number (the count)?

Example (only one key variable, id):

data old;
input id x;
cards;
1 5
1 7
1 7
2 1
2 2
;

DATA new;
Set old;
by id;
RETAIN count;
IF first.ID THEN count=0;
count = count + 1;
IF last.ID THEN OUTPUT;
KEEP ID count;
RUN;

The final idea is to obtain the number of many-to-many cases within a merge. I want to identify how many cases (ids) are not unique (e.g. id=1 and x=7 in the previous example) in each data file (i.e. “MERGE statement has more than one data set with repeats of BY values.) to compute the total % of many-to-many cases in the merged file.

Thanks Message was edited by: Caramel
Super Contributor
Super Contributor
Posts: 3,174

Re: Many-to-Many merge situation with "multiple" key variables

Posted in reply to deleted_user
Possibly consider PROC FREQ with a CLASS ID X; and an OUTPUT statement but no VAR statement, and use the output variable _FREQ_ as your count of the unique ID and X combinations.

Scott Barry
SBBWorks, Inc.
Super User
Posts: 5,441

Re: Many-to-Many merge situation with "multiple" key variables

Posted in reply to deleted_user
Adding X to the BY-clause, and using X with first./last. will do the trick as well:


DATA new;
Set old;
by id x;
RETAIN count;
IF first.x THEN count=0;
count = count + 1;
IF last.x THEN OUTPUT;
KEEP ID x count;
RUN;

Regards,
Linus
Data never sleeps
Ask a Question
Discussion stats
  • 2 replies
  • 133 views
  • 0 likes
  • 3 in conversation