## MERGE

Solved
Super Contributor
Posts: 1,041

# MERGE

Hi,

I am pulling info from two datasets(sources) and i am suspecting some common records(repeats) by the encounters variable.

we need to count only once when there are repeats

Can we do the simple merge step like shown below?????

/*DOES THIS TESTER DATASET GIVE EXCLUSIVES TO EACH OF THE TWO DATASETS AND ALSO THE COMMONS
data tester;========getting 193
merge one two;
by encounters;
run;

When  i split the data i am getting like this...........total number here match the count we get above

data test1 test2 test3;

merge one(in=a) two(in=b);

by encounters;

if a and not b then output test1;========62

if b and not a then output test2;========48

if a and b then output test3;============83

run;

Thanks

Accepted Solutions
Solution
‎12-16-2013 05:02 PM
Super User
Posts: 8,069

## Re: MERGE

What is the question?

If you want to count unique values of ENCOUNTERS?  Overall and by each dataset?

data test1;

if eof then put n= a= b= ;

merge one(in=in1) two(in=in2) end=eof;

by encounters;

if first.encounters then do;

n+1;a+in1; b+in2;

end;

run;

All Replies
Super Contributor
Posts: 339

## Re: MERGE

If the encounter repeats in both data sources, there will still be a duplicate entry in your merged data.

data one;

input encounter fakedata;

datalines;

1 1

1 1

1 2

;

run;

data two;

input encounter fakedata2;

datalines;

1 3

1 4

1 5

1 6

;

run;

data tester;

merge one two;

by encounter;

run;

Data step merge does NOT support many to many appropriately and still will manage one-to-many without failing output.

If you notice the TESTER result, this is what happens: starting with the first record for a by variable in each dataset, merges 1:1 until either data set runs out of records for the by group at which points the last value is retained and output with all subsequent records from the other source that have the same by value. Hence this does by no mean kill duplicates from either set.

So in short, no it does not solve your issue. At least not how you've worded it.

Super Contributor
Posts: 1,041

## Re: MERGE

Hi,

I think I put it in wrong words.....

when I meant repeats ...........I mean the same encounter  is present in  source1 and also source 2 .

I mean if we set those two datasets we would have repeats by encounters

so I was asking if the simple merge I showed would bring up exclusive encounters to both as well as common to both???????

in total I was wondering if it would bring up the unique records from both the datasets???????

Thanks

Super User
Posts: 8,069

## Re: MERGE

If each individual file does not have duplicates then the example you showed will separate them into the three possible groups.

Solution
‎12-16-2013 05:02 PM
Super User
Posts: 8,069

## Re: MERGE

What is the question?

If you want to count unique values of ENCOUNTERS?  Overall and by each dataset?

data test1;

if eof then put n= a= b= ;

merge one(in=in1) two(in=in2) end=eof;

by encounters;

if first.encounters then do;

n+1;a+in1; b+in2;

end;

run;

Frequent Contributor
Posts: 120

## Re: MERGE

Hi,

If counting the number of duplicate row in a table can help you then please try it.

It will generate an output column that shows how many times each row occurs.

Proc SQL;

select * , count (*) as count

from DSN

group by var1, var2

having count (*) >1;

quit;

Regards

Uma Shanker Saini

🔒 This topic is solved and locked.