BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8

Hi,

I am pulling info from two datasets(sources) and i am suspecting some common records(repeats) by the encounters variable.

we need to count only once when there are repeats

Can we do the simple merge step like shown below?????

/*DOES THIS TESTER DATASET GIVE EXCLUSIVES TO EACH OF THE TWO DATASETS AND ALSO THE COMMONS
data tester;========getting 193
merge one two;
by encounters;
run;

When  i split the data i am getting like this...........total number here match the count we get above

data test1 test2 test3;

merge one(in=a) two(in=b);

by encounters;

if a and not b then output test1;========62

if b and not a then output test2;========48

if a and b then output test3;============83

run;

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

What is the question?

If you want to count unique values of ENCOUNTERS?  Overall and by each dataset?

data test1;

if eof then put n= a= b= ;

merge one(in=in1) two(in=in2) end=eof;

by encounters;

if first.encounters then do;

  n+1;a+in1; b+in2;

end;

run;

View solution in original post

5 REPLIES 5
Vince28_Statcan
Quartz | Level 8

If the encounter repeats in both data sources, there will still be a duplicate entry in your merged data.

data one;

     input encounter fakedata;

     datalines;

1 1

1 1

1 2

;

run;

data two;

     input encounter fakedata2;

     datalines;

1 3

1 4

1 5

1 6

;

run;

data tester;

     merge one two;

     by encounter;

run;

Data step merge does NOT support many to many appropriately and still will manage one-to-many without failing output.

If you notice the TESTER result, this is what happens: starting with the first record for a by variable in each dataset, merges 1:1 until either data set runs out of records for the by group at which points the last value is retained and output with all subsequent records from the other source that have the same by value. Hence this does by no mean kill duplicates from either set.

So in short, no it does not solve your issue. At least not how you've worded it.

robertrao
Quartz | Level 8

Hi,

I think I put it in wrong words.....

when I meant repeats ...........I mean the same encounter  is present in  source1 and also source 2 .

I mean if we set those two datasets we would have repeats by encounters

so I was asking if the simple merge I showed would bring up exclusive encounters to both as well as common to both???????

in total I was wondering if it would bring up the unique records from both the datasets???????

Thanks

Tom
Super User Tom
Super User

If each individual file does not have duplicates then the example you showed will separate them into the three possible groups.  

Tom
Super User Tom
Super User

What is the question?

If you want to count unique values of ENCOUNTERS?  Overall and by each dataset?

data test1;

if eof then put n= a= b= ;

merge one(in=in1) two(in=in2) end=eof;

by encounters;

if first.encounters then do;

  n+1;a+in1; b+in2;

end;

run;

umashankersaini
Quartz | Level 8

Hi,

If counting the number of duplicate row in a table can help you then please try it.

It will generate an output column that shows how many times each row occurs.

Proc SQL;

select * , count (*) as count

from DSN

group by var1, var2

having count (*) >1;

quit;

Regards

Uma Shanker Saini

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 969 views
  • 6 likes
  • 4 in conversation