Desktop productivity for business analysts and programmers

How to append data and remove duplicates in SAS with some constraints

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 82
Accepted Solution

How to append data and remove duplicates in SAS with some constraints

I have a data set A which has values like:
1
1
2
3
3
4
5
5

(a.k.a some values are repeating)

I have another data set B with values like:

1
3
3
5
7
7
8

(a.k.a some values are repeating , some values are also in A, and some are new)

I want to create a dataset C such that it has every value from A and only those from B that are not in A.
So the final should look like:

1
1
2
3
3
4
5
5
7
7
8

So in short, duplicates within the same set are to be retained but if there is a duplicate across A and B, only retain the one from A. There is also a description column that says whether the value is in A or B.

How to do this?


Accepted Solutions
Solution
‎05-21-2014 04:12 PM
Super Contributor
Posts: 275

Re: How to append data and remove duplicates in SAS with some constraints


DATA A;
INPUT X;
CARDS;
1
1
2
3
3
4
5
5
;
RUN;

DATA B;
INPUT X;
CARDS;
1
3
3
5
7
7
8
;
RUN;

PROC SQL;
  SELECT * FROM A
  UNION ALL
  SELECT * FROM B WHERE X NOT IN (SELECT * FROM A);
QUIT;

View solution in original post


All Replies
PROC Star
Posts: 1,146

Re: How to append data and remove duplicates in SAS with some constraints

What about a case where A has

3

and B has

3

3

Should your result have one 3 or two 3's?

Also, do you need any other variables copied across, or only this one?

Tom

Frequent Contributor
Posts: 82

Re: How to append data and remove duplicates in SAS with some constraints

In that case, it should have one 3 only.

Also there are other variables, but they are descriptive, no decision is made based on those variables.

Super User
Posts: 5,366

Re: How to append data and remove duplicates in SAS with some constraints

It may be short but it's not that simple.

data want;

   from_a=0;

   merge a (in=from_a) b (in=from_b);

   by value;

   if first.value then do;

      if from_a and from_b then source='Both A and B';

      else if from_a then source='A only';

      else source='B only';

   end;

   retain source;

   if source='B only' then output;

   else if from_a then output;

run;

Solution
‎05-21-2014 04:12 PM
Super Contributor
Posts: 275

Re: How to append data and remove duplicates in SAS with some constraints


DATA A;
INPUT X;
CARDS;
1
1
2
3
3
4
5
5
;
RUN;

DATA B;
INPUT X;
CARDS;
1
3
3
5
7
7
8
;
RUN;

PROC SQL;
  SELECT * FROM A
  UNION ALL
  SELECT * FROM B WHERE X NOT IN (SELECT * FROM A);
QUIT;

Frequent Contributor
Posts: 82

Re: How to append data and remove duplicates in SAS with some constraints

When SAS concatenates data sets with the union all statement and each data set has the exact same column names and column data types, will SAS use the ordering of the columns to figure out the corresponding columns or the actual names?

Super User
Posts: 9,874

Re: How to append data and  remove duplicates in SAS with some constraints

union all corr

Super User
Posts: 9,874

Re: How to append data and remove duplicates in SAS with some constraints

proc sql;
create table want as
 select * from a
  union all
( select * from b
   except all
  select * from a)
;
run;

Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 3683 views
  • 4 likes
  • 5 in conversation