BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
eagles_dare13
Obsidian | Level 7

I have a data set A which has values like:
1
1
2
3
3
4
5
5

(a.k.a some values are repeating)

I have another data set B with values like:

1
3
3
5
7
7
8

(a.k.a some values are repeating , some values are also in A, and some are new)

I want to create a dataset C such that it has every value from A and only those from B that are not in A.
So the final should look like:

1
1
2
3
3
4
5
5
7
7
8

So in short, duplicates within the same set are to be retained but if there is a duplicate across A and B, only retain the one from A. There is also a description column that says whether the value is in A or B.

How to do this?

1 ACCEPTED SOLUTION

Accepted Solutions
slchen
Lapis Lazuli | Level 10


DATA A;
INPUT X;
CARDS;
1
1
2
3
3
4
5
5
;
RUN;

DATA B;
INPUT X;
CARDS;
1
3
3
5
7
7
8
;
RUN;

PROC SQL;
  SELECT * FROM A
  UNION ALL
  SELECT * FROM B WHERE X NOT IN (SELECT * FROM A);
QUIT;

View solution in original post

7 REPLIES 7
TomKari
Onyx | Level 15

What about a case where A has

3

and B has

3

3

Should your result have one 3 or two 3's?

Also, do you need any other variables copied across, or only this one?

Tom

eagles_dare13
Obsidian | Level 7

In that case, it should have one 3 only.

Also there are other variables, but they are descriptive, no decision is made based on those variables.

Astounding
PROC Star

It may be short but it's not that simple.

data want;

   from_a=0;

   merge a (in=from_a) b (in=from_b);

   by value;

   if first.value then do;

      if from_a and from_b then source='Both A and B';

      else if from_a then source='A only';

      else source='B only';

   end;

   retain source;

   if source='B only' then output;

   else if from_a then output;

run;

slchen
Lapis Lazuli | Level 10


DATA A;
INPUT X;
CARDS;
1
1
2
3
3
4
5
5
;
RUN;

DATA B;
INPUT X;
CARDS;
1
3
3
5
7
7
8
;
RUN;

PROC SQL;
  SELECT * FROM A
  UNION ALL
  SELECT * FROM B WHERE X NOT IN (SELECT * FROM A);
QUIT;

eagles_dare13
Obsidian | Level 7

When SAS concatenates data sets with the union all statement and each data set has the exact same column names and column data types, will SAS use the ordering of the columns to figure out the corresponding columns or the actual names?

Ksharp
Super User
proc sql;
create table want as
 select * from a
  union all
( select * from b
   except all
  select * from a)
;
run;

Xia Keshan

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 12677 views
  • 5 likes
  • 5 in conversation