Alireza_Boloori
Fluorite | Level 6

Hello everyone,

I have data like this:

ID  X1  X2
1   1   1
2   1   1
3   1   2
4   1   2
5   2   3
6   2   3
7   2   3
8   2   4

I want to remove observations with duplicate X2 values within each value of X1, keeping the first one, so that the data looks like this:

ID  X1  X2
1   1   1
3   1   2
5   2   3
8   2   4

I was wondering how it can be done in SAS. Any idea/help is really appreciated!

Accepted Solution
novinosrin
Tourmaline | Level 20

I honestly think you didn't test my code. Well, no worries, I ran another test for you:

18 data have;
19 input ID X1 X2;
20 datalines;

NOTE: The data set WORK.HAVE has 8 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


29 ;
30 proc sort data=have out=want nodupkey;
31 by x1 x2;
32 run;

NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: 4 observations with duplicate key values were deleted.
NOTE: The data set WORK.WANT has 4 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


6 REPLIES
novinosrin
Tourmaline | Level 20

Have you tried:

proc sql;
select distinct ID, X1, X2
from your_table;
quit;
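
A note on the query above: ID is unique in the sample data, so SELECT DISTINCT over ID, X1, and X2 returns all eight rows unchanged. A minimal sketch of one way to get one row per X1/X2 pair in PROC SQL is to group by the two key columns and keep the smallest ID. The data set names `have` and `want` are taken from the other replies in this thread; the choice of `min(ID)` as the ID to keep is an assumption based on the desired output in the question.

```sas
/* Sketch: one row per (X1, X2) pair, keeping the smallest ID in each pair. */
/* Assumes ID is the only column besides the two key columns.               */
proc sql;
    create table want as
    select min(ID) as ID, X1, X2
    from have
    group by X1, X2
    order by X1, X2;
quit;
```

On the sample data this should give IDs 1, 3, 5, and 8. If the real table has more columns, aggregating each one separately could mix values from different rows, so the PROC SORT approach is probably the safer choice there.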

novinosrin
Tourmaline | Level 20

data have;
input ID X1 X2;
datalines;
1 1 1
2 1 1
3 1 2
4 1 2
5 2 3
6 2 3
7 2 3
8 2 4
;

proc sort data=have out=want nodupkey;
by x1 x2;
run;
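
To see exactly which observations NODUPKEY drops, PROC SORT can write them to a second data set with the DUPOUT= option. The sketch below is the same step as above with that option added; the data set name `dups` is just an illustrative choice.

```sas
/* Same sort as above, but the deleted duplicates are saved for inspection. */
/* "dups" is a hypothetical name for the data set of removed observations.  */
proc sort data=have out=want nodupkey dupout=dups;
    by x1 x2;
run;

/* List the observations that were removed. */
proc print data=dups;
run;
```

With the sample data, `want` should hold four observations and `dups` the other four. Comparing the two can help when the result on real data looks different from what was expected.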

Alireza_Boloori
Fluorite | Level 6

@novinosrin Thanks! However, it does not remove the duplicates. I had to add this to it:

 

data want ;
    set want ;
    by X1 X2;
    if first.X2;
run;
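
For reference, the DATA step above only works when its input is already sorted by X1 and X2, because BY-group processing and first.X2 depend on that order. A self-contained sketch of this approach, assuming the input is the `have` data set from the earlier reply and using `sorted` as a hypothetical intermediate name, could look like this:

```sas
/* Sort by the keys first; adding ID as a third key makes the */
/* lowest ID the first observation in each X1/X2 group.       */
proc sort data=have out=sorted;
    by x1 x2 id;
run;

/* Keep only the first observation of each X1/X2 group. */
data want;
    set sorted;
    by x1 x2;
    if first.x2;
run;
```

This should give the same four observations as PROC SORT with NODUPKEY on the sample data. It makes the rule for which observation to keep explicit, at the cost of an extra step.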

Alireza_Boloori
Fluorite | Level 6
Well! My original data was not exactly the same as what I posted initially, which might be the reason. I did test your code, though. Thanks for your time!
Reeza
Super User
I think he mixed up the two solutions somehow. @Alireza_Boloori please mark the appropriate answer as correct.
