Solved
Contributor
Posts: 52

# remove duplicated pairs of variable values

I would like to remove observations where the pair of values in two columns has already appeared earlier in the data, in either order. For example, the pair A and B already exists, so I would like to remove the fourth observation (B A). Similarly, I would like to remove the last observation, because the pair B and C already exists.

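| student1 | student2 | treatment |
|----------|----------|-----------|
| A        | B        | keep      |
| A        | C        | keep      |
| A        | D        | keep      |
| B        | A        | remove    |
| B        | C        | keep      |
| B        | D        | keep      |
| C        | A        | keep      |
| C        | B        | remove    |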

Accepted Solutions
Solution
‎12-07-2017 06:36 AM
PROC Star
Posts: 220

## Re: remove duplicated pairs of variable values

If you don't mind the two students within a row being swapped, you can use CALL SORTC to put each pair in the same (alphabetical) order everywhere. Then it is just a question of removing the duplicates with PROC SORT and the NODUPKEY option:

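```
/* Put each pair in alphabetical order, so (B, A) becomes (A, B) */
data sorted;
set have;
call sortc(student1,student2);
run;

/* Keep one row per distinct pair */
proc sort data=sorted nodupkey;
by student1 student2;
run;
```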

All Replies
Super User
Posts: 9,024

## Re: remove duplicated pairs of variable values

Please post test data in the form of a data step, using the code window (the {i} icon above the post).

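```
data have;
input student1 $ student2 $;
datalines;
A B
A C
A D
B A
B C
B D
;
run;

/* Sort the two values within each row */
data want;
set have;
array student{2}; /* refers to student1 and student2 */
call sortc(of student{*});
run;

/* Drop repeated pairs; student: matches all variables starting with "student" */
proc sort data=want nodupkey;
by student:;
run;
```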
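Both answers above overwrite `student1` and `student2` with the sorted values. If the original values and row order need to stay as they are, one possible variant is to sort copies of the two variables and track the pairs already seen in a hash object. This is only a sketch and is not from the thread: the names `key1`, `key2`, and `seen` are made up for illustration, and the `$8` length is an assumption that matches the default length produced by the `input` statement in `have`.

```
data want;
set have;
/* Assumption: student values fit in 8 characters */
length key1 key2 $8;
key1 = student1;
key2 = student2;
/* Sort the copies only, so student1 and student2 stay untouched */
call sortc(key1, key2);
if _n_ = 1 then do;
declare hash seen();
seen.defineKey('key1', 'key2');
seen.defineDone();
end;
/* check() returns a nonzero code when the pair has not been stored yet */
if seen.check() ne 0 then do;
seen.add();
output;
end;
drop key1 key2;
run;
```

Because the data set is read once from top to bottom, the first occurrence of each pair is the one that is kept, and no PROC SORT step is needed.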