I would like to remove observations where the pair of values in two columns has already appeared, in either order. For example, the pair A and B already exists, so I would like to remove the fourth observation (B, A). Similarly, I would like to remove the last observation, because the pair B and C already exists.
| student1 | student2 | treatment |
| A | B | keep |
| A | C | keep |
| A | D | keep |
| B | A | remove |
| B | C | keep |
| B | D | keep |
| C | A | keep |
| C | B | remove |
If you don't mind the two students being reordered within each row, you can use CALL SORTC to put each pair in the same (alphabetical) order everywhere. Then it is just a matter of removing the duplicates (PROC SORT with NODUPKEY):
data sorted;
  set have;
  call sortc(student1, student2);
run;

proc sort nodupkey;
  by student1 student2;
run;
Please post test data in the form of a data step, using the code window (the {i} icon above the post).
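/* Test data: each row is a pair of students */
data have;
  input student1 $ student2 $;
datalines;
A B
A C
A D
B A
B C
B D
;
run;

/* Put each pair in alphabetical order, so that B A becomes A B */
data want;
  set have;
  array student{2};  /* refers to the existing variables student1 and student2 */
  call sortc(of student{*});
run;

/* Keep only the first occurrence of each pair */
proc sort data=want nodupkey;
  by student:;
run;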
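If the order of the students within each row does matter, one possible variation is to sort copies of the two values and leave the originals alone. The sketch below is only an illustration, not something from the replies above: the variable names key1 and key2, the hash object name seen, and the length of $8 (the default for list input with $) are assumptions. It keeps the first occurrence of each pair in the original row order.

```sas
data want;
  set have;

  /* Hypothetical helper variables holding a sorted copy of the pair */
  length key1 key2 $8;
  key1 = student1;
  key2 = student2;
  call sortc(key1, key2);

  /* Hash object that remembers the pairs seen so far */
  if _n_ = 1 then do;
    declare hash seen();
    seen.defineKey('key1', 'key2');
    seen.defineDone();
  end;

  /* check() returns 0 when the key is already in the hash */
  if seen.check() ne 0 then do;
    seen.add();
    output;
  end;

  drop key1 key2;
run;
```

With the test data above, the row B A should be dropped because A B was already seen, and the other five rows are kept as entered.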