I would like to remove observations where the pair of values in two columns has already appeared in an earlier row, in either order. For example, the pair A and B already exists, so I would like to remove the fourth observation (B, A). Similarly, I would like to remove the last observation, because the pair B and C already exists.
student1 | student2 | treatment |
A | B | keep |
A | C | keep |
A | D | keep |
B | A | remove |
B | C | keep |
B | D | keep |
C | A | keep |
C | B | remove |
If you can live with an arbitrary order of the students within each row, you can use CALL SORTC to put the two students in the same order everywhere. Then it is just a question of removing the duplicates (PROC SORT with NODUPKEY):
data sorted;
  set have;
  call sortc(student1, student2);
run;

proc sort nodupkey;
  by student1 student2;
run;
Please post test data in the form of a data step, using the code window (the {i} icon above the post).
data have;
  input student1 $ student2 $;
datalines;
A B
A C
A D
B A
B C
B D
;
run;

data want;
  set have;
  array student{2};
  call sortc(of student{*});
run;

proc sort data=want nodupkey;
  by student:;
run;
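Both answers above reorder the students within a row and then sort the data set. If the original values and the original row order need to be kept, one possible alternative is to build the sorted pair in temporary key variables and use a hash object to remember which pairs have been seen. This is only a sketch, not something proposed in the thread: the names key1, key2, and seen are made up for illustration, and the length $8 assumes the students were read with a plain `$` informat as in the test data above.

```sas
data want;
  set have;
  /* Hypothetical helper variables; $8 is an assumed length */
  length key1 key2 $8;
  if _n_ = 1 then do;
    /* Hash object holding every unordered pair seen so far */
    declare hash seen();
    seen.defineKey('key1', 'key2');
    seen.defineDone();
  end;
  /* Copy the pair and sort the copies, leaving student1/student2 untouched */
  key1 = student1;
  key2 = student2;
  call sortc(key1, key2);
  /* check() returns 0 when the pair is already stored: drop the repeat */
  if seen.check() = 0 then delete;
  seen.add();
  drop key1 key2;
run;
```

With this approach the first occurrence of each pair is kept exactly as it was entered, and no PROC SORT step is needed. Any row whose pair has appeared before in either order is dropped, so a row such as C A would also be removed if A C came earlier.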