I have a table looks like this...
character character_match
23456819-230759479-I 43290578-43209957-G
43290578-43209957-G 23456819-230759479-I
.... ....
.... ....
How to retain either one of the two rows like the first two?
Thanks!
I don't understand your question.
So I have one table with a charater variable and a match-key variable
I made a copy of this table but renamed the character variable as Character_match
Then I matched these two tables together based on the match-key variable.
So three conditions would happen in the matching process...
1. Character was matched to Character_match with the exact same value (self-to-self)
2. Character was matched to Character_match with different values (self-to-other)
3. Character (Character_match) was matched to Character_match (Character) with different values (other-to-self)
Because Character_match is just a copy of Character, so condition 2 and 3 are permutations..
Now I want to retain the rows that have a unique combination of Character and Character_match (different values) regardless of the order.
The typical min(), max() method will not work here since they are character variables.
yes, matchkey variable is also in the table. The matchkey doesn't matter in terms of duplications.
: You haven't shown an example of what you want, so we can only guess. Does the following example come close to what you want?
data have;
informat character character_match $20.;
input character character_match;
cards;
23456819-230759479-I 43290578-43209957-G
43290578-43209957-G 23456819-230759479-I
23456819-230759478-I 43290578-43209956-G
43290578-43209956-G 23456819-230759478-I
23456819-230759479-J 23456819-230759479-J
;
data same different;
set have;
if character eq character_match then output same;
else do;
x=ifc(character gt character_match,character,character_match);
y=ifc(character lt character_match,character,character_match);
output different;
end;
run;
proc sort data=different nodupkey out=oneeach dupout=dups;
by x y;
run;
sql self-join
something like: (untested code) :
select * from table1 a join table1 b
on a.character =b.character_match
;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.