BookmarkSubscribeRSS Feed
cindyforest7
Calcite | Level 5

I have a table looks like this...

character                                      character_match

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

....                                                 ....

....                                                 ....

How to retain either one of the two rows like the first two?

Thanks!

6 REPLIES 6
Haikuo
Onyx | Level 15

I don't understand your question.

cindyforest7
Calcite | Level 5

So I have one table with a charater variable and a match-key variable

I made a copy of this table but renamed the character variable as Character_match

Then I matched these two tables together based on the match-key variable.

So three conditions would happen in the matching process...

1. Character was matched to Character_match with the exact same value (self-to-self)

2. Character was matched to Character_match with different values (self-to-other)

3. Character (Character_match) was matched to Character_match (Character) with different values (other-to-self)

Because Character_match is just a copy of Character, so condition 2 and 3 are permutations..

Now I want to retain the rows that have a unique combination of Character and Character_match (different values) regardless of the order.

The typical min(), max() method will not work here since they are character variables.

art297
Opal | Level 21

: An example have and want file examples would help some of us understand what you are trying to do.  Does your file also contain the matchkey variable and, if so, are the duplicates you want to delete limited to having the same value for the matchkey variable?

cindyforest7
Calcite | Level 5

yes, matchkey variable is also in the table. The matchkey doesn't matter in terms of duplications.

art297
Opal | Level 21

: You haven't shown an example of what you want, so we can only guess.  Does the following example come close to what you want?

data have;

  informat character character_match $20.;

  input character character_match;

  cards;

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

23456819-230759478-I               43290578-43209956-G

43290578-43209956-G               23456819-230759478-I

23456819-230759479-J              23456819-230759479-J

;

data same different;

  set have;

  if character eq character_match then output same;

  else do;

    x=ifc(character gt character_match,character,character_match);

    y=ifc(character lt character_match,character,character_match);

    output different;

  end;

run;

proc sort data=different nodupkey out=oneeach dupout=dups;

  by x y;

run;

Peter_C
Rhodochrosite | Level 12

sql self-join

something like: (untested code) :

select * from table1 a join table1 b

     on a.character =b.character_match

;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2935 views
  • 0 likes
  • 4 in conversation