DATA Step, Macro, Functions and more

Permutation-character variables

Reply
Contributor
Posts: 25

Permutation-character variables

I have a table looks like this...

character                                      character_match

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

....                                                 ....

....                                                 ....

How to retain either one of the two rows like the first two?

Thanks!

Respected Advisor
Posts: 3,156

Re: Permutation-character variables

Posted in reply to cindyforest7

I don't understand your question.

Contributor
Posts: 25

Re: Permutation-character variables

So I have one table with a charater variable and a match-key variable

I made a copy of this table but renamed the character variable as Character_match

Then I matched these two tables together based on the match-key variable.

So three conditions would happen in the matching process...

1. Character was matched to Character_match with the exact same value (self-to-self)

2. Character was matched to Character_match with different values (self-to-other)

3. Character (Character_match) was matched to Character_match (Character) with different values (other-to-self)

Because Character_match is just a copy of Character, so condition 2 and 3 are permutations..

Now I want to retain the rows that have a unique combination of Character and Character_match (different values) regardless of the order.

The typical min(), max() method will not work here since they are character variables.

PROC Star
Posts: 7,480

Re: Permutation-character variables

Posted in reply to cindyforest7

: An example have and want file examples would help some of us understand what you are trying to do.  Does your file also contain the matchkey variable and, if so, are the duplicates you want to delete limited to having the same value for the matchkey variable?

Contributor
Posts: 25

Re: Permutation-character variables

yes, matchkey variable is also in the table. The matchkey doesn't matter in terms of duplications.

PROC Star
Posts: 7,480

Re: Permutation-character variables

Posted in reply to cindyforest7

: You haven't shown an example of what you want, so we can only guess.  Does the following example come close to what you want?

data have;

  informat character character_match $20.;

  input character character_match;

  cards;

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

23456819-230759478-I               43290578-43209956-G

43290578-43209956-G               23456819-230759478-I

23456819-230759479-J              23456819-230759479-J

;

data same different;

  set have;

  if character eq character_match then output same;

  else do;

    x=ifc(character gt character_match,character,character_match);

    y=ifc(character lt character_match,character,character_match);

    output different;

  end;

run;

proc sort data=different nodupkey out=oneeach dupout=dups;

  by x y;

run;

Valued Guide
Posts: 2,177

Re: Permutation-character variables

Posted in reply to cindyforest7

sql self-join

something like: (untested code) :

select * from table1 a join table1 b

     on a.character =b.character_match

;

Ask a Question
Discussion stats
  • 6 replies
  • 316 views
  • 0 likes
  • 4 in conversation