BookmarkSubscribeRSS Feed
cindyforest7
Calcite | Level 5

I have a table looks like this...

character                                      character_match

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

....                                                 ....

....                                                 ....

How to retain either one of the two rows like the first two?

Thanks!

6 REPLIES 6
Haikuo
Onyx | Level 15

I don't understand your question.

cindyforest7
Calcite | Level 5

So I have one table with a charater variable and a match-key variable

I made a copy of this table but renamed the character variable as Character_match

Then I matched these two tables together based on the match-key variable.

So three conditions would happen in the matching process...

1. Character was matched to Character_match with the exact same value (self-to-self)

2. Character was matched to Character_match with different values (self-to-other)

3. Character (Character_match) was matched to Character_match (Character) with different values (other-to-self)

Because Character_match is just a copy of Character, so condition 2 and 3 are permutations..

Now I want to retain the rows that have a unique combination of Character and Character_match (different values) regardless of the order.

The typical min(), max() method will not work here since they are character variables.

art297
Opal | Level 21

: An example have and want file examples would help some of us understand what you are trying to do.  Does your file also contain the matchkey variable and, if so, are the duplicates you want to delete limited to having the same value for the matchkey variable?

cindyforest7
Calcite | Level 5

yes, matchkey variable is also in the table. The matchkey doesn't matter in terms of duplications.

art297
Opal | Level 21

: You haven't shown an example of what you want, so we can only guess.  Does the following example come close to what you want?

data have;

  informat character character_match $20.;

  input character character_match;

  cards;

23456819-230759479-I               43290578-43209957-G

43290578-43209957-G               23456819-230759479-I

23456819-230759478-I               43290578-43209956-G

43290578-43209956-G               23456819-230759478-I

23456819-230759479-J              23456819-230759479-J

;

data same different;

  set have;

  if character eq character_match then output same;

  else do;

    x=ifc(character gt character_match,character,character_match);

    y=ifc(character lt character_match,character,character_match);

    output different;

  end;

run;

proc sort data=different nodupkey out=oneeach dupout=dups;

  by x y;

run;

Peter_C
Rhodochrosite | Level 12

sql self-join

something like: (untested code) :

select * from table1 a join table1 b

     on a.character =b.character_match

;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1452 views
  • 0 likes
  • 4 in conversation