Solved: Identifying unique pairs of values (irrespective of order) across two ...

dmckeever · Posted 05-16-2013 02:52 PM

Hi,

I'm having trouble identifying unique pairs of values in a dataset that has two variables, represented separately, that, taken together, indicate the value of a third variable. In the following example, I created the third variable, Combined, using catx.

Obs Part1 Part2 Combined

1 ABC 123 ABC:123

2 123 ABC 123:ABC

3 DEF 123 DEF:123

4 123 DEF 123:DEF

5 DEF ABC DEF:ABC

6 ABC DEF ABC:DEF

The problem is that in practical terms, 'ABC:123' and '123:ABC' are functionally identical (and so on down the list). It's the uniqueness of the pairing, not the order in which the parts appear, that matters. I can't figure out how to represent this. Please note that I'm not trying to do a frequency distribution of the pairings, I'm trying to create a new column with values that are standardized representations of the unique pairings. So 'ABC:123' and '123:ABC' would both generate a value like 'ABC123'. Example of what I'm looking for in the finished product:

Obs Part1 Part2 Pairing

1 ABC 123 ABC123

2 123 ABC ABC123

3 DEF 123 DEF123

4 123 DEF DEF123

5 DEF ABC ABCDEF

6 ABC DEF ABCDEF

Any help would be greatly appreciated--I've been hacking away at this with various combinations of sort procedures and first.-/last.- logic and have gotten nowhere.

Thanks,

Dan

aland1 · Posted 05-16-2013 03:06 PM

here's one possibility:

if part1 < part2 then part = cat(part1,part2); else part = cat(part2,part1);

View solution in original post

aland1 · Posted 05-16-2013 03:06 PM

here's one possibility:

if part1 < part2 then part = cat(part1,part2); else part = cat(part2,part1);

dmckeever · Posted 05-16-2013 03:16 PM

Thanks! Your solution solved my problem. Didn't realize you could reference "greater than" or "lesser than" relationships that exist across variables, and not just within a variable (I'm fairly new). Cheers, Dan

Ksharp · Posted 05-16-2013 10:40 PM

data have;
input Obs     Part1  $   Part2 $ ;
cards;
1          ABC     123    
2         123       ABC    
3         DEF     123     
4         123       DEF    
5          DEF     ABC    
6          ABC     DEF
;
run;
data have;
 set have;
 call sortc(Part1,Part2) ;
 x=catx(':',Part1,Part2) ;
run;

Ksharp

Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Re: Identifying unique pairs of values (irrespective of order) across two variables

Registration is open

SAS Training: Just a Click Away