## How to compare a set of variables, detect identical values, and retain only one value?

Solved
Frequent Contributor
Posts: 75

The current data looks like:

    data have;
    input ID var1 var2 var3;
    datalines;
    01     1     2     3
    02     4     2     4
    03     5     5     5
    ;

I want to compare var1 through var3 within each row. If two or more of these variables hold the same value, only one of them should keep it and the others should be set to missing. Any of the variables may keep the value, but if a rule is needed, let the preceding variable win. E.g., if var1 and var2 have an identical value, the value is retained in var1 and var2 becomes missing.

The new data should look like:

    01     1     2     3
    02     4     2     .
    03     5     .     .

EDIT: I think a two-phase data reshape can do the job, but I'm not sure whether there's a more efficient way.

First reshape from wide to long format:

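    data wantlong;
      set have;
      array avarlist var1 var2 var3;
      /* One output row per variable; i records the original position */
      do i = 1 to 3;
        varlong = avarlist(i);
        output;
      end;
    run;

    /* Keep only the first occurrence of each value within an ID */
    proc sort data=wantlong nodupkey;
      by ID varlong;
    run;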

Then bring it back to wide format.

    proc transpose data=wantlong out=wantwide prefix=var;
      by ID;
      id i;
      var varlong;
    run;

Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 9,386

## Re: How to compare a set of variables, detect identical values, and retain only one value?

Hi,

You were right about the array. I would just take each variable in reverse order and check whether its value already appears in an earlier variable; if so, set the current variable to missing:

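    data want (drop=i j);
      set have;
      /* array var{3} refers to var1, var2, var3 */
      array var{3};
      /* Start from the last variable and work backwards */
      do i=3 to 1 by -1;
        /* Compare with every earlier variable */
        do j=i-1 to 1 by -1;
          if var{i}=var{j} then var{i}=.;
        end;
      end;
    run;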

All Replies
Occasional Contributor
Posts: 11

## Re: How to compare a set of variables, detect identical values, and retain only one value?

I think your solution is logical and reliable. Your variables are really observations, so why not transpose them?

Another approach is to load your variables into an array, then sort and deduplicate them within the array. There are papers (google "sas array sort") on implementing the quicksort algorithm with arrays.

Cheers,

Eric
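
For readers who want to try the sort-and-deduplicate idea, here is a minimal sketch. It does not implement quicksort by hand; it assumes the built-in CALL SORTN routine is available (SAS 9.2 or later). The dataset name `want_sorted` and the array name `v` are made up for this illustration. Note that sorting moves values between variables, so the surviving value is not necessarily in its original column. That should be acceptable only because the question says any variable may keep the value.

```sas
/* Sample data from the question */
data have;
  input ID var1 var2 var3;
  datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;

data want_sorted (drop=i);  /* want_sorted is a hypothetical output name */
  set have;
  array v{*} var1-var3;
  /* Sort this row's values in ascending order (missing values sort first) */
  call sortn(of v{*});
  /* After sorting, duplicates are adjacent: walk right to left and
     blank any value equal to its left neighbour */
  do i = dim(v) to 2 by -1;
    if v{i} = v{i-1} then call missing(v{i});
  end;
run;
```

With the sample data, row 02 (4 2 4) should come out as 2 4 . and row 03 as 5 . ., while row 01 is unchanged.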

Super User
Posts: 10,681

## Re: How to compare a set of variables, detect identical values, and retain only one value?

Do you want to compare the values within each row, rather than across the whole dataset?

### Code: Program

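    data have;
    input ID var1 var2 var3;
    datalines;
    01 1 2 3
    02 4 2 4
    03 5 5 5
    ;
    run;

    data want;
      /* Create the hash object once, keyed on k */
      if _n_ eq 1 then do;
        length k 8;
        declare hash ha();
        ha.definekey('k');
        ha.definedone();
      end;
      set have;
      array x{*} var:;
      /* Empty the hash so each row is checked independently */
      ha.clear();
      do i=1 to dim(x);
        k=x{i};
        /* Value already seen in this row: blank it; otherwise remember it */
        if ha.check()=0 then call missing(x{i});
        else ha.add();
      end;
      drop k i;
    run;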

Xia Keshan

Contributor
Posts: 20

## Re: How to compare a set of variables, detect identical values, and retain only one value?

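    data b (drop=value i j);
      set A2;
      array number{3} var1-var3;
      /* Use each variable in turn as the reference value */
      do i = 1 to dim(number) - 1;
        value = number{i};
        /* Blank any later variable that repeats the reference value */
        do j = i + 1 to dim(number);
          if not missing(value) and number{j} = value then number{j} = .;
        end;
      end;
    run;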
